UNIQORN -- The Universal Neural-network Interface for Quantum Observable Readout from N-body wavefunctions
— VERSION 0.4 BETA —
This repository contains Python and bash scripts that implement machine learning tasks using the TensorFlow library. The code performs regression and/or classification of various observables using data obtained from MCTDH-X simulations.
Currently, only single-shot images are supported as input data; support for correlation functions as input is planned.
The quantities that can be analyzed so far are fragmentation, particle number, density distributions, the potential, and correlation functions.
Click this to download the single-shot DATA.
This data should be placed in the same folder as the code in this repository. It consists of 3000 random ground states (randomized double wells with random barrier height and width, random interactions, and a random particle number).
As prerequisites for running the Python scripts in this repository you will need (at least) a Python installation with TensorFlow, Jupyter for the notebook, and the HpBandSter library for the hyperparameter optimization described below.
Please refer to the flowchart "workflow.pdf" for a graphical depiction of the structure of the modularized code. The UNIQORN Python modules related to machine learning tasks are mostly stored in the "source" directory, and the Python modules and files related to data generation with MCTDH-X are stored in the "MCTDHX-data_generation" directory.
Calculations can be done using the Jupyter notebook UNIQORN.ipynb or the Python script UNIQORN.py. The error of some observables can also be evaluated from a formula by running the Python script Error_from_formula.py. To perform the (lengthy!) check of how the neural-network-based regression of observables depends on the number of single-shot observations per input dataset, run the Python script Regression_Loop_NShots.py.
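For example, assuming a standard Python installation, each of these entry points can be run directly from the repository root:
python UNIQORN.py
python Error_from_formula.py
python Regression_Loop_NShots.py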
UNIQORN's directories contain further Python modules that serve different purposes.
PYTHON MODULES:
BASH SCRIPTS (mainly used to generate or import data):
Currently the code supports only supervised regression tasks. The tasks can be implemented via a multilayer perceptron (MLP) or a convolutional neural network (CNN). Some default and also some customizable models are defined in the file Models.py. Note that certain quantities such as fragmentation can only be inferred from multiple single-shot images, not from a single one. The DataPreprocessing.py module therefore assembles the input data into stacks of multiple single-shot images, as in the sketch below.
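To illustrate this stacking step, here is a minimal NumPy sketch; the array layout (single shots stored row-wise) and the function name are assumptions for illustration and need not match what DataPreprocessing.py actually does:

```python
import numpy as np

def stack_single_shots(shots: np.ndarray, n_shots_per_sample: int) -> np.ndarray:
    """Group individual single-shot images into stacks that form one input sample.

    `shots` is assumed to have shape (n_shots_total, n_grid_points); quantities
    such as fragmentation can only be inferred from such stacks, not from a
    single shot.
    """
    n_samples = shots.shape[0] // n_shots_per_sample
    # Drop any remainder and reshape to (samples, shots per sample, grid points).
    stacked = shots[: n_samples * n_shots_per_sample]
    return stacked.reshape(n_samples, n_shots_per_sample, shots.shape[1])
```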
A good start is the Jupyter notebook UNIQORN.ipynb, which calls all the different modules that perform various tasks.
You can check it out by typing
jupyter notebook
in your shell. This should open a window in your web browser, from which you can navigate to the file UNIQORN.ipynb and execute it cell by cell. The notebook goes through the workflow explained above, i.e.
data loading -> data processing -> model choice, training and validation -> visualization.
The notebook will automatically call and run other modules such as DataPreprocessing.py and ModelTrainingAndValidation.py. These files should be modified only if you are a developer implementing new machine learning tasks. Moreover, UNIQORN.ipynb should be seen as a starting point that trains, evaluates and visualizes a model for a single set of model parameters.
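To give a rough idea of what the model choice, training and validation step amounts to, here is a minimal tf.keras sketch; the layer sizes, function name and data arrays are placeholders and do not reproduce the actual models defined in Models.py or the logic of ModelTrainingAndValidation.py:

```python
import tensorflow as tf

# Placeholder data: x_train holds stacks of single shots with shape
# (n_samples, n_shots_per_sample, n_grid_points); y_train holds the target
# observable (e.g. the particle number).
def build_and_train(x_train, y_train, x_val, y_val, epochs=20, batch_size=32):
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=x_train.shape[1:]),
        tf.keras.layers.Dense(128, activation='relu'),  # a simple MLP; a CNN is also possible
        tf.keras.layers.Dense(1),                        # regression output
    ])
    model.compile(optimizer='adam', loss='mse', metrics=['mae'])
    history = model.fit(x_train, y_train, validation_data=(x_val, y_val),
                        epochs=epochs, batch_size=batch_size)
    return model, history
```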
To choose which machine learning task to perform (e.g. switching from a regression of the particle number from single shots to a classification of fragmented/non-fragmented states from correlation functions), you need to modify the input file (a Python class) Input.py. This file contains all the knobs and variables for the machine learning algorithms, including hyperparameters such as the batch size or the number of epochs. Here you can also select which quantity to fit and how, whether to load a pre-trained neural network or train one yourself, whether to visualize the results, etc. The input file and the role of each variable therein should be self-explanatory.
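As an illustration only, the kind of settings collected in Input.py might look roughly like the sketch below; apart from Model and ConvNet (mentioned further down), the attribute names here are guesses, and the actual file should be consulted for the real variable names and allowed values:

```python
# Hypothetical sketch of the knobs gathered in Input.py -- consult the actual
# file for the real variable names, allowed values and defaults.
class Input:
    Task = 'REGR'            # regression (e.g. of the particle number) or classification
    Quantity = 'NPAR'        # which observable to fit (particle number, fragmentation, ...)
    InputData = 'SSS'        # single-shot images in x-space as input
    Model = 'custom'         # one of the default or customizable models from Models.py
    ConvNet = True           # use a CNN instead of an MLP
    LoadPretrained = False   # load a pre-trained network instead of training one
    Visualize = True         # produce and save the plots
    BatchSize = 32           # hyperparameters
    Epochs = 20
```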
Note that the notebook produces results inline while being executed, but the corresponding figures are also saved in the various folders inside the main folder "plots" for later retrieval. The paths to these files and the files themselves are named after the quantity being fitted. For example, a plot of the accuracy of the regression of the particle number from single shots in real space during 20 epochs will be saved in the folder "plots/NPAR/accuracies" with the name "Accuracies-for-REGR-of-NPAR-from-SSS-in-x-space-during-20-epochs.png".
It is a tough task to optimally configure all the possible parameters of deep learning models. However, since hyperparameter tuning is itself an optimization task, it can be automated. One library that provides out-of-the-box hyperparameter optimization is HpBandSter; see its documentation for details. The HpBandSter library is therefore a prerequisite for running this part of the code.
Currently, we only provide an HpBandSter implementation for optimizing convolutional neural networks (set Model='custom' and ConvNet=True in Input.py). You can run the hyperparameter optimization by executing
python HyperParameterOpt.py
This will then perform an optimization which you can visualize by running
This will open a plot of the results of the optimization run. By clicking on the point in the plot with the lowest loss, you can find the optimal set of hyperparameters, i.e., the result of the optimization.
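For developers curious about what HyperParameterOpt.py has to set up internally, the core of an HpBandSter run is a worker that trains the network for a given epoch budget and reports its validation loss, together with a ConfigSpace object that defines the search space. Below is a minimal sketch assuming the standard hpbandster and ConfigSpace APIs; it is illustrative only and does not reproduce the actual implementation:

```python
import ConfigSpace as CS
import ConfigSpace.hyperparameters as CSH
import tensorflow as tf
from hpbandster.core.worker import Worker

class CNNWorker(Worker):
    """Illustrative worker: trains a small CNN on stacks of single shots for a given budget."""

    def __init__(self, x_train, y_train, x_val, y_val, **kwargs):
        super().__init__(**kwargs)
        self.data = (x_train, y_train, x_val, y_val)

    def compute(self, config, budget, **kwargs):
        x_train, y_train, x_val, y_val = self.data
        model = tf.keras.Sequential([
            tf.keras.layers.Conv1D(config['filters'], 3, activation='relu',
                                   input_shape=x_train.shape[1:]),
            tf.keras.layers.GlobalAveragePooling1D(),
            tf.keras.layers.Dense(1),  # regression output
        ])
        model.compile(optimizer=tf.keras.optimizers.Adam(config['lr']), loss='mse')
        model.fit(x_train, y_train, epochs=int(budget),
                  batch_size=config['batch_size'], verbose=0)
        # The optimizer minimizes the reported 'loss' (here: validation MSE).
        return {'loss': float(model.evaluate(x_val, y_val, verbose=0)), 'info': {}}

    @staticmethod
    def get_configspace():
        cs = CS.ConfigurationSpace()
        cs.add_hyperparameter(CSH.UniformFloatHyperparameter('lr', 1e-5, 1e-2, log=True))
        cs.add_hyperparameter(CSH.UniformIntegerHyperparameter('filters', 8, 64))
        cs.add_hyperparameter(CSH.CategoricalHyperparameter('batch_size', [16, 32, 64]))
        return cs
```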