docs: update descriptions of ai surrogate options and parameters

straile 2024-10-11 12:34:43 +02:00
parent f7d3a7ea65
commit 0017a20e82
2 changed files with 43 additions and 69 deletions

README.md

@@ -236,40 +236,55 @@ mpirun -n 4 ./poet --dht --dht-snaps=2 barite_het_rt.R barite_het.rds output
### Example: Preparing Environment and Running with AI surrogate
To run the AI surrogate, you need to have Keras installed in your
Python environment. The implementation in POET is agnostic to the exact
Keras version, but the provided model file must match your Keras version.
Using Keras 3 with `.keras` model files is recommended. The compilation
process of POET remains mostly the same as shown above, but the CMake
option `-DUSE_AI_SURROGATE=ON` must be set.
In the following code block, the installation process on the Turing Cluster is
shown. `miniconda` is used to create a virtual environment to install
tensorflow/keras. Please adapt the installation process to your needs.
```sh
# First, install the required R packages
R -e "install.packages('keras3', repos='https://cloud.r-project.org/')"

# Manually create a virtual environment with conda to install keras/python,
# as this is broken on the Turing Cluster when using `keras::install_keras()`
cd poet

# Create a virtual environment in the .ai directory with Python 3.11
conda create -p ./.ai python=3.11
conda activate ./.ai

# Install tensorflow and keras
pip install keras tensorflow[and-cuda]

# Add conda's Python path to the R environment
# (make sure the conda environment is still activated)
echo -e "RETICULATE_PYTHON=$(which python)\n" >> ~/.Renviron
```

To use the AI surrogate, you must declare several values in the R input
script. This can be done either directly in the input script or in an
additional file, whose path is provided as the element
`ai_surrogate_input_script` of the `chemistry_setup` list in the R input
script.

The following variables and functions must be declared:

- `model_file_path` [*string*]: Path to the Keras model file with which
the AI surrogate model is initialized.
- `validate_predictions(predictors, prediction)` [*function*]: Returns a boolean
vector of length `nrow(prediction)`. The output of this function defines
which predictions are considered valid and which are rejected. The regular
simulation is only run for the rejected values, and the results are added
to the training data buffer of the AI surrogate model. It can, e.g., be
implemented as a mass balance threshold between the predictors and the
prediction, as sketched below.
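For illustration, a minimal surrogate input file might look like the
following sketch. The file name `ai_surrogate.R`, the column names `Ba` and
`Barite`, and the 1% tolerance are assumptions made for this example
(loosely modeled on a barite benchmark), not values prescribed by POET:

```r
## In the main R input script (hypothetical file name):
# chemistry_setup$ai_surrogate_input_script <- "ai_surrogate.R"

## Contents of ai_surrogate.R:

# Required: Keras model used to initialize the surrogate
model_file_path <- file.path(ai_surrogate_base_path, "barite_model.keras")

# Required: accept a prediction only if total barium is (nearly) conserved;
# the column names and the 1% tolerance are illustrative assumptions
validate_predictions <- function(predictors, prediction) {
  ba_before <- predictors$Ba + predictors$Barite
  ba_after  <- prediction$Ba + prediction$Barite
  rel_err <- abs(ba_after - ba_before) / pmax(abs(ba_before), .Machine$double.eps)
  # TRUE = keep the surrogate prediction, FALSE = rerun the full chemistry
  rel_err < 0.01
}
```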
The following variables and functions can optionally be declared:

- `batch_size` [*int*]: Batch size for the inference and training functions,
defaults to 2560.
- `training_epochs` [*int*]: Number of training epochs with each training data
set, defaults to 20.
- `training_data_size` [*int*]: Size of the training data buffer. After the
buffer has been filled, the model starts training and removes this amount of
data from the front of the buffer. Defaults to the size of the Field.
- `use_Keras_predictions` [*bool*]: Decides whether the Keras prediction function
should be used instead of the custom C++ implementation (Keras might be faster
for larger models, especially on GPU). Defaults to false.
- `preprocess(df, backtransform = FALSE, outputs = FALSE)` [*function*]:
Returns the scaled/transformed/backtransformed dataframe. The `backtransform`
flag signals that the current processing step is applied to data that is
assumed to be scaled and that backtransformed values are expected. The `outputs`
flag signals that the current processing step is applied to the output
or target of the model; this can be used to, e.g., skip these processing
steps and scale only the model input. The default implementation applies no
transformations. A minimal sketch is shown below.
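As an illustration of the optional overrides, the sketch below sets two of
the parameters (the values are arbitrary examples) and defines a
`preprocess()` that log-transforms the model inputs; the log10 transform and
the clamping value are assumptions for this example, not POET defaults:

```r
batch_size      <- 1280   # example value, overrides the default of 2560
training_epochs <- 10     # example value, overrides the default of 20

# Scale only the model inputs; outputs/targets are passed through unchanged
preprocess <- function(df, backtransform = FALSE, outputs = FALSE) {
  if (outputs) {
    return(df)
  }
  m <- as.matrix(df)
  if (backtransform) {
    # undo the log10 scaling applied below
    out <- 10^m
  } else {
    # compress the dynamic range; the clamp avoids -Inf for zero values
    out <- log10(pmax(m, 1e-30))
  }
  as.data.frame(out)
}
```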
After setting up the R environment, recompile POET and you're ready to run
the AI surrogate.
```sh
cd <installation_dir>/bin
@@ -319,45 +334,4 @@ important information from the OpenMPI Man Page:
For example, on platforms that support it, the clock_gettime()
function will be used to obtain a monotonic clock value with whatever
precision is supported on that platform (e.g., nanoseconds).
## Additional functions for the AI surrogate
The AI surrogate can be activated for any benchmark and is by default
initiated as a sequential keras model with three hidden layers of size
48, 96, and 24, relu activation, and the adam optimizer. All functions in
`ai_surrogate_model_functions.R` can be overridden with custom definitions
from an R file referenced in the input script: simply add the path to this
file as an element called `ai_surrogate_input_script` to the
`chemistry_setup` list. Please use the global variable
`ai_surrogate_base_path` as a base path when relative file paths are used
in custom functions.
**There is currently no default implementation to determine the
validity of predicted values.** This means that every input script
must include an R source file with a custom function
`validate_predictions(predictors, prediction)`. Examples of custom
functions can be found in the barite_200 benchmark.
The functions can be defined as follows:
`validate_predictions(predictors, prediction)`: Returns a boolean
index vector that signals for each row of the predictions whether the
values are considered valid. It can, e.g., be implemented as a mass
balance threshold between the predictors and the prediction.
`initiate_model()`: Returns a keras model. Can be used to load
pretrained models.
`preprocess(df, backtransform = FALSE, outputs = FALSE)`: Returns the
scaled/transformed/backtransformed dataframe. The `backtransform` flag
signals that the current processing step is applied to data that is
assumed to be scaled and should be backtransformed. The `outputs`
flag signals that the current processing step is applied to the output
or target of the model. This can be used to, e.g., skip these processing
steps and scale only the model input.
`training_step(model, predictor, target, validity)`: Trains the model
after each iteration. `validity` is the boolean index vector returned by
`validate_predictions` and can, e.g., be used to train only on values
that were not valid predictions. A sketch of possible `initiate_model`
and `training_step` overrides is shown below.
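For completeness, here is a sketch of how `initiate_model` and
`training_step` could be overridden using the keras3 R package; the
pretrained file name and the choice to train only on rejected rows are
assumptions made for this example:

```r
library(keras3)

# Load a pretrained network instead of the default architecture;
# the file name is an assumption for this example
initiate_model <- function() {
  load_model(file.path(ai_surrogate_base_path, "pretrained_model.keras"))
}

# Train only on rows that were not valid surrogate predictions
training_step <- function(model, predictor, target, validity) {
  idx <- which(!validity)
  if (length(idx) > 0) {
    fit(model,
        x = as.matrix(predictor[idx, , drop = FALSE]),
        y = as.matrix(target[idx, , drop = FALSE]),
        epochs = 20,
        verbose = 0)
  }
  model
}
```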


@@ -428,7 +428,7 @@ static Rcpp::List RunMasterLoop(RInsidePOET &R, const RuntimeParameters &params,
training_data_buffer_mutex.unlock();
// Signal to training thread if training data buffer is full
if (training_data_buffer.y[0].size() > 2000) {
if (training_data_buffer.y[0].size() > params.training_data_size) {
start_training = true;
training_data_buffer_full.notify_one();
}