Mirror of https://git.gfz-potsdam.de/naaice/poet.git
docs: update descriptions of ai surrogate options and parameters
This commit is contained in:
parent f7d3a7ea65 · commit 0017a20e82

README.md

@@ -236,40 +236,55 @@ mpirun -n 4 ./poet --dht --dht-snaps=2 barite_het_rt.R barite_het.rds output
### Example: Preparing Environment and Running with AI surrogate

To run the AI surrogate, you need to have Keras installed in your
Python environment. The implementation in POET is agnostic to the exact
Keras version, but the provided model file must match your Keras version.
Using Keras 3 with `.keras` model files is recommended. The compilation
process of POET remains mostly the same as shown above, but the CMake
option `-DUSE_AI_SURROGATE=ON` must be set.

In the following code block, the installation process on the Turing Cluster is
shown. `miniconda` is used to create a virtual environment to install
tensorflow/keras. Please adapt the installation process to your needs.

```sh
# First, install the required R packages
R -e "install.packages('keras3', repos='https://cloud.r-project.org/')"

# manually create a virtual environment to install keras/python using conda,
# as this is somehow broken on the Turing Cluster when using the
# `keras::install_keras()` function
cd poet

# create a virtual environment in the .ai directory with python 3.11
conda create -p ./.ai python=3.11
conda activate ./.ai

# install tensorflow and keras
pip install keras tensorflow[and-cuda]

# add conda's python path to the R environment
# make sure to have the conda environment activated
echo -e "RETICULATE_PYTHON=$(which python)\n" >> ~/.Renviron
```

To use the AI surrogate, you must declare several values in the R input
script. This can be done either directly in the input script or in an
additional file. This file can be provided by adding the file path as the
element `ai_surrogate_input_script` to the `chemistry_setup` list in the
R input script.

The following variables and functions must be declared:

- `model_file_path` [*string*]: Path to the Keras model file with which
  the AI surrogate model is initialized.
- `validate_predictions(predictors, prediction)` [*function*]: Returns a boolean
  vector of length `nrow(predictions)`. The output of this function defines
  which predictions are considered valid and which are rejected. Regular
  simulation will only be done for the rejected values, and the results
  will be added to the training data buffer of the AI surrogate model.
  Can e.g. be implemented as a mass balance threshold between the
  predictors and the prediction (see the sketch below).
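
For illustration, a minimal input script covering the two required declarations might look like the following sketch. The file name `ai_setup.R`, the `Ba` column, and the 1 % tolerance are illustrative assumptions, not POET defaults:

```r
# ai_setup.R -- hypothetical example of an ai_surrogate_input_script.
# The "Ba" column and the 1 % tolerance are illustrative assumptions.

# Path to the Keras model that initializes the AI surrogate
model_file_path <- "barite_model.keras"

# Accept a prediction only where total barium is conserved within 1 %
validate_predictions <- function(predictors, prediction) {
  abs(prediction$Ba - predictors$Ba) <= 0.01 * abs(predictors$Ba)
}
```

The file would then be referenced from the main input script, e.g. via
`chemistry_setup$ai_surrogate_input_script <- "ai_setup.R"`.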

The following variables and functions can be declared (a sketch follows the
list):

- `batch_size` [*int*]: Batch size for the inference and training functions,
  defaults to 2560.
- `training_epochs` [*int*]: Number of training epochs with each training data
  set, defaults to 20.
- `training_data_size` [*int*]: Size of the training data buffer. After the
  buffer has been filled, the model starts training and removes this amount of
  data from the front of the buffer. Defaults to the size of the Field.
- `use_Keras_predictions` [*bool*]: Decides if the Keras prediction function
  should be used instead of the custom C++ implementation (Keras might be
  faster for larger models, especially on GPU). Defaults to false.
- `preprocess(df, backtransform = FALSE, outputs = FALSE)` [*function*]:
  Returns the scaled/transformed/backtransformed dataframe. The `backtransform`
  flag signals if the current processing step is applied to data that is
  assumed to be scaled and expects backtransformed values. The `outputs`
  flag signals if the current processing step is applied to the output
  or target of the model. This can be used to e.g. skip these processing
  steps and only scale the model input. The default implementation uses no
  transformations.
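
As a sketch of these optional declarations, the following overrides the tuning values and log-scales only the model inputs; the chosen numbers and the log10 transform are illustrative assumptions (and assume strictly positive values):

```r
# Illustrative overrides of the optional tuning values
batch_size      <- 5120  # default: 2560
training_epochs <- 10    # default: 20

# Scale only the model inputs with log10; outputs/targets pass through
preprocess <- function(df, backtransform = FALSE, outputs = FALSE) {
  if (outputs) {
    return(df)  # skip processing for the model output/target
  }
  if (backtransform) {
    10^df       # undo the log10 scaling
  } else {
    log10(df)   # forward transform of the inputs
  }
}
```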

After setting up the R environment, recompile POET and you're ready to run the
AI surrogate.

```sh
cd <installation_dir>/bin
```

@@ -319,45 +334,4 @@ important information from the OpenMPI Man Page:

For example, on platforms that support it, the clock_gettime()
function will be used to obtain a monotonic clock value with whatever
precision is supported on that platform (e.g., nanoseconds).

## Additional functions for the AI surrogate

The AI surrogate can be activated for any benchmark and is by default
initiated as a sequential keras model with three hidden layers of 48, 96,
and 24 units with relu activation and the adam optimizer. All functions in
`ai_surrogate_model_functions.R` can be overridden by adding custom
definitions via an R file: simply add the path to this file as an element
called `ai_surrogate_input_script` to the `chemistry_setup` list in the
input script. Please use the global variable `ai_surrogate_base_path` as a
base path when relative file paths are used in custom functions.

**There is currently no default implementation to determine the
validity of predicted values.** This means that every input script
must include an R source file with a custom function
`validate_predictions(predictors, prediction)`. Examples of custom
functions can be found in the barite_200 benchmark.

The functions can be defined as follows:

`validate_predictions(predictors, prediction)`: Returns a boolean
index vector that signals for each row in the predictions whether the
values are considered valid. Can e.g. be implemented as a mass balance
threshold between the predictors and the prediction.

`initiate_model()`: Returns a keras model. Can be used to load
pretrained models.
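
For instance, `initiate_model` could load a pretrained file or rebuild the default architecture described above (hidden layers of 48, 96, and 24 units, relu activation, adam optimizer) with the keras3 R API. The input and output widths below are placeholders that must match your predictor and target columns:

```r
library(keras3)

initiate_model <- function() {
  n_in  <- 12  # hypothetical number of predictor columns
  n_out <- 12  # hypothetical number of target columns
  model <- keras_model_sequential(input_shape = n_in) |>
    layer_dense(units = 48, activation = "relu") |>
    layer_dense(units = 96, activation = "relu") |>
    layer_dense(units = 24, activation = "relu") |>
    layer_dense(units = n_out)
  compile(model, optimizer = "adam", loss = "mse")
  model
}
```

A pretrained model could instead be returned, e.g. with
`load_model(model_file_path)`.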

`preprocess(df, backtransform = FALSE, outputs = FALSE)`: Returns the
scaled/transformed/backtransformed dataframe. The `backtransform` flag
signals if the current processing step is applied to data that is
assumed to be scaled and expects backtransformed values. The `outputs`
flag signals if the current processing step is applied to the output
or target of the model. This can be used to e.g. skip these processing
steps and only scale the model input.

`training_step(model, predictor, target, validity)`: Trains the model
after each iteration. `validity` is the boolean index vector given by
`validate_predictions` and can e.g. be used to train only on values
that were not valid predictions.
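
A custom `training_step` that trains only on the rows whose predictions were rejected could look like this sketch; the epoch count and the matrix conversion are assumptions:

```r
training_step <- function(model, predictor, target, validity) {
  # Keep only the rows that were not valid surrogate predictions
  x <- as.matrix(predictor[!validity, , drop = FALSE])
  y <- as.matrix(target[!validity, , drop = FALSE])
  if (nrow(x) > 0) {
    fit(model, x, y, epochs = 20, verbose = 0)  # e.g. the default 20 epochs
  }
  model  # keras models are reference objects and are updated in place
}
```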
@@ -428,7 +428,7 @@ static Rcpp::List RunMasterLoop(RInsidePOET &R, const RuntimeParameters &params,

    training_data_buffer_mutex.unlock();

    // Signal to training thread if training data buffer is full
    if (training_data_buffer.y[0].size() > params.training_data_size) {
      start_training = true;
      training_data_buffer_full.notify_one();
    }