Mirror of https://git.gfz-potsdam.de/naaice/poet.git
docs: update descriptions of ai surrogate options and parameters
This commit is contained in:
parent f7d3a7ea65 · commit 0017a20e82

README.md

@@ -236,40 +236,55 @@ mpirun -n 4 ./poet --dht --dht-snaps=2 barite_het_rt.R barite_het.rds output
### Example: Preparing Environment and Running with AI surrogate

To run the AI surrogate, you need to have Keras installed in your
Python environment. The implementation in POET is agnostic to the exact
Keras version, but the provided model file must match your Keras version.
Using Keras 3 with `.keras` model files is recommended. The compilation
process of POET remains mostly the same as shown above, but the CMake
option `-DUSE_AI_SURROGATE=ON` must be set.

In the following code block, the installation process on the Turing Cluster is
shown. `miniconda` is used to create a virtual environment to install
tensorflow/keras. Please adapt the installation process to your needs.

```sh
# First, install the required R packages
R -e "install.packages('keras3', repos='https://cloud.r-project.org/')"

# manually create a virtual environment to install keras/python using conda,
# as this is somehow broken on the Turing Cluster when using the
# `keras::install_keras()` function
cd poet

# create a virtual environment in the .ai directory with python 3.11
conda create -p ./.ai python=3.11
conda activate ./.ai

# install tensorflow and keras
pip install keras tensorflow[and-cuda]

# add conda's python path to the R environment
# make sure to have the conda environment activated
echo -e "RETICULATE_PYTHON=$(which python)\n" >> ~/.Renviron
```

To use the AI surrogate, you must declare several values in the R input
script. This can be done either directly in the input script or in an
additional file. This file can be provided by adding the file path as the
element `ai_surrogate_input_script` to the `chemistry_setup` list in the
R input script.

The following variables and functions must be declared:

- `model_file_path` [*string*]: Path to the Keras model file with which
  the AI surrogate model is initialized.
- `validate_predictions(predictors, prediction)` [*function*]: Returns a boolean
  vector of length `nrow(predictions)`. The output of this function defines
  which predictions are considered valid and which are rejected. Regular
  simulation will only be done for the rejected values, and the results
  will be added to the training data buffer of the AI surrogate model.
  Can e.g. be implemented as a mass balance threshold between the
  predictors and the prediction (see the sketch below).
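
For illustration, a minimal input script covering the two required declarations might look like the following sketch. The file name `ai_setup.R`, the `Ba` column, and the 1 % tolerance are illustrative assumptions, not POET defaults:

```r
# ai_setup.R -- hypothetical example of an ai_surrogate_input_script.
# The "Ba" column and the 1 % tolerance are illustrative assumptions.

# Path to the Keras model that initializes the AI surrogate
model_file_path <- "barite_model.keras"

# Accept a prediction only where total barium is conserved within 1 %
validate_predictions <- function(predictors, prediction) {
  abs(prediction$Ba - predictors$Ba) <= 0.01 * abs(predictors$Ba)
}
```

The file would then be referenced from the main input script, e.g. via
`chemistry_setup$ai_surrogate_input_script <- "ai_setup.R"`.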

The following variables and functions can be declared (a sketch follows the
list):

- `batch_size` [*int*]: Batch size for the inference and training functions,
  defaults to 2560.
- `training_epochs` [*int*]: Number of training epochs with each training data
  set, defaults to 20.
- `training_data_size` [*int*]: Size of the training data buffer. After the
  buffer has been filled, the model starts training and removes this amount of
  data from the front of the buffer. Defaults to the size of the Field.
- `use_Keras_predictions` [*bool*]: Decides if the Keras prediction function
  should be used instead of the custom C++ implementation (Keras might be
  faster for larger models, especially on GPU). Defaults to false.
- `preprocess(df, backtransform = FALSE, outputs = FALSE)` [*function*]:
  Returns the scaled/transformed/backtransformed dataframe. The `backtransform`
  flag signals if the current processing step is applied to data that is
  assumed to be scaled and expects backtransformed values. The `outputs`
  flag signals if the current processing step is applied to the output
  or target of the model. This can be used to e.g. skip these processing
  steps and only scale the model input. The default implementation uses no
  transformations.
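
As a sketch of these optional declarations, the following overrides the tuning values and log-scales only the model inputs; the chosen numbers and the log10 transform are illustrative assumptions (and assume strictly positive values):

```r
# Illustrative overrides of the optional tuning values
batch_size      <- 5120  # default: 2560
training_epochs <- 10    # default: 20

# Scale only the model inputs with log10; outputs/targets pass through
preprocess <- function(df, backtransform = FALSE, outputs = FALSE) {
  if (outputs) {
    return(df)  # skip processing for the model output/target
  }
  if (backtransform) {
    10^df       # undo the log10 scaling
  } else {
    log10(df)   # forward transform of the inputs
  }
}
```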

After setting up the R environment, recompile POET and you're ready to run the
AI surrogate.

```sh
cd <installation_dir>/bin
```

@@ -319,45 +334,4 @@ important information from the OpenMPI Man Page:

For example, on platforms that support it, the clock_gettime()
function will be used to obtain a monotonic clock value with whatever
precision is supported on that platform (e.g., nanoseconds).

## Additional functions for the AI surrogate

The AI surrogate can be activated for any benchmark and is by default
initiated as a sequential keras model with three hidden layers of 48, 96,
and 24 units with relu activation and the adam optimizer. All functions in
`ai_surrogate_model_functions.R` can be overridden by adding custom
definitions via an R file: simply add the path to this file as an element
called `ai_surrogate_input_script` to the `chemistry_setup` list in the
input script. Please use the global variable `ai_surrogate_base_path` as a
base path when relative file paths are used in custom functions.

**There is currently no default implementation to determine the
validity of predicted values.** This means that every input script
must include an R source file with a custom function
`validate_predictions(predictors, prediction)`. Examples of custom
functions can be found in the barite_200 benchmark.

The functions can be defined as follows:

`validate_predictions(predictors, prediction)`: Returns a boolean
index vector that signals for each row in the predictions whether the
values are considered valid. Can e.g. be implemented as a mass balance
threshold between the predictors and the prediction.

`initiate_model()`: Returns a keras model. Can be used to load
pretrained models.
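
For instance, `initiate_model` could load a pretrained file or rebuild the default architecture described above (hidden layers of 48, 96, and 24 units, relu activation, adam optimizer) with the keras3 R API. The input and output widths below are placeholders that must match your predictor and target columns:

```r
library(keras3)

initiate_model <- function() {
  n_in  <- 12  # hypothetical number of predictor columns
  n_out <- 12  # hypothetical number of target columns
  model <- keras_model_sequential(input_shape = n_in) |>
    layer_dense(units = 48, activation = "relu") |>
    layer_dense(units = 96, activation = "relu") |>
    layer_dense(units = 24, activation = "relu") |>
    layer_dense(units = n_out)
  compile(model, optimizer = "adam", loss = "mse")
  model
}
```

A pretrained model could instead be returned, e.g. with
`load_model(model_file_path)`.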

`preprocess(df, backtransform = FALSE, outputs = FALSE)`: Returns the
scaled/transformed/backtransformed dataframe. The `backtransform` flag
signals if the current processing step is applied to data that is
assumed to be scaled and expects backtransformed values. The `outputs`
flag signals if the current processing step is applied to the output
or target of the model. This can be used to e.g. skip these processing
steps and only scale the model input.

`training_step(model, predictor, target, validity)`: Trains the model
after each iteration. `validity` is the boolean index vector given by
`validate_predictions` and can e.g. be used to train only on values
that were not valid predictions.
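
A custom `training_step` that trains only on the rows whose predictions were rejected could look like this sketch; the epoch count and the matrix conversion are assumptions:

```r
training_step <- function(model, predictor, target, validity) {
  # Keep only the rows that were not valid surrogate predictions
  x <- as.matrix(predictor[!validity, , drop = FALSE])
  y <- as.matrix(target[!validity, , drop = FALSE])
  if (nrow(x) > 0) {
    fit(model, x, y, epochs = 20, verbose = 0)  # e.g. the default 20 epochs
  }
  model  # keras models are reference objects and are updated in place
}
```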
@@ -428,7 +428,7 @@ static Rcpp::List RunMasterLoop(RInsidePOET &R, const RuntimeParameters &params,

    training_data_buffer_mutex.unlock();

    // Signal to training thread if training data buffer is full
    if (training_data_buffer.y[0].size() > params.training_data_size) {
      start_training = true;
      training_data_buffer_full.notify_one();
    }