feat: manipulate ai_surrogate_species

test: remove species for ai
refactor: be more consistent with int and size_t
2025-12-16 04:48:23 +01:00 · 2024-11-02 17:08:22 +01:00 · 2024-11-01 19:56:00 +01:00 · 2024-10-31 19:56:14 +01:00 · 2024-10-30 16:51:20 +01:00 · 2024-10-30 16:48:00 +01:00
18 changed files with 1590 additions and 347 deletions
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@ -16,7 +16,6 @@ list(APPEND CMAKE_MODULE_PATH "${POET_SOURCE_DIR}/CMake")
 get_poet_version()
 find_package(MPI REQUIRED)
 find_package(RRuntime REQUIRED)
 add_subdirectory(src)
--- a/README.md
+++ b/README.md
@ -98,7 +98,12 @@ following available options:
  slowed down significantly. Defaults to _OFF_.
 - **POET_PREPROCESS_BENCHS**=*boolean* - enables the preprocessing of
  predefined models/benchmarks. Defaults to *ON*.
-  
+- **USE_AI_SURROGATE**=*boolean* - includes the functions of the AI
  surrogate model. When active, CMake relies on `find_package()` to find
  an a implementation of `Threads` and a Python environment where Numpy
  and Keras need to be installed. Defaults to _OFF_.
 ### Example: Build from scratch
 Assuming that only the C/C++ compiler, MPI libraries, R runtime
@ -231,37 +236,78 @@ mpirun -n 4 ./poet --dht --dht-snaps=2 barite_het_rt.R barite_het.rds output
 ### Example: Preparing Environment and Running with AI surrogate
-To run the AI surrogate, you need to install the R package `keras3`. The
+To run the AI surrogate, you need to have a Keras installed in your
-compilation process of POET remains the same as shown above.
+Python environment. The implementation in POET is agnostic to the exact
 Keras version, but the provided model file must match your Keras version.
 Using Keras 3 with `.keras` model files is recommended. The compilation
 process of POET remains mostly the same as shown above, but the CMake
 option `-DUSE_AI_SURROGATE=ON` must be set.
-In the following code block, the installation process on the Turing Cluster is
+To use the AI surrogate, you must declare several values in the R input
-shown. `miniconda` is used to create a virtual environment to install
+script. This can be either done directly in the input script or in an
-tensorflow/keras. Please adapt the installation process to your needs.
+additional file. This file can be provided by adding the file path as the
 element `ai_surrogate_input_script` to the `chemistry_setup` list in the
 R input script.
-<!-- Start an R interactive session and install the required packages: -->
+The following variables and functions must be declared:
 - `model_file_path` [*string*]: Path to the Keras model file with which 
 the AI surrogate model is initialized. 
-```sh
+- `validate_predictions(predictors, prediction)` [*function*]: Returns a
-# First, install the required R packages
+boolean vector of length `nrow(predictions)`. The output of this function
-R -e "install.packages('keras3', repos='https://cloud.r-project.org/')"
+defines which predictions are considered valid and which are rejected. 
 the predictors and predictions are passed in their original original (not
 transformed) scale. Regular simulation will only be done for the rejected
 values. The input data of the rejected rows and the respective true results
 from simulation will be added to the training data buffer of the AI surrogate
 model. Can eg. be implemented as a mass balance threshold between the
 predictors and the prediction.
 # manually create a virtual environment to install keras/python using conda, 
 # as this is somehow broken on the Turing Cluster when using the `keras::install_keras()` function
 cd poet
-# create a virtual environment in the .ai directory with python 3.11
+The following variables and functions can be declared:
-conda create -p ./.ai python=3.11
+- `batch_size` [*int*]: Batch size for the inference and training functions,
-conda activate ./.ai
+defaults to 2560.
-# install tensorflow and keras
+- `training_epochs` [*int*]: Number of training epochs with each training data
-pip install keras tensorflow[and-cuda]
+ set, defaults to 20.
 - `training_data_size` [*int*]: Size of the training data buffer. After
 the buffer has been filled, the model starts training and removes this amount
 of data from the front of the buffer. Defaults to the size of the Field.
-# add conda's python path to the R environment
+- `use_Keras_predictions` [*bool*]: Decides if the Keras prediction function
-# make sure to have the conda environment activated
+should be used instead of the custom C++ implementation (Keras might be faster
-echo -e "RETICULATE_PYTHON=$(which python)\n" >> ~/.Renviron
+for larger models, especially on GPU). Defaults to false.
-```
+
 - `use_k_means_clustering` [*bool*]: Decides if the K-Means clustering function
 will be used to separate the field in a reactive and a non-reactive cluster.
 Training and inference will be done with separate models for each cluster.
 Defaults to false.
 - `model_reactive_file_path` [*string*]: Path to the Keras model file with
 which the AI surrogate model for the reactive cluster is initialized. If 
 ommitted, the models for both clusters will be initialized from
 `model_file_path`
 - `disable_training` [*bool*]: Deactivates the training functions. Defaults to
 false.
 - `train_only_invalid` [*bool*]: Use only the data from PHREEQC for training
 instead of the whole field (which might contain the models own predictions).
 Defaults to false. 
 - `save_model_path` [*string*]: After each training step the current model
 is saved to this path as a .keras file.
 - `preprocess(df)` [*function*]: 
 Returns the scaled/transformed data frame. The default implementation uses no
 scaling or transformations.
 - `postprocess(df)` [*function*]: 
 Returns the rescaled/backtransformed data frame. The combination of preprocess()
 and postprocess() is expected to be idempotent. The default implementation uses
 no scaling or transformations.
 After setup the R environment, recompile POET and you're ready to run the AI
 surrogate.
 ```sh
 cd <installation_dir>/bin
@ -311,45 +357,4 @@ important information from the OpenMPI Man Page:
 For example, on platforms that support it, the clock_gettime()
 function will be used to obtain a monotonic clock value with whatever
-precision is supported on that platform (e.g., nanoseconds).
+precision is supported on that platform (e.g., nanoseconds).
 ## Additional functions for the AI surrogate
 The AI surrogate can be activated for any benchmark and is by default
 initiated as a sequential keras model with three hidden layer of depth
 48, 96, 24 with relu activation and adam optimizer. All functions in
 `ai_surrogate_model.R` can be overridden by adding custom definitions
 via an R file in the input script. This is done by adding the path to
 this file in the input script. Simply add the path as an element
 called `ai_surrogate_input_script` to the `chemistry_setup` list.
 Please use the global variable `ai_surrogate_base_path` as a base path
 when relative filepaths are used in custom funtions.
 **There is currently no default implementation to determine the
 validity of predicted values.** This means, that every input script
 must include an R source file with a custom function
 `validate_predictions(predictors, prediction)`. Examples for custom
 functions can be found for the barite_200 benchmark
 The functions can be defined as follows:
 `validate_predictions(predictors, prediction)`: Returns a boolean
 index vector that signals for each row in the predictions if the
 values are considered valid. Can eg. be implemented as a mass balance
 threshold between the predictors and the prediction.
 `initiate_model()`: Returns a keras model. Can be used to load
 pretrained models.
 `preprocess(df, backtransform = FALSE, outputs = FALSE)`: Returns the
 scaled/transformed/backtransformed dataframe. The `backtransform` flag
 signals if the current processing step is applied to data that's
 assumed to be scaled and expects backtransformed values. The `outputs`
 flag signals if the current processing step is applied to the output
 or tatget of the model. This can be used to eg. skip these processing
 steps and only scale the model input.
 `training_step (model, predictor, target, validity)`: Trains the model
 after each iteration. `validity` is the bool index vector given by
 `validate_predictions` and can eg. be used to only train on values
 that have not been valid predictions.
--- a/R_lib/ai_surrogate_model.R
+++ b/R_lib/ai_surrogate_model.R
@ -1,75 +0,0 @@
 ## This file contains default function implementations for the ai surrogate.
 ## To load pretrained models, use pre-/postprocessing or change hyperparameters
 ## it is recommended to override these functions with custom implementations via
 ## the input script. The path to the R-file containing the functions mus be set
 ## in the variable "ai_surrogate_input_script". See the barite_200.R file as an
 ## example and the general README for more information.
 require(keras3)
 require(tensorflow)
 initiate_model <- function() {
  hidden_layers <- c(48, 96, 24)
  activation <- "relu"
  loss <- "mean_squared_error"
  input_length <- length(ai_surrogate_species)
  output_length <- length(ai_surrogate_species)
  ## Creates a new sequential model from scratch
  model <- keras_model_sequential()
  ## Input layer defined by input data shape
  model %>% layer_dense(units = input_length,
                        activation = activation,
                        input_shape = input_length,
                        dtype = "float32")
  for (layer_size in hidden_layers) {
    model %>% layer_dense(units = layer_size,
                          activation = activation,
                          dtype = "float32")
  }
  ## Output data defined by output data shape
  model %>% layer_dense(units = output_length,
                        activation = activation,
                        dtype = "float32")
  model %>% compile(loss = loss,
                    optimizer = "adam")
  return(model)
 }
 gpu_info <- function() {
  msgm(tf_gpu_configured())
 }
 prediction_step <- function(model, predictors) {
  prediction <- predict(model, as.matrix(predictors))
  colnames(prediction) <- colnames(predictors)
  return(as.data.frame(prediction))
 }
 preprocess <- function(df, backtransform = FALSE, outputs = FALSE) {
  return(df)
 }
 postprocess <- function(df, backtransform = TRUE, outputs = TRUE) {
  return(df)
 }
 set_valid_predictions <- function(temp_field, prediction, validity) {
  temp_field[validity == 1, ] <- prediction[validity == 1, ]
  return(temp_field)
 }
 training_step <- function(model, predictor, target, validity) {
  msgm("Training:")
  x <- as.matrix(predictor)
  y <- as.matrix(target[colnames(x)])
  model %>% fit(x, y)
  model %>% save_model_tf(paste0(out_dir, "/current_model.keras"))
 }
--- a/R_lib/ai_surrogate_model_functions.R
+++ b/R_lib/ai_surrogate_model_functions.R
@ -0,0 +1,30 @@
 ## This file contains default function implementations for the ai surrogate.
 ## To use pre-/postprocessing it is recommended to override these functions
 ## with custom implementations via the input script. The path to the R-file
 ## See the barite_50.R file as an example and the general README for more
 ## information.
 preprocess <- function(df) {
  return(df)
 }
 postprocess <- function(df) {
  return(df)
 }
 set_valid_predictions <- function(temp_field, prediction, validity) {
  temp_field[validity == 1, ] <- prediction[validity == 1, ]
  return(temp_field)
 }
 get_invalid_values <- function(df, validity) {
  return(df[validity == 0, ])
 }
 set_field <- function(temp_field, columns, rows, column_name_limit,
                      byrow = FALSE) {
  temp_field <- matrix(temp_field, nrow = rows, byrow = byrow)
  temp_field <- setNames(data.frame(temp_field), columns)
  temp_field <- temp_field[column_name_limit]
  return(temp_field)
 }
--- a/R_lib/kin_r_library.R
+++ b/R_lib/kin_r_library.R
@ -76,9 +76,10 @@ master_iteration_end <- function(setup, state_T, state_C) {
            state_C <- data.frame(state_C, check.names = FALSE)
            ai_surrogate_info <- list(
-                prediction_time = if (exists("ai_prediction_time")) as.integer(ai_prediction_time) else NULL,
+                prediction_time = if (exists("ai_prediction_time")) ai_prediction_time else NULL,
-                training_time = if (exists("ai_training_time")) as.integer(ai_training_time) else NULL,
+                predictions_validity = if (exists("validity_vector")) validity_vector else NULL,
-                valid_predictions = if (exists("validity_vector")) validity_vector else NULL
+                cluster_labels = if (exists("cluster_labels")) cluster_labels else NULL,
                predictions = if (exists("predictions")) predictions else NULL
            )
            SaveRObj(x = list(
--- a/bench/barite/barite_200.R
+++ b/bench/barite/barite_200.R
@ -47,8 +47,7 @@ dht_species <- c(
 )
 chemistry_setup <- list(
-  dht_species = dht_species,
+  dht_species = dht_species
  ai_surrogate_input_script = "./barite_200ai_surrogate_input_script.R"
 )
 # Define a setup list for simulation configuration
--- a/bench/barite/barite_200ai_surrogate_input_script.R
+++ b/bench/barite/barite_200ai_surrogate_input_script.R
@ -1,48 +0,0 @@
 ## load a pretrained model from tensorflow file
 ## Use the global variable "ai_surrogate_base_path" when using file paths
 ## relative to the input script
 initiate_model <- function() {
  init_model <- normalizePath(paste0(ai_surrogate_base_path,
                                     "model_min_max_float64.keras"))
  return(load_model_tf(init_model))
 }
 scale_min_max <- function(x, min, max, backtransform) {
  if (backtransform) {
    return((x * (max - min)) + min)
  } else {
    return((x - min) / (max - min))
  }
 }
 preprocess <- function(df, backtransform = FALSE, outputs = FALSE) {
  minmax_file <- normalizePath(paste0(ai_surrogate_base_path,
                                      "min_max_bounds.rds"))
  global_minmax <- readRDS(minmax_file)
  for (column in colnames(df)) {
    df[column] <- lapply(df[column],
                         scale_min_max,
                         global_minmax$min[column],
                         global_minmax$max[column],
                         backtransform)
  }
  return(df)
 }
 mass_balance <- function(predictors, prediction) {
  dBa <- abs(prediction$Ba + prediction$Barite -
             predictors$Ba - predictors$Barite)
  dSr <- abs(prediction$Sr + prediction$Celestite -
             predictors$Sr - predictors$Celestite)
  return(dBa + dSr)
 }
 validate_predictions <- function(predictors, prediction) {
  epsilon <- 3e-5
  mb <- mass_balance(predictors, prediction)
  msgm("Mass balance mean:", mean(mb))
  msgm("Mass balance variance:", var(mb))
  msgm("Rows where mass balance meets threshold", epsilon, ":",
       sum(mb < epsilon))
  return(mb < epsilon)
 }
--- a/bench/barite/barite_50ai.R
+++ b/bench/barite/barite_50ai.R
--- a/bench/barite/barite_50ai_rt.R
+++ b/bench/barite/barite_50ai_rt.R
--- a/bench/barite/barite_50ai_surr_mdl.R
+++ b/bench/barite/barite_50ai_surr_mdl.R
@ -1,69 +1,148 @@
 ## Time-stamp: "Last modified 2024-05-30 13:27:06 delucia"
 ## load a pretrained model from tensorflow file
 ## Use the global variable "ai_surrogate_base_path" when using file paths
 ## relative to the input script
-initiate_model <- function() {
+model_file_path <- normalizePath(paste0(ai_surrogate_base_path,
-    require(keras3)
+                                        "barite_50ai_all.keras"))
-    require(tensorflow)
+
-    init_model <- normalizePath(paste0(ai_surrogate_base_path,
+batch_size <- 1280
-                                       "barite_50ai_all.keras"))
+training_epochs <- 20
-    Model <- keras3::load_model(init_model)
+save_model_path <- "current_model.keras"
    msgm("Loaded model:")
    print(str(Model))
    return(Model)
 }
 scale_min_max <- function(x, min, max, backtransform) {
-    if (backtransform) {
+  if (backtransform) {
-        return((x * (max - min)) + min)
+    return((x * (max - min)) + min)
-    } else {
+  } else {
-        return((x - min) / (max - min))
+    return((x - min) / (max - min))
-    }
+  }
 }
-minmax <- list(min = c(H = 111.012433592824, O = 55.5062185549492, Charge = -3.1028354471876e-08, 
+## Apply decimal logarithms handling negative values
-                       Ba = 1.87312878574393e-141, Cl = 0, `S(6)` = 4.24227510643685e-07, 
+Safelog <- function (vec) {
-                       Sr = 0.00049382996130541, Barite = 0.000999542409828586, Celestite = 0.244801877115968),
+  rv <- range(vec)
-               max = c(H = 111.012433679682, O = 55.5087003521685, Charge = 5.27666636082035e-07, 
+  if (max(abs(rv)) < 1) {
-                       Ba = 0.0908849779513762, Cl = 0.195697626449355, `S(6)` = 0.000620774752665846, 
+    ret <- vec
-                       Sr = 0.0558680070692722, Barite = 0.756779139057097, Celestite = 1.00075422160624
+    ret[vec == 0] <- 0
-                       ))
+    ret[vec > 0] <- log10(vec[vec > 0])
    ret[vec < 0] <- -log10(-vec[vec < 0])
  } else {
    ret <- log10(abs(vec))
  }
  ret
 }
 Safeexp <- function (vec) {
  ret <- vec
  ret[vec == 0] <- 0
  ret[vec > 0] <- -10^(-vec[vec > 0])
  ret[vec < 0] <- 10^(vec[vec < 0])
  ret
 }
 ##' @title Apply transformations to cols of a data.frame
 ##' @param df named data.frame
 ##' @param tlist list of function names
 ##' @return data.frame with the columns specified in tlist and the
 ##'     transformed numerical values
 ##' @author MDL
 TransfList <- function (df, tlist) {
  vars <- intersect(colnames(df), names(tlist))
  lens <- sapply(tlist[vars], length)
  n <- max(lens)
  filledlist <- lapply(tlist[vars],
                       function(x) 
                         if (length(x) < n)
                           return(c(x, rep("I", n - length(x))))
                         else
                           return(x))
  tmp <- df[, vars]
  for (i in seq_len(n)) {
    tmp <- as.data.frame(sapply(vars, function(x)
                                  do.call(filledlist[[x]][i], list(tmp[, x]))))
  }
  return(tmp)
 }
 ##' This function applies some specified string substitution such as
 ##' 's/log/exp/' so that from a list of "forward" transformation
 ##' functions it computes a "backward" one
 ##' @title Apply back transformations to cols of a data.frame
 ##' @param df named data.frame
 ##' @param tlist list of function names
 ##' @return data.frame with the columns specified in tlist and the
 ##'     backtransformed numerical values
 ##' @author MDL
 BackTransfList <- function (df, tlist) {
  vars <- intersect(colnames(df), names(tlist))
  lens <- sapply(tlist[vars], length)
  n <- max(lens)
  filledlist <- lapply(tlist[vars],
                       function(x)
                         if (length(x) < n)
                           return(c(x, rep("I", n - length(x))))
                         else
                           return(x))
  rlist <- lapply(filledlist, rev)
  rlist <- lapply(rlist, sub, pattern = "log", replacement = "exp")
  rlist <- lapply(rlist, sub, pattern = "1p", replacement = "m1")
  rlist <- lapply(rlist, sub, pattern = "Mul", replacement = "Div")
  tmp <- df[, vars]
  for (i in seq_len(n)) {
    tmp <- as.data.frame(sapply(vars, function(x)
                                  do.call(rlist[[x]][i], list(tmp[, x]))))
  }
  return(tmp)
 }
 tlist <- list("H" = "log1p", "O" = "log1p", "Charge" = "Safelog",
              "Ba" = "log1p", "Cl" = "log1p", "S(6)" = "log1p",
              "Sr" = "log1p", "Barite" = "log1p", "Celestite" = "log1p")
 minmaxlog <- list(min = c(H = 4.71860987935512, O = 4.03435069501537,
                          Charge = -14.5337693764617, Ba = 1.87312878574393e-141,
                          Cl = 0, `S(6)` = 4.2422742065922e-07,
                          Sr = 0.00049370806741832, Barite = 0.000999043199940672,
                          Celestite = 0.218976382406766),
                  max = c(H = 4.71860988013054, O = 4.03439461483133,
                          Charge = 12.9230752782909, Ba = 0.086989273200771,
                          Cl = 0.178729802869381, `S(6)` = 0.000620582151722819,
                          Sr = 0.0543631841661421, Barite = 0.56348209786429,
                          Celestite = 0.69352422027466))
 preprocess <- function(df) {
-    if (!is.data.frame(df))
+  if (!is.data.frame(df))
-        df <- as.data.frame(df, check.names = FALSE)
+    df <- as.data.frame(df, check.names = FALSE)
-    
+  tlog <- TransfList(df, tlist)
-    as.data.frame(lapply(colnames(df),
+  as.data.frame(lapply(colnames(tlog),
-                         function(x) scale_min_max(x=df[x],
+                        function(x) scale_min_max(x = tlog[x], 
-                                                   min=minmax$min[x],
+                                                  min = minmaxlog$min[x],
-                                                   max=minmax$max[x],
+                                                  max = minmaxlog$max[x],
-                                                   backtransform=FALSE)),
+                                                  backtransform = FALSE)),
-                  check.names = FALSE)
+                check.names = FALSE)
 }
 postprocess <- function(df) {
-    if (!is.data.frame(df))
+  if (!is.data.frame(df))
-        df <- as.data.frame(df, check.names = FALSE)
+    df <- as.data.frame(df, check.names = FALSE)
-    
+  ret <- as.data.frame(lapply(colnames(df),
-    as.data.frame(lapply(colnames(df),
+                              function(x) scale_min_max(x = df[x],
-                         function(x) scale_min_max(x=df[x],
+                                                        min = minmaxlog$min[x],
-                                                   min=minmax$min[x],
+                                                        max = minmaxlog$max[x],
-                                                   max=minmax$max[x],
+                                                        backtransform = TRUE)),
-                                                   backtransform=TRUE)),
+                       check.names = FALSE)
-                  check.names = FALSE)
+
  BackTransfList(ret, tlist)
 }
-mass_balance <- function(predictors, prediction) {
+mass_balance <- function(x, y) {
-    dBa <- abs(prediction$Ba + prediction$Barite -
+  dBa <- abs(y$Ba + y$Barite -
-               predictors$Ba - predictors$Barite)
+             (x$Ba + x$Barite))
-    dSr <- abs(prediction$Sr + prediction$Celestite -
+  dSr <- abs(y$Sr + y$Celestite -
-               predictors$Sr - predictors$Celestite)
+             (x$Sr + x$Celestite))
-    return(dBa + dSr)
+  return(dBa + dSr)
 }
 validate_predictions <- function(predictors, prediction) {
-    epsilon <- 1E-7
+    epsilon <- 1E-5
    mb <- mass_balance(predictors, prediction)
    msgm("Mass balance mean:", mean(mb))
    msgm("Mass balance variance:", var(mb))
@ -72,19 +151,3 @@ validate_predictions <- function(predictors, prediction) {
         sum(ret))
    return(ret)
 }
 training_step <- function(model, predictor, target, validity) {
  msgm("Starting incremental training:")
  ## x <- as.matrix(predictor)
  ## y <- as.matrix(target[colnames(x)])
  history <- model %>% keras3::fit(x = data.matrix(predictor),
                                   y = data.matrix(target),
                                   epochs = 10, verbose=1)
  keras3::save_model(model,
                     filepath = paste0(out_dir, "/current_model.keras"),
                     overwrite=TRUE)
  return(model)
 }
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@ -16,17 +16,69 @@ target_sources(POETLib
  Chemistry/SurrogateModels/HashFunctions.cpp
  Chemistry/SurrogateModels/InterpolationModule.cpp
  Chemistry/SurrogateModels/ProximityHashTable.cpp
 ) 
 target_include_directories(
 POETLib PUBLIC
 "${CMAKE_CURRENT_SOURCE_DIR}"
 "${CMAKE_CURRENT_BINARY_DIR}"
 )
-target_include_directories(POETLib PUBLIC "${CMAKE_CURRENT_SOURCE_DIR}")
+option(USE_AI_SURROGATE "Compiles the AI surrogate functions with Python.h integration" OFF)
 if(USE_AI_SURROGATE)
  add_definitions(-DUSE_AI_SURROGATE)
  message("Building with AI surrogate functions")
  message(" -- Needs Python (with Numpy & Keras) and Threads libraries")
  find_package(CUDA)
  IF (${CUDA_FOUND})
    message(" -- Setting TensorFlow CUDA path to: ${CUDA_TOOLKIT_ROOT_DIR}")
  ELSE (${CUDA_FOUND})
  message(" -- No CUDA installation found for TensorFlow  ")
  ENDIF (${CUDA_FOUND})
  # make sure to use the python installation from the conda environment
  if(DEFINED ENV{CONDA_PREFIX})
    set(Python3_EXECUTABLE "$ENV{CONDA_PREFIX}/bin/python3")
  endif()
  find_package(Python3 COMPONENTS Interpreter Development NumPy REQUIRED)
  find_package(Threads REQUIRED)
  set_source_files_properties(
    Chemistry/SurrogateModels/AI_functions.cpp
    PROPERTIES COMPILE_FLAGS "-O3 -march=native -mtune=native"
  )
  target_sources(POETLib
   PRIVATE
   Chemistry/SurrogateModels/AI_functions.cpp
  )
  target_include_directories(POETLib PUBLIC
   "${Python3_INCLUDE_DIRS}"
   "${Python3_NumPy_INCLUDE_DIRS}"
  )
  target_link_libraries(POETLib
   PUBLIC "${Python3_LIBRARIES}"
   PUBLIC Threads::Threads pthread
  )
 endif() # USE_AI_SURROGATE
 target_link_libraries(
-  POETLib 
+ POETLib
-  PUBLIC RRuntime
+ PUBLIC RRuntime
-  PUBLIC IPhreeqcPOET
+ PUBLIC IPhreeqcPOET
-  PUBLIC tug
+ PUBLIC tug
-  PUBLIC MPI::MPI_C
+ PUBLIC MPI::MPI_C
 )
 # Define Python API version
 target_compile_definitions(POETLib PRIVATE NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION)
 include(FetchContent)
 FetchContent_Declare(
  cli11
@ -80,10 +132,11 @@ target_compile_definitions(POETLib PUBLIC STRICT_R_HEADERS OMPI_SKIP_MPICXX)
 file(READ "${PROJECT_SOURCE_DIR}/R_lib/kin_r_library.R" R_KIN_LIB )
 file(READ "${PROJECT_SOURCE_DIR}/R_lib/init_r_lib.R" R_INIT_LIB)
-file(READ "${PROJECT_SOURCE_DIR}/R_lib/ai_surrogate_model.R" R_AI_SURROGATE_LIB)
+file(READ "${PROJECT_SOURCE_DIR}/R_lib/ai_surrogate_model_functions.R" R_AI_SURROGATE_LIB)
 configure_file(poet.hpp.in poet.hpp @ONLY)
 add_executable(poet poet.cpp)
 target_link_libraries(poet PRIVATE POETLib MPI::MPI_C RRuntime CLI11::CLI11)
 target_include_directories(poet PRIVATE "${CMAKE_CURRENT_BINARY_DIR}")
--- a/src/Chemistry/SurrogateModels/AI_Python_functions/keras_AI_surrogate.py
+++ b/src/Chemistry/SurrogateModels/AI_Python_functions/keras_AI_surrogate.py
@ -0,0 +1,49 @@
 import tensorflow as tf
 import numpy as np
 from sklearn.cluster import KMeans
 import os
 os.environ["TF_XLA_FLAGS"] = "--tf_xla_cpu_global_jit"
 os.environ["XLA_FLAGS"] = "--xla_gpu_cuda_data_dir=" + cuda_dir
 def k_means(data, k=2, tol=1e-6):
    kmeans = KMeans(n_clusters=k, tol=tol)
    labels = kmeans.fit_predict(data)
    return labels
 def initiate_model(model_file_path):
    print("AI: Model loaded from: " + model_file_path, flush=True)
    model = tf.keras.models.load_model(model_file_path)
    return model
 def prediction_step(model, model_reactive, x, cluster_labels, batch_size):
    # Catch input size mismatches
    model_input_shape = model.input_shape[1:]
    if model_input_shape != x.shape[1:]:
        print(f"Input data size {x.shape[1:]} does not match model input size {model_input_shape}",
              flush=True)        
    # Predict separately if clustering is used
    if cluster_labels:
        cluster_labels = np.asarray(cluster_labels, dtype=bool)
        # Combine results
        prediction = np.zeros_like(x)
        prediction[cluster_labels] = model_reactive.predict(x[cluster_labels], batch_size)
        prediction[~cluster_labels] = model.predict(x[~cluster_labels], batch_size)
    else:
        prediction = model.predict(x, batch_size)
    return np.array(prediction, dtype=np.float64)
 def get_weights(model):
    weights = model.get_weights()
    return weights
 def training_step(model, x, y, batch_size, epochs, 
                  output_file_path):
    history = model.fit(x, y, 
                        epochs=epochs,
                        batch_size=batch_size)
    if output_file_path:
        model.save(output_file_path)
    return history
--- a/src/Chemistry/SurrogateModels/AI_functions.cpp
+++ b/src/Chemistry/SurrogateModels/AI_functions.cpp
@ -0,0 +1,880 @@
 #include <iostream>
 #include <string>
 #include <cstring>
 #include <vector>
 #include <Python.h>
 #include <numpy/arrayobject.h>
 #include <Eigen/Dense>
 #include <thread>
 #include <mutex>
 #include <condition_variable>
 #include "AI_functions.hpp"
 #include "poet.hpp"
 using namespace std;
 namespace poet {
 /**
 * @brief Loads the Python interpreter and functions  
 * @param functions_file_path Path to the Python file where the AI surrogate
 * functions are defined
 * @return 0 if function was succesful
 */
 int Python_Keras_setup(std::string functions_file_path, std::string cuda_src_dir) {
  // Initialize Python functions
  Py_Initialize();
  // Import numpy functions
  _import_array();
  PyRun_SimpleString(("cuda_dir = \"" + cuda_src_dir + "\"").c_str()) ;
  FILE* fp = fopen(functions_file_path.c_str(), "r");
  int py_functions_initialized = PyRun_SimpleFile(fp, functions_file_path.c_str());
  fclose(fp);
  if (py_functions_initialized != 0) {
    PyErr_Print();
    throw std::runtime_error(std::string("AI surrogate Python functions could not be loaded." ) + 
                             "Are tensorflow and numpy installed?");
  }
  return py_functions_initialized;
 }
 /**
 * @brief Loads the user-supplied Keras model  
 * @param model_file_path Path to a .keras file that the user must supply as
 * a variable "model_file_path" in the R input script
 * @return 0 if function was succesful
 */
 int Python_Keras_load_model(std::string model, std::string model_reactive, bool use_clustering) {
  // Acquire the Python GIL
  PyGILState_STATE gstate = PyGILState_Ensure();
  // Initialize Keras default model
  int py_model_loaded = PyRun_SimpleString(("model = initiate_model(\"" + 
                                             model + "\")").c_str());
  if (py_model_loaded != 0) {
    PyErr_Print();
    throw std::runtime_error("Keras model could not be loaded from: " + model);
  }
  if (use_clustering) {
    // Initialize second Keras model that will be used for the "reaction" cluster
    py_model_loaded = PyRun_SimpleString(("model_reactive = initiate_model(\"" + 
                                          model_reactive + "\")").c_str());
    if (py_model_loaded != 0) {
      PyErr_Print();
      throw std::runtime_error("Keras model could not be loaded from: " + model_reactive);
    }
  }
  // Release the Python GIL
  PyGILState_Release(gstate);
  return py_model_loaded;
 }
 /**
 * @brief Calculates the euclidian distance between two points in n dimensional space
 * @param a Point a
 * @param b Point b
 * @return The distance
 */
 double distance(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    for (size_t i = 0; i < a.size(); ++i) {
        sum += (a[i] - b[i]) * (a[i] - b[i]);
    }
    return sqrt(sum);
 }
 /**
 * @brief Assigns all elements of a 2D-Matrix to the nearest cluster center point
 * @param field 2D-Matrix with the content of a Field object
 * @param clusters The vector of clusters represented by their center points
 * @return A vector that contains the assigned cluster for each of the rows in field
 */
 std::vector<int> assign_clusters(const std::vector<vector<double>>& field, const std::vector<vector<double>>& clusters) {
  // Initiate a vector that holds the cluster labels of each row
  std::vector<int> labels(field[0].size());
 for (size_t row = 0; row < labels.size(); row++) {
    // Get the coordinates of the current row
    std::vector<double> row_data(field.size()); 
    for (size_t column = 0; column < row_data.size(); column++) {
      row_data[column] = field[column][row];
    }
    // Iterate over the clusters and check which cluster center is the closest
    double current_min_distance = numeric_limits<double>::max();
    int current_closest_cluster;
    for (size_t cluster = 0; cluster < clusters.size(); cluster++) {
      double cluster_distance = distance(row_data, clusters[cluster]);
      if (cluster_distance < current_min_distance) {
        current_min_distance = cluster_distance;
        current_closest_cluster = cluster;
      }
    }
    labels[row] = current_closest_cluster;
  }
  return labels;
 }
 /**
 * @brief Calculates new center points for each given cluster by averaging the coordinates
 * of all points that are assigen to it
 * @param field 2D-Matrix with the content of a Field object
 * @param labels The vector that contains the assigned cluster for each of the rows in field
 * @param k The number of clusters
 * @return The new cluster center points
 */
 std::vector<vector<double>> calculate_new_clusters(const std::vector<std::vector<double>>& field,
                                                   const vector<int>& labels, int k) {
  size_t columns = field.size();
  size_t rows = field[0].size();
  std::vector<std::vector<double>> clusters(k, std::vector<double>(columns, 0.0));
  vector<int> count(k, 0);
  // Sum the coordinates of all points that are assigned to each cluster 
  for (size_t row = 0; row < rows; row++) {
    int assigned_cluster = labels[row];
    for (size_t column = 0; column < columns; column++) {
      clusters[assigned_cluster][column] += field[column][row];
    }
    count[assigned_cluster]++;
    }
  // Take the average of the summed coordinates
  for (size_t cluster = 0; cluster < k; cluster++) {
    if (count[cluster] == 0) continue;
    for (size_t column = 0; column < columns; column++) {
      clusters[cluster][column] /= count[cluster];
    }
  }
  return clusters;
 }
 /**
 * @brief Performs KMeans clustering for the elements of a 2D-Matrix
 * @param field 2D-Matrix with the content of a Field object
 * @param k The number of different clusters
 * @param iterations The number of cluster update steps
 * @return A vector that contains the assigned cluster for each of the rows in field
 */
 std::vector<int> K_Means(std::vector<std::vector<double>>& field, int k, int iterations) {
  // Initialize cluster centers by selecting random points from the field 
  srand(time(0));
  std::vector<vector<double>> clusters;
  for (size_t i = 0; i < k; ++i) {
    std::vector<double> cluster_center(field.size());
    int row = rand() % field.size();
    for (size_t column = 0; column < cluster_center.size(); column++) {
      cluster_center[column] = field[column][row];
    }
    clusters.push_back(cluster_center);
  } 
  std::vector<int> labels;
  for (size_t iter = 0; iter < iterations; ++iter) {
    // Get the nearest cluster for each row
    labels = assign_clusters(field, clusters);
    // Update each cluster center as the average location of each point assigned to it
    std::vector<vector<double>> new_clusters = calculate_new_clusters(field, labels, k);
    clusters = new_clusters;
  }
  // Always define the reactive cluster as cluster 1
  // Interprete the reactive cluster as the one on the origin of the field
  // TODO: Is that always correct?  
  int reactive_cluster = labels[0];
  if (reactive_cluster == 0) {
    for (size_t i; i < labels.size(); i++) {
          labels[i] = 1 - labels[i];
    }
  }
  return labels;
 }
 /**
 * @brief Converts the std::vector 2D matrix representation of a POET Field object to a numpy array
 * for use in the Python AI surrogate functions
 * @param field 2D-Matrix with the content of a Field object
 * @return Numpy representation of the input vector
 */
 PyObject* vector_to_numpy_array(const std::vector<std::vector<double>>& field) {
  npy_intp dims[2] = {static_cast<npy_intp>(field[0].size()), 
                      static_cast<npy_intp>(field.size())};
  PyObject* np_array = PyArray_SimpleNew(2, dims, NPY_FLOAT64);
  double* data = static_cast<double*>(PyArray_DATA((PyArrayObject*)np_array));
  // write field data to numpy array
  for (size_t i = 0; i < field.size(); ++i) {
    for (size_t j = 0; j < field[i].size(); ++j) {
      data[j * field.size() + i] = field[i][j];
    }
  }
  return np_array;
 }
 /**
 * @brief Converts a Pyton matrix object to a std::vector vector
 * @param py_matrix Pyobject that must be a 2D matrix
 * @result Vector that can be used similar to the return value of the Field object
 * Field.AsVector() method.
 */
 std::vector<double> numpy_array_to_vector(PyObject* py_array) {
  std::vector<double> result;
  if (!PyArray_Check(py_array)) {
    std::cerr << "The model's output is not a numpy array." << std::endl;
    return result;
  }
  // Cast generic PyObject to PyArrayObject
  PyArrayObject* np_array = reinterpret_cast<PyArrayObject*>(py_array);
  // Get shape
  int numDims = PyArray_NDIM(np_array);
  npy_intp* shape = PyArray_SHAPE(np_array);
  if (numDims != 2) {
    std::cerr << "The model's predictions are not a 2D matrix." << std::endl;
    return result;
  }
  // Copy data into std::vector format
  double* data = static_cast<double*>(PyArray_DATA(np_array));
  npy_intp size = PyArray_SIZE(np_array);
  result.resize(size);
  std::copy(data, data + size, result.begin());
  return result;
 }
 /**
 * @brief Uses the Python Keras functions to calculate predictions from a neural network.
 * @param x 2D-Matrix with the content of a Field object
 * @param batch_size size for mini-batches that are used in the Keras model.predict() method
 * @return Predictions that the neural network made from the input values x. The predictions are
 * represented as a vector similar to the representation from the Field.AsVector() method
 */
 std::vector<double> Python_Keras_predict(std::vector<std::vector<double>>& x, int batch_size,
                                         std::vector<int>& cluster_labels) {
  // Acquire the Python GIL
  PyGILState_STATE gstate = PyGILState_Ensure();
  // Prepare data for Python
  PyObject* py_df_x = vector_to_numpy_array(x);
  // Prepare cluster label vector for Python
  PyObject* py_cluster_list = PyList_New(cluster_labels.size());
  for (size_t i = 0; i < cluster_labels.size(); i++) {
    PyObject* py_int = PyLong_FromLong(cluster_labels[i]);
    PyList_SET_ITEM(py_cluster_list, i, py_int);
  }
  // Get the model and inference function from the global python interpreter
  PyObject* py_main_module = PyImport_AddModule("__main__");
  PyObject* py_global_dict = PyModule_GetDict(py_main_module);
  PyObject* py_keras_model = PyDict_GetItemString(py_global_dict, "model");
  PyObject* py_inference_function = PyDict_GetItemString(py_global_dict, "prediction_step");
  // Get secod model if clustering is used
  PyObject* py_keras_model_reactive = Py_None;;
  if (cluster_labels.size() > 0) {
    py_keras_model_reactive = PyDict_GetItemString(py_global_dict, "model_reactive");
  }
  // Build the function arguments as four python objects and an integer
  PyObject* args = Py_BuildValue("(OOOOi)",
      py_keras_model, py_keras_model_reactive, py_df_x, py_cluster_list, batch_size);
  // Call the Python inference function
  PyObject* py_predictions = PyObject_CallObject(py_inference_function, args);
  // Check py_rv and return as 2D vector
  std::vector<double> predictions = numpy_array_to_vector(py_predictions); 
  // Clean up
  PyErr_Print(); 
  Py_XDECREF(py_df_x);
  Py_XDECREF(py_cluster_list);
  Py_XDECREF(args);
  Py_XDECREF(py_predictions);
  // Release the Python GIL
  PyGILState_Release(gstate);
  return predictions;
 }
 /**
 * @brief Uses Eigen for fast inference with the weights and biases of a neural network 
 * @param input_batch Batch of input data that must fit the size of the neural networks input layer
 * @param model Struct of aligned Eigen vectors that hold the neural networks weights and biases.
 * Only supports simple fully connected feed forward networks.
 * @return The batch of predictions made with the neural network weights and biases and the data
 * in input_batch
 */
 Eigen::MatrixXd eigen_inference_batched(const Eigen::Ref<const Eigen::MatrixXd>& input_batch, const EigenModel& model) {
    Eigen::MatrixXd current_layer = input_batch;
    // Process all hidden layers
    for (size_t layer = 0; layer < model.weight_matrices.size() - 1; ++layer) {
        current_layer = (model.weight_matrices[layer] * current_layer);
        current_layer = current_layer.colwise() + model.biases[layer];
        current_layer = current_layer.array().max(0.0);
    }
    // Process output layer (without ReLU)
    size_t output_layer = model.weight_matrices.size() - 1;
    return (model.weight_matrices[output_layer] * current_layer).colwise() + model.biases[output_layer];
 }
 /**
 * @brief Uses the Eigen representation of the two different Keras model weights for fast inference
 * @param model The model for the non reactive cluster of the field (label 0)
 * @param model_reactive The model for the non reactive cluster of the field (label 1)
 * @param x 2D-Matrix with the content of a Field object
 * @param batch_size size for mini-batches that are used in the Keras model.predict() method
 * @param Eigen_model_mutex Mutex that locks the model during inference and prevents updaties from
 * the training thread
 * @param cluster_labels K-Means cluster label dor each row in the field
 * @return Predictions that the neural network made from the input values x. The predictions are
 * represented as a vector similar to the representation from the Field.AsVector() method
 */
 std::vector<double> Eigen_predict_clustered(const EigenModel& model, const EigenModel& model_reactive,
                                            std::vector<std::vector<double>>& x, int batch_size,
                                            std::mutex* Eigen_model_mutex, std::vector<int>& cluster_labels) {
    const int num_samples = x[0].size();
    const int num_features = x.size();
    if (num_features != model.weight_matrices[0].cols() || 
        num_features != model_reactive.weight_matrices[0].cols()) {
        throw std::runtime_error("Input data size " + std::to_string(num_features) + 
            " does not match model input layer sizes" + std::to_string(model.weight_matrices[0].cols()) + 
            " / " + std::to_string(model_reactive.weight_matrices[0].cols()));
    }
    // Convert input data to Eigen matrix
    Eigen::MatrixXd full_input_matrix(num_features, num_samples);
    for (size_t i = 0; i < num_samples; ++i) {
        for (size_t j = 0; j < num_features; ++j) {
            full_input_matrix(j, i) = x[j][i];
        }
    }
    // Create indices for each cluster
    std::vector<int> cluster_0_indices, cluster_1_indices;
    for (size_t i = 0; i < cluster_labels.size(); ++i) {
        if (cluster_labels[i] == 0) {
            cluster_0_indices.push_back(i);
        } else {
            cluster_1_indices.push_back(i);
        }
    }
    // Prepare matrices for each cluster
    Eigen::MatrixXd input_matrix(num_features, cluster_0_indices.size());
    Eigen::MatrixXd input_matrix_reactive(num_features, cluster_1_indices.size());
    // Split data according to cluster labels
    for (size_t i = 0; i < cluster_0_indices.size(); ++i) {
        input_matrix.col(i) = full_input_matrix.col(cluster_0_indices[i]);
    }
    for (size_t i = 0; i < cluster_1_indices.size(); ++i) {
        input_matrix_reactive.col(i) = full_input_matrix.col(cluster_1_indices[i]);
    }
    // Process each cluster
    std::vector<double> result(num_samples * model.weight_matrices.back().rows());
    Eigen_model_mutex->lock();
    if (!cluster_0_indices.empty()) {
        int num_batches_0 = std::ceil(static_cast<double>(cluster_0_indices.size()) / batch_size);
        for (int batch = 0; batch < num_batches_0; ++batch) {
            int start_idx = batch * batch_size;
            int end_idx = std::min((batch + 1) * batch_size, static_cast<int>(cluster_0_indices.size()));
            int current_batch_size = end_idx - start_idx;
            Eigen::MatrixXd batch_data = input_matrix.block(0, start_idx, num_features, current_batch_size);
            Eigen::MatrixXd batch_result = eigen_inference_batched(batch_data, model);
            // Store results in their original positions
            for (size_t i = 0; i < current_batch_size; ++i) {
                int original_idx = cluster_0_indices[start_idx + i];
                for (size_t j = 0; j < batch_result.rows(); ++j) {
                    result[original_idx * batch_result.rows() + j] = batch_result(j, i);
                }
            }
        }
    }
    // Process cluster 1
    if (!cluster_1_indices.empty()) {
        int num_batches_1 = std::ceil(static_cast<double>(cluster_1_indices.size()) / batch_size);
        for (int batch = 0; batch < num_batches_1; ++batch) {
            int start_idx = batch * batch_size;
            int end_idx = std::min((batch + 1) * batch_size, static_cast<int>(cluster_1_indices.size()));
            int current_batch_size = end_idx - start_idx;
            Eigen::MatrixXd batch_data = input_matrix_reactive.block(0, start_idx, num_features, current_batch_size);
            Eigen::MatrixXd batch_result = eigen_inference_batched(batch_data, model_reactive);
            // Store results in their original positions
            for (size_t i = 0; i < current_batch_size; ++i) {
                int original_idx = cluster_1_indices[start_idx + i];
                for (size_t j = 0; j < batch_result.rows(); ++j) {
                    result[original_idx * batch_result.rows() + j] = batch_result(j, i);
                }
            }
        }
    }
    Eigen_model_mutex->unlock();
    return result;
 }
 /**
 * @brief Uses the Eigen representation of the tKeras model weights for fast inference
 * @param model The model weights and biases
 * @param x 2D-Matrix with the content of a Field object
 * @param batch_size size for mini-batches that are used in the Keras model.predict() method
 * @param Eigen_model_mutex Mutex that locks the model during inference and prevents updaties from
 * the training thread
 * @return Predictions that the neural network made from the input values x. The predictions are
 * represented as a vector similar to the representation from the Field.AsVector() method
 */
 std::vector<double> Eigen_predict(const EigenModel& model, std::vector<std::vector<double>> x, int batch_size,
                                  std::mutex* Eigen_model_mutex) {
  // Convert input data to Eigen matrix
  const int num_samples = x[0].size();
  const int num_features = x.size();
  Eigen::MatrixXd full_input_matrix(num_features, num_samples);
  for (size_t i = 0; i < num_samples; ++i) {
    for (size_t j = 0; j < num_features; ++j) {
      full_input_matrix(j, i) = x[j][i];
    }
  }
  std::vector<double> result;
  result.reserve(num_samples * num_features);
  if (num_features != model.weight_matrices[0].cols()) {
    throw std::runtime_error("Input data size " + std::to_string(num_features) + \
      " does not match model input layer of size " + std::to_string(model.weight_matrices[0].cols()));
  }
  int num_batches = std::ceil(static_cast<double>(num_samples) / batch_size);
  Eigen_model_mutex->lock();
  for (int batch = 0; batch < num_batches; ++batch) {
    int start_idx = batch * batch_size;
    int end_idx = std::min((batch + 1) * batch_size, num_samples);
    int current_batch_size = end_idx - start_idx;
    // Extract the current input data batch
    Eigen::MatrixXd batch_data(num_features, current_batch_size);
    batch_data = full_input_matrix.block(0, start_idx, num_features, current_batch_size);
    // Predict
    batch_data = eigen_inference_batched(batch_data, model);
    result.insert(result.end(), batch_data.data(), batch_data.data() + batch_data.size());
  }
  Eigen_model_mutex->unlock();
  return result;
 }
 /**
 * @brief Appends data from one matrix (column major std::vector<std::vector<double>>) to another
 * @param training_data_buffer Matrix that the values are appended to
 * @param new_values Matrix that is appended
 */
 void training_data_buffer_append(std::vector<std::vector<double>>& training_data_buffer, 
                                 std::vector<std::vector<double>>& new_values) {
  // Initialize training data buffer if empty
  if (training_data_buffer.size() == 0) {
    training_data_buffer = new_values;
  } else { // otherwise append
    for (size_t col = 0; col < training_data_buffer.size(); col++) {
      training_data_buffer[col].insert(training_data_buffer[col].end(),
                                        new_values[col].begin(), new_values[col].end());
    }
  }
 }
 /**
 * @brief Appends data from one int vector to another based on a mask vector
 * @param labels Vector that the values are appended to
 * @param new_labels Values that are appended
 * @param validity Mask vector that defines how many and which values are appended
 */
 void cluster_labels_append(std::vector<int>& labels_buffer, std::vector<int>& new_labels,
                           std::vector<int> validity) {
  // Calculate new buffer size from number of valid elements in mask
  int n_invalid = validity.size();
  for (size_t i = 0; i < validity.size(); i++) {
    n_invalid -= validity[i];
  }
  // Resize label vector to hold non valid elements
  int end_index = labels_buffer.size(); 
  int new_size = end_index + n_invalid;
  labels_buffer.resize(new_size);
  // Iterate over mask to transfer cluster labels
  for (size_t i = 0; i < validity.size(); ++i) {
    // Append only the labels of invalid rows 
    if (!validity[i]) {
      labels_buffer[end_index] = new_labels[i]; 
      end_index++;
    }
  }
 }
 /**
 * @brief Uses the Python environment with Keras' default functions to train the model
 * @param x Training data features
 * @param y Training data targets
 * @param params Global runtime paramters
 */
 void Python_Keras_train(std::vector<std::vector<double>>& x, std::vector<std::vector<double>>& y,
                        int train_cluster, std::string model_name, const RuntimeParameters& params) {
  // Prepare data for python
  PyObject* py_df_x = vector_to_numpy_array(x);
  PyObject* py_df_y = vector_to_numpy_array(y);
  // Make sure that model output file name .keras file
  std::string model_path = params.save_model_path; 
  if (!model_path.empty()) {
    if (model_path.length() >= 6 && model_path.substr(model_path.length() - 6) != ".keras") {
        model_path += ".keras";
    }
  }
  // Choose the correct model to train if clustering is used
  if (train_cluster == 1) {
    if (!model_path.empty()) {
      model_path.insert(model_path.length() - 6, "_reaction");
      std::cout << "MODEL SAVED AS:" << model_path << std::endl;
    }
  }
  // Get the model and training function from the global python interpreter
  PyObject* py_main_module = PyImport_AddModule("__main__");
  PyObject* py_global_dict = PyModule_GetDict(py_main_module);
  PyObject* py_keras_model = PyDict_GetItemString(py_global_dict, model_name.c_str());
  PyObject* py_training_function = PyDict_GetItemString(py_global_dict, "training_step");
  // Build the function arguments as four python objects and an integer
  PyObject* args = Py_BuildValue("(OOOiis)", 
      py_keras_model, py_df_x, py_df_y, params.batch_size, params.training_epochs,
      model_path.c_str());
  // Call the Python training function
  PyObject *py_rv = PyObject_CallObject(py_training_function, args);
  // Clean up
  PyErr_Print(); 
  Py_DECREF(py_df_x);
  Py_DECREF(py_df_y);
  Py_DECREF(args);
 }
 /**
 * @brief Function for threadsafe parallel training and weight updating.
 * The function waits conditionally until the training data buffer is full.
 * It then clears the buffer and starts training, after training it writes the new weights to 
 * the Eigen model.  
 * @param Eigen_model Pointer to the EigenModel struct that will be updates with new weights
 * @param Eigen_model_mutex Mutex to ensure threadsafe access to the EigenModel struct
 * @param training_data_buffer Pointer to the Training data struct with which the model is trained
 * @param training_data_buffer_mutex Mutex to ensure threadsafe access to the training data struct
 * @param training_data_buffer_full Conditional waiting variable with wich the main thread signals
 * when a training run can start
 * @param start_training Conditional waiting predicate to mitigate against spurious wakeups
 * @param end_training Signals end of program to wind down thread gracefully 
 * @param params Global runtime paramters
 * @return 0 if function was succesful
 */
 void parallel_training(EigenModel* Eigen_model, EigenModel* Eigen_model_reactive,
                       std::mutex* Eigen_model_mutex,
                       TrainingData* training_data_buffer,
                       std::mutex* training_data_buffer_mutex,
                       std::condition_variable* training_data_buffer_full,
                       bool* start_training, bool* end_training,
                       const RuntimeParameters& params) {
  while (true) {
    // Conditional waiting:
    // - Sleeps until a signal arrives on training_data_buffer_full
    // - Releases the lock on training_data_buffer_mutex while sleeping
    // - Lambda function with start_training checks if it was a spurious wakeup
    // - Reaquires the lock on training_data_buffer_mutex after waking up
    // - If start_training has been set to true while the thread was active, it does NOT
    //   wait for a signal on training_data_buffer_full but starts the next round immediately.
    std::unique_lock<std::mutex> lock(*training_data_buffer_mutex);
    training_data_buffer_full->wait(lock, [start_training] { return *start_training;});
    // Return if program is about to end
    if (*end_training) {
      return;
    }
    // Get the necessary training data
    std::cout << "AI: Training thread: Getting training data" << std::endl;
    // Initialize training data input and targets
    std::vector<std::vector<double>> inputs(training_data_buffer->x.size(),
                                            std::vector<double>(params.training_data_size));
    std::vector<std::vector<double>> targets(training_data_buffer->y.size(),
                                             std::vector<double>(params.training_data_size));
    int buffer_size = training_data_buffer->x[0].size();
    // If clustering is used, check the current cluster
    int n_cluster_reactive = 0;
    int train_cluster = -1; // Default value for non clustered training (all data is used)
    if (params.use_k_means_clustering) {
      for (size_t i = 0; i < buffer_size; i++) {
          n_cluster_reactive += training_data_buffer->cluster_labels[i];
      }
      train_cluster = n_cluster_reactive >= params.training_data_size;
    }
    int buffer_row = 0;
    int copied_row = 0;
    while (copied_row < params.training_data_size) {
      if ((train_cluster == -1) ||
          (train_cluster == training_data_buffer->cluster_labels[buffer_row])) {
        for (size_t col = 0; col < training_data_buffer->x.size(); col++) {
          // Copy and remove from training data buffer
          inputs[col][copied_row] = training_data_buffer->x[col][buffer_row];
          targets[col][copied_row] = training_data_buffer->y[col][buffer_row];
          training_data_buffer->x[col].erase(training_data_buffer->x[col].begin() + 
                                            buffer_row);
          training_data_buffer->y[col].erase(training_data_buffer->y[col].begin() +
                                            buffer_row);
        }
        // Remove from cluster label buffer
        if (params.use_k_means_clustering) {
          training_data_buffer->cluster_labels.erase(
            training_data_buffer->cluster_labels.begin() + buffer_row);
        }
        copied_row++;
      } else {
        buffer_row++;
      }
    }
    // Set the waiting predicate to immediately continue training if enough elements of any cluster remain
    if (train_cluster == 1) {
      *start_training = ((n_cluster_reactive - params.training_data_size) >= params.training_data_size) ||
                        ((buffer_size - n_cluster_reactive) >= params.training_data_size);
    } else {
      *start_training = (buffer_size - n_cluster_reactive - params.training_data_size)
                        >= params.training_data_size;
    }
    //update number of training runs
    training_data_buffer->n_training_runs += 1;
    // Unlock the training_data_buffer_mutex 
    lock.unlock();
    std::string model_name = "model";
    if (train_cluster == 1) {
      model_name = "model_reactive";
    }
    std::cout << "AI: Training thread: Start training " << model_name << std::endl;
    // Acquire the Python GIL
    PyGILState_STATE gstate = PyGILState_Ensure();
    // Start training
    Python_Keras_train(inputs, targets, train_cluster, model_name, params);
    if (!params.use_Keras_predictions) {
      std::cout << "AI: Training thread: Update shared model weights" << std::endl;
      std::vector<std::vector<std::vector<double>>> cpp_weights = Python_Keras_get_weights(model_name);
      Eigen_model_mutex->lock();
      if (train_cluster == 1) {
        update_weights(Eigen_model_reactive, cpp_weights);
      } else {
        update_weights(Eigen_model, cpp_weights);
      }
      Eigen_model_mutex->unlock();
    }
    // Release the Python GIL
    PyGILState_Release(gstate);
    std::cout << "AI: Training thread: Finished training, waiting for new data" << std::endl;
  }
 }
 std::thread python_train_thread;
 /**
 * @brief Starts a thread for parallel training and weight updating. This Wrapper function
 * ensures, that the main POET program can be built without pthread support if the AI
 * surrogate functions are disabled during compilation. 
 * @param Eigen_model Pointer to the EigenModel struct that will be updates with new weights
 * @param Eigen_model_mutex Mutex to ensure threadsafe access to the EigenModel struct
 * @param training_data_buffer Pointer to the Training data struct with which the model is trained
 * @param training_data_buffer_mutex Mutex to ensure threadsafe access to the training data struct
 * @param training_data_buffer_full Conditional waiting variable with wich the main thread signals
 * when a training run can start
 * @param start_training Conditional waiting predicate to mitigate against spurious wakeups
 * @param end_training Signals end of program to wind down thread gracefully 
 * @param params Global runtime paramters
 * @return 0 if function was succesful
 */
 int Python_Keras_training_thread(EigenModel* Eigen_model, EigenModel* Eigen_model_reactive,
                                 std::mutex* Eigen_model_mutex,
                                 TrainingData* training_data_buffer,
                                 std::mutex* training_data_buffer_mutex,
                                 std::condition_variable* training_data_buffer_full,
                                 bool* start_training, bool* end_training,
                                 const RuntimeParameters& params) {
  PyThreadState *_save = PyEval_SaveThread();
  python_train_thread = std::thread(parallel_training, Eigen_model, Eigen_model_reactive,
                                    Eigen_model_mutex, training_data_buffer,
                                    training_data_buffer_mutex, training_data_buffer_full,
                                    start_training, end_training, params);
  return 0;
 }
 /**
 * @brief Updates the EigenModels weigths and biases from the weight vector
 * @param model Pounter to an EigenModel struct
 * @param weights Cector of model weights from keras as returned by Python_Keras_get_weights()
 */
 void update_weights(EigenModel* model,
                    const std::vector<std::vector<std::vector<double>>>& weights) {
  size_t num_layers = weights.size() / 2;
  for (size_t i = 0; i < weights.size(); i += 2) {
      // Fill current weight matrix
      size_t rows = weights[i][0].size();
      size_t cols = weights[i].size();
      for (size_t j = 0; j < cols; ++j) {
          for (size_t k = 0; k < rows; ++k) {
              model->weight_matrices[i / 2](k, j) = weights[i][j][k];
          }
      }
      // Fill bias vector
      size_t bias_size = weights[i + 1][0].size();
      for (size_t j = 0; j < bias_size; ++j) {
          model->biases[i / 2](j) = weights[i + 1][0][j];
      }
  }
 }
 /**
 * @brief Converts the weights and biases from the Python Keras model to C++ vectors
 * @return A vector containing the model weights and biases 
 */
 std::vector<std::vector<std::vector<double>>> Python_Keras_get_weights(std::string model_name) {
  // Acquire the Python GIL
  PyGILState_STATE gstate = PyGILState_Ensure();
  PyObject* py_main_module = PyImport_AddModule("__main__");
  PyObject* py_global_dict = PyModule_GetDict(py_main_module);
  PyObject* py_keras_model = PyDict_GetItemString(py_global_dict, model_name.c_str());
  PyObject* py_get_weights_function = PyDict_GetItemString(py_global_dict, "get_weights");
  PyObject* args = Py_BuildValue("(O)", py_keras_model);
  // Call Python function
  PyObject* py_weights_list = PyObject_CallObject(py_get_weights_function, args);
  if (!py_weights_list) {
      PyErr_Print();
      throw std::runtime_error("Failed to get weights from Keras model");
  }
  // Container for the extracted weights
  std::vector<std::vector<std::vector<double>>> cpp_weights;
  // Iterate through the layers (weights and biases)
  Py_ssize_t num_layers = PyList_Size(py_weights_list);
  for (Py_ssize_t i = 0; i < num_layers; ++i) {
    PyObject* py_weight_array = PyList_GetItem(py_weights_list, i);
    if (!PyArray_Check(py_weight_array)) {
      throw std::runtime_error("Weight is not a NumPy array.");
    }
    PyArrayObject* weight_np = reinterpret_cast<PyArrayObject*>(py_weight_array);
    int dtype = PyArray_TYPE(weight_np);
    // If array is 2D it's a weight matrix
    if (PyArray_NDIM(weight_np) == 2) {
      int num_rows = PyArray_DIM(weight_np, 0);
      int num_cols = PyArray_DIM(weight_np, 1);
      std::vector<std::vector<double>> weight_matrix(num_rows, std::vector<double>(num_cols));
      // Handle different precision settings
      if (dtype == NPY_FLOAT32) {
        float* weight_data_float = static_cast<float*>(PyArray_DATA(weight_np));
        for (size_t r = 0; r < num_rows; ++r) {
          for (size_t c = 0; c < num_cols; ++c) {
            weight_matrix[r][c] = static_cast<double>(weight_data_float[r * num_cols + c]);
          }
        }
      } else if (dtype == NPY_DOUBLE) {
        double* weight_data_double = static_cast<double*>(PyArray_DATA(weight_np));
        for (size_t r = 0; r < num_rows; ++r) {
          for (size_t c = 0; c < num_cols; ++c) {
            weight_matrix[r][c] = weight_data_double[r * num_cols + c];
          }
        }
      } else {
          throw std::runtime_error("Unsupported data type for weights. Must be NPY_FLOAT32 or NPY_DOUBLE.");
      }
      cpp_weights.push_back(weight_matrix);
    // If array is 1D it's a bias vector 
    } else if (PyArray_NDIM(weight_np) == 1) {
      int num_elements = PyArray_DIM(weight_np, 0);
      std::vector<std::vector<double>> bias_vector(1, std::vector<double>(num_elements));
      // Handle different precision settings
      if (dtype == NPY_FLOAT32) {
        float* bias_data_float = static_cast<float*>(PyArray_DATA(weight_np));
        for (size_t j = 0; j < num_elements; ++j) {
          bias_vector[0][j] = static_cast<double>(bias_data_float[j]);
        }
      } else if (dtype == NPY_DOUBLE) {
        double* bias_data_double = static_cast<double*>(PyArray_DATA(weight_np));
        for (size_t j = 0; j < num_elements; ++j) {
          bias_vector[0][j] = bias_data_double[j];
        }
      } else {
        throw std::runtime_error("Unsupported data type for biases. Must be NPY_FLOAT32 or NPY_DOUBLE.");
      }
      cpp_weights.push_back(bias_vector);
    }
  }
  // Clean up
  Py_DECREF(py_weights_list);
  Py_DECREF(args);
  // Release Python GIL
  PyGILState_Release(gstate);
  return cpp_weights;
 }
 /**
 * @brief Joins the training thread and winds down the Python environment gracefully
 * @param Eigen_model_mutex Mutex to ensure threadsafe access to the EigenModel struct
 * @param training_data_buffer_mutex Mutex to ensure threadsafe access to the training data struct
 * @param training_data_buffer_full Conditional waiting variable with wich the main thread signals
 * when a training run can start
 * @param start_training Conditional waiting predicate to mitigate against spurious wakeups
 * @param end_training Signals end of program to wind down thread gracefully */
 void Python_finalize(std::mutex* Eigen_model_mutex, std::mutex* training_data_buffer_mutex,
                     std::condition_variable* training_data_buffer_full,
                     bool* start_training, bool* end_training) {
  training_data_buffer_mutex->lock();
  // Define training as over
  *end_training = true;
  // Wake up and join training thread
  *start_training = true;
  training_data_buffer_mutex->unlock();
  training_data_buffer_full->notify_one();
  if (python_train_thread.joinable()) {
    python_train_thread.join();
  }
  // Acquire the Python GIL
  PyGILState_STATE gstate = PyGILState_Ensure();
  //Finalize Python
  Py_FinalizeEx();
 }
 } //namespace poet
--- a/src/Chemistry/SurrogateModels/AI_functions.hpp
+++ b/src/Chemistry/SurrogateModels/AI_functions.hpp
@ -0,0 +1,120 @@
 /**
 * @file AI_functions.hpp
 * @author Hans Straile (straile@uni-potsdam.de)
 * @brief API for the AI/Machine Learning based chemistry surrogate model with functions to initialize a neural network and use it for training and inference via Keras for Python . 
 * @version 0.1
 * @date 01 Nov 2024
 *
 * This file implements functions to train and predict with a neural network.
 * All functions are based on a user supplied Keras model. Pyhton.h is used for model
 * training with Keras and can be used for inference. The default inference funtion 
 * is implemented with Eigen matrices in C++. All functions use 2 different models
 * to process data separately according to a K-means cluster assignement. This file
 * alo contains the functions for the K-means algorithm.
 *  
 */
 #ifndef AI_FUNCTIONS_H
 #define AI_FUNCTIONS_H
 #include <string>
 #include <vector>
 #include "poet.hpp"
 // PhreeqC definition of pi clashes with Eigen macros
 // so we have to temporarily undef it 
 #pragma push_macro("pi")
 #undef pi
 #include <Eigen/Dense> 
 #pragma pop_macro("pi")
 namespace poet {
 struct EigenModel {
  // The first model will be used for all values if clustering is disabled
  // or for the reactive part of the field if clustering is enabled
  std::vector<Eigen::MatrixXd> weight_matrices;
  std::vector<Eigen::VectorXd> biases;
  // The other model will be used for the non-reactive cluster
  // (if clustering is enabled)
  std::vector<Eigen::MatrixXd> weight_matrices_no_reaction;
  std::vector<Eigen::VectorXd> biases_no_reaction;
 };
 struct TrainingData {
  std::vector<std::vector<double>> x;
  std::vector<std::vector<double>> y;
  std::vector<int> cluster_labels;
  int n_training_runs = 0;
 };
 // Ony declare the actual functions if flag is set 
 #ifdef USE_AI_SURROGATE
 int Python_Keras_setup(std::string functions_file_path, std::string cuda_src_dir);
 void Python_finalize(std::mutex* Eigen_model_mutex, std::mutex* training_data_buffer_mutex,
                     std::condition_variable* training_data_buffer_full, bool* start_training, 
                     bool* end_training);
 int Python_Keras_load_model(std::string model, std::string model_reactive,
                            bool use_clustering);
 std::vector<int> K_Means(std::vector<std::vector<double>>& field, int k, int maxIterations = 100);
 std::vector<double> Python_Keras_predict(std::vector<std::vector<double>>& x, int batch_size,
                                         std::vector<int>& cluster_labels);  
 void training_data_buffer_append(std::vector<std::vector<double>>& training_data_buffer,
                                 std::vector<std::vector<double>>& new_values);
 void cluster_labels_append(std::vector<int>& labels, std::vector<int>& new_labels,
                           std::vector<int> validity);
 int Python_Keras_training_thread(EigenModel* Eigen_model, EigenModel* Eigen_model_reactive,
                                 std::mutex* Eigen_model_mutex,
                                 TrainingData* training_data_buffer,
                                 std::mutex* training_data_buffer_mutex,
                                 std::condition_variable* training_data_buffer_full,
                                 bool* start_training, bool* end_training,
                                 const RuntimeParameters& params);
 void update_weights(EigenModel* model, const std::vector<std::vector<std::vector<double>>>& weights);
 std::vector<std::vector<std::vector<double>>> Python_Keras_get_weights(std::string model_name);
 std::vector<double> Eigen_predict_clustered(const EigenModel& model, const EigenModel& model_reactive,
                                            std::vector<std::vector<double>>& x, 
                                            int batch_size, std::mutex* Eigen_model_mutex,
                                            std::vector<int>& cluster_labels);
 std::vector<double> Eigen_predict(const EigenModel& model, std::vector<std::vector<double>> x, int batch_size,
                                  std::mutex* Eigen_model_mutex);
 // Otherwise, define the necessary stubs
 #else
 inline void Python_Keras_setup(std::string, std::string){}
 inline void Python_finalize(std::mutex*, std::mutex*, std::condition_variable*, bool*, bool*){}
 inline void Python_Keras_load_model(std::string, std::string, bool){}
 inline std::vector<int> K_Means(std::vector<std::vector<double>>&, int, int) {return {};}
 inline std::vector<double> Python_Keras_predict(std::vector<std::vector<double>>&, int,
                                                std::vector<int>&){return {};}
 inline void training_data_buffer_append(std::vector<std::vector<double>>&,
                                        std::vector<std::vector<double>>&){}
 inline void cluster_labels_append(std::vector<int>&, std::vector<int>&, std::vector<int>){}
 inline int Python_Keras_training_thread(EigenModel*, EigenModel*, std::mutex*, 
                                        TrainingData*, std::mutex*, std::condition_variable*,
                                        bool*, bool*, const RuntimeParameters&){return {};}
 inline void update_weights(EigenModel*, const std::vector<std::vector<std::vector<double>>>&){}
 inline std::vector<std::vector<std::vector<double>>> Python_Keras_get_weights(std::string){return {};}
 inline std::vector<double> Eigen_predict_clustered(const EigenModel&, const EigenModel&,
                                                   std::vector<std::vector<double>>&, int,
                                                   std::mutex*, std::vector<int>&){return {};}
 inline std::vector<double> Eigen_predict(const EigenModel&, std::vector<std::vector<double>>, int,
                                         std::mutex*){return {};}
 #endif
 } // namespace poet
 #endif // AI_FUNCTIONS_HPP
--- a/src/Chemistry/WorkerFunctions.cpp
+++ b/src/Chemistry/WorkerFunctions.cpp
@ -161,7 +161,15 @@ void poet::ChemistryModule::WorkerDoWork(MPI_Status &probe_status,
                            mpi_buffer.begin() + this->prop_count * (wp_i + 1));
  }
-  // std::cout << this->comm_rank << ":" << counter++ << std::endl;
+  if (this->ai_surrogate_enabled) {
    // Map valid predictions from the ai surrogate in the workpackage
    for (int i = 0; i < s_curr_wp.size; i++) {
      if (this->ai_surrogate_validity_vector[wp_start_index + i] == 1) {
        s_curr_wp.mapping[i] = CHEM_AISURR;
      }
    }
  }
  if (dht_enabled || interp_enabled) {
    dht->prepareKeys(s_curr_wp.input, dt);
  }
@ -178,16 +186,6 @@ void poet::ChemistryModule::WorkerDoWork(MPI_Status &probe_status,
    interp->tryInterpolation(s_curr_wp);
  }
  if (this->ai_surrogate_enabled) {
    // Map valid predictions from the ai surrogate in the workpackage
    for (int i = 0; i < s_curr_wp.size; i++) {
      if (this->ai_surrogate_validity_vector[wp_start_index + i] == 1) {
        s_curr_wp.mapping[i] = CHEM_AISURR;
      }
    }
  }
  phreeqc_time_start = MPI_Wtime();
  WorkerRunWorkPackage(s_curr_wp, current_sim_time, dt);
--- a/src/Init/ChemistryInit.cpp
+++ b/src/Init/ChemistryInit.cpp
@ -51,8 +51,8 @@ void InitialList::initChemistry(const Rcpp::List &chem) {
    // Get base path
    ai_surrogate_input_script_path = ai_surrogate_input_script_path.substr(0, ai_surrogate_input_script_path.find_last_of('/') + 1);
    // Add the filepath as a global variable in R to enable relative filepaths in the R script
-    fileContent += "\nai_surrogate_base_path <- \"" + ai_surrogate_input_script_path + "\"";
+    fileContent.insert(0, "ai_surrogate_base_path <- \"" + ai_surrogate_input_script_path + "\"\n");
-    
+
    this->ai_surrogate_input_script = fileContent;
  }
--- a/src/poet.cpp
+++ b/src/poet.cpp
@ -28,7 +28,6 @@
 #include "DataStructures/Field.hpp"
 #include "Init/InitialList.hpp"
 #include "Transport/DiffusionModule.hpp"
 #include <RInside.h>
 #include <Rcpp.h>
 #include <Rcpp/DataFrame.h>
@ -39,9 +38,10 @@
 #include <memory>
 #include <mpi.h>
 #include <string>
-
+#include <mutex>
 #include <condition_variable>
 #include "Chemistry/SurrogateModels/AI_functions.hpp"
 #include <CLI/CLI.hpp>
 #include <poet.hpp>
 #include <vector>
@ -72,6 +72,7 @@ static void init_global_functions(RInside &R) {
  SaveRObj_R = DEFunc("SaveRObj");
 }
 // HACK: this is a step back as the order and also the count of fields is
 // predefined, but it will change in the future
 // static inline void writeFieldsToR(RInside &R, const Field &trans,
@ -182,6 +183,12 @@ int parseInitValues(int argc, char **argv, RuntimeParameters &params) {
    MSG("Work Package Size: " + std::to_string(params.work_package_size));
    MSG("DHT is " + BOOL_PRINT(params.use_dht));
    MSG("AI Surrogate is " + BOOL_PRINT(params.use_ai_surrogate));
    #ifndef USE_AI_SURROGATE
      if (params.use_ai_surrogate) {
        throw std::runtime_error("AI Surrogate functions can only be used if they are included during compile time.\n \
          Please use the CMake flag -DUSE_AI_SURROGATE=ON.");
      }
    #endif
    if (params.use_dht) {
      // MSG("DHT strategy is " + std::to_string(simparams.dht_strategy));
@ -274,17 +281,87 @@ static Rcpp::List RunMasterLoop(RInsidePOET &R, const RuntimeParameters &params,
  /* Iteration Count is dynamic, retrieving value from R (is only needed by
   * master for the following loop) */
  uint32_t maxiter = params.timesteps.size();
-
+  
  if (params.print_progress) {
    chem.setProgressBarPrintout(true);
-  }
+  } 
  R["TMP_PROPS"] = Rcpp::wrap(chem.getField().GetProps());
-  /* SIMULATION LOOP */
+  /* For the weights and biases of the AI surrogate
   * model to use in an inference function with Eigen */
  std::mutex Eigen_model_mutex; 
  static EigenModel Eigen_model;
  /* With K Means clustering, a second model will be used 
   * only for the reactive part of the field (and the first
   * model only for the non-reactive part) */
  static EigenModel Eigen_model_reactive;
  /* For the training data */
  std::mutex training_data_buffer_mutex;
  std::condition_variable training_data_buffer_full;
  bool start_training, end_training;
  TrainingData training_data_buffer;
  std::vector<int> cluster_labels;
  if (params.use_ai_surrogate) {  
    MSG("AI: Initialize model");
    // Initiate two models from one file
    Python_Keras_load_model(R["model_file_path"], R["model_reactive_file_path"],
                            params.use_k_means_clustering);
    if (!params.disable_training) {
      MSG("AI: Initialize training thread");
      Python_Keras_training_thread(&Eigen_model, &Eigen_model_reactive, 
                                   &Eigen_model_mutex, &training_data_buffer, 
                                   &training_data_buffer_mutex,
                                   &training_data_buffer_full, &start_training,
                                   &end_training, params);
    }
    if (!params.use_Keras_predictions) {
      // Initialize Eigen model for custom inference function
      MSG("AI: Use custom C++ prediction function");
      // Get Keras weights from Python
      std::vector<std::vector<std::vector<double>>> cpp_weights = Python_Keras_get_weights("model");
      // Set model size
      size_t num_layers = cpp_weights.size() / 2;
      Eigen_model.weight_matrices.resize(num_layers); 
      Eigen_model.biases.resize(num_layers);
      for (size_t i = 0; i < cpp_weights.size(); i += 2) {
          size_t rows = cpp_weights[i][0].size();
          size_t cols = cpp_weights[i].size();
          Eigen_model.weight_matrices[i / 2].resize(rows, cols);
          size_t bias_size = cpp_weights[i + 1][0].size();
          Eigen_model.biases[i / 2].resize(bias_size);
      }
      // Set initial model weights
      update_weights(&Eigen_model, cpp_weights);
      if (params.use_k_means_clustering) {
        // Initialize Eigen model for reactive part of the field
        cpp_weights = Python_Keras_get_weights("model_reactive");
        num_layers = cpp_weights.size() / 2;
        Eigen_model_reactive.weight_matrices.resize(num_layers); 
        Eigen_model_reactive.biases.resize(num_layers);
        for (size_t i = 0; i < cpp_weights.size(); i += 2) {
            size_t rows = cpp_weights[i][0].size();
            size_t cols = cpp_weights[i].size();
            Eigen_model_reactive.weight_matrices[i / 2].resize(rows, cols);
            size_t bias_size = cpp_weights[i + 1][0].size();
            Eigen_model_reactive.biases[i / 2].resize(bias_size);
        }
        update_weights(&Eigen_model_reactive, cpp_weights);
      }
    }
    MSG("AI: Surrogate model initialized");
  }
  R["TMP_PROPS"] = Rcpp::wrap(chem.getField().GetProps());
  R["field_nrow"] = chem.getField().GetRequestedVecSize();
    /* SIMULATION LOOP */
  double dSimTime{0};
  for (uint32_t iter = 1; iter < maxiter + 1; iter++) {
    double start_t = MPI_Wtime();
-
+    
    const double &dt = params.timesteps[iter - 1];
    //  cout << "CPP: Next time step is " << dt << "[s]" << endl;
@ -299,77 +376,72 @@ static Rcpp::List RunMasterLoop(RInsidePOET &R, const RuntimeParameters &params,
    chem.getField().update(diffusion.getField());
    MSG("Chemistry step");
    if (params.use_ai_surrogate) {
      double ai_start_t = MPI_Wtime();
-      // Save current values from the tug field as predictor for the ai step
+
      // Get current values from the tug field for the ai predictions
      R["TMP"] = Rcpp::wrap(chem.getField().AsVector());
-      R.parseEval(
+      R.parseEval(std::string("predictors <- ") + 
-          std::string("predictors <- setNames(data.frame(matrix(TMP, nrow=" +
+        "set_field(TMP, TMP_PROPS, field_nrow, ai_surrogate_species)");
                      std::to_string(chem.getField().GetRequestedVecSize()) +
                      ")), TMP_PROPS)"));
      R.parseEval("predictors <- predictors[ai_surrogate_species]");
      // Apply preprocessing
      MSG("AI Preprocessing");
      R.parseEval("predictors_scaled <- preprocess(predictors)");
      std::vector<std::vector<double>> predictors_scaled = R["predictors_scaled"];
-      // Predict
+      // Get K-Means cluster assignements based on the preprocessed data
-      MSG("AI Predict");
+      if (params.use_k_means_clustering) {
-      R.parseEval(
+        cluster_labels = K_Means(predictors_scaled, 2, 300);
-          "aipreds_scaled <- prediction_step(model, predictors_scaled)");
+        R["cluster_labels"] = cluster_labels;
      }
      MSG("AI: Predict");
      if (params.use_Keras_predictions) {  // Predict with Keras default function
        R["TMP"] = Python_Keras_predict(predictors_scaled, params.batch_size, cluster_labels);
      } else {  // Predict with custom Eigen function
        if (params.use_k_means_clustering) {
          R["TMP"] = Eigen_predict_clustered(Eigen_model, Eigen_model_reactive, 
                                             predictors_scaled, params.batch_size,
                                             &Eigen_model_mutex, cluster_labels);
        } else {
          R["TMP"] = Eigen_predict(Eigen_model, predictors_scaled, params.batch_size, 
                                   &Eigen_model_mutex);
        }
      }
      // Apply postprocessing
-      MSG("AI Postprocesing");
+      MSG("AI: Postprocesing");
-      R.parseEval("aipreds <- postprocess(aipreds_scaled)");
+      R.parseEval(std::string("predictions_scaled <- ") + 
        "set_field(TMP, ai_surrogate_species, field_nrow, ai_surrogate_species, byrow = TRUE)");
      R.parseEval("predictions <- postprocess(predictions_scaled)");
      // Validate prediction and write valid predictions to chem field
-      MSG("AI Validate");
+      MSG("AI: Validate");
-      R.parseEval(
+      R.parseEval("validity_vector <- validate_predictions(predictors, predictions)");
          "validity_vector <- validate_predictions(predictors, aipreds)");
-      MSG("AI Marking accepted");
+      MSG("AI: Marking valid");
      chem.set_ai_surrogate_validity_vector(R.parseEval("validity_vector"));
      MSG("AI TempField");
      std::vector<std::vector<double>> RTempField =
-          R.parseEval("set_valid_predictions(predictors,\
+        R.parseEval("set_valid_predictions(predictors, predictions, validity_vector)");
                                                                                       aipreds,\
                                                                                       validity_vector)");
      MSG("AI Set Field");
      Field predictions_field =
-          Field(R.parseEval("nrow(predictors)"), RTempField,
+          Field(R.parseEval("nrow(predictions)"), RTempField,
-                R.parseEval("colnames(predictors)"));
+                R.parseEval("colnames(predictions)"));
-      MSG("AI Update");
+      MSG("AI: Update field with AI predictions");
      chem.getField().update(predictions_field);
      // store time for output file
      double ai_end_t = MPI_Wtime();
      R["ai_prediction_time"] = ai_end_t - ai_start_t;
    }
    // Run simulation step
    MSG("Simulate chemistry");
    chem.simulate(dt);
    /* AI surrogate iterative training*/
    if (params.use_ai_surrogate) {
      double ai_start_t = MPI_Wtime();
      R["TMP"] = Rcpp::wrap(chem.getField().AsVector());
      R.parseEval(
          std::string("targets <- setNames(data.frame(matrix(TMP, nrow=" +
                      std::to_string(chem.getField().GetRequestedVecSize()) +
                      ")), TMP_PROPS)"));
      R.parseEval("targets <- targets[ai_surrogate_species]");
      // TODO: Check how to get the correct columns
      R.parseEval("target_scaled <- preprocess(targets)");
      MSG("AI: incremental training");
      R.parseEval("model <- training_step(model, predictors_scaled, "
                  "target_scaled, validity_vector)");
      double ai_end_t = MPI_Wtime();
      R["ai_training_time"] = ai_end_t - ai_start_t;
    }
    // MPI_Barrier(MPI_COMM_WORLD);
    double end_t = MPI_Wtime();
    dSimTime += end_t - start_t;
@ -381,6 +453,44 @@ static Rcpp::List RunMasterLoop(RInsidePOET &R, const RuntimeParameters &params,
    // store_result is TRUE)
    call_master_iter_end(R, diffusion.getField(), chem.getField());
    /* AI surrogate iterative training*/
    if (params.use_ai_surrogate && !params.disable_training) {
      MSG("AI: Add to training data buffer");
      if (!params.train_only_invalid) {
        // Use all values if not specified otherwise 
        R.parseEval("validity_vector <- rep(0, length(validity_vector))");
      }
      std::vector<std::vector<double>> invalid_x = 
        R.parseEval("get_invalid_values(predictors_scaled, validity_vector)");
      R.parseEval("target_scaled <- preprocess(state_C[ai_surrogate_species])");
      std::vector<std::vector<double>> invalid_y = 
        R.parseEval("get_invalid_values(target_scaled, validity_vector)");
      training_data_buffer_mutex.lock();
      training_data_buffer_append(training_data_buffer.x, invalid_x);
      training_data_buffer_append(training_data_buffer.y, invalid_y);
      // If clustering is used, Add cluster labels to buffer and
      // count buffer size according to the cluster assignements
      int n_cluster_reactive = 0;
      size_t  buffer_size = training_data_buffer.x[0].size();
      if (params.use_k_means_clustering) {
        cluster_labels_append(training_data_buffer.cluster_labels, cluster_labels,
                              R["validity_vector"]);
        for (size_t i = 0; i < buffer_size; i++) {
          n_cluster_reactive += training_data_buffer.cluster_labels[i];
        }
      }
      // Signal to the training thread if enough training data is present
      if ((n_cluster_reactive >= params.training_data_size) ||
         (buffer_size - n_cluster_reactive >= params.training_data_size)) {
        start_training = true;
        training_data_buffer_full.notify_one();
      }
      training_data_buffer_mutex.unlock();
    }
    diffusion.getField().update(chem.getField());
    MSG("End of *coupling* iteration " + std::to_string(iter) + "/" +
@ -426,6 +536,12 @@ static Rcpp::List RunMasterLoop(RInsidePOET &R, const RuntimeParameters &params,
  profiling["chemistry"] = chem_profiling;
  profiling["diffusion"] = diffusion_profiling;
  if (params.use_ai_surrogate) {
    MSG("Finalize Python and wind down training thread");
    Python_finalize(&Eigen_model_mutex, &training_data_buffer_mutex,
                    &training_data_buffer_full, &start_training, &end_training);
  }
  chem.MasterLoopBreak();
  return profiling;
@ -477,7 +593,11 @@ std::vector<std::string> getSpeciesNames(const Field &&field, int root,
 int main(int argc, char *argv[]) {
  int world_size;
-  MPI_Init(&argc, &argv);
+  // Threadsafe MPI is necessary for the AI surrogate
  // training thread
  int provided;
  int required = MPI_THREAD_FUNNELED;
  MPI_Init_thread(&argc, &argv, required, &provided);
  {
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
@ -548,8 +668,15 @@ int main(int argc, char *argv[]) {
      R["out_ext"] = run_params.out_ext;
      R["out_dir"] = run_params.out_dir;
      // MPI_Barrier(MPI_COMM_WORLD);
      DiffusionModule diffusion(init_list.getDiffusionInit(),
                                init_list.getInitialGrid());
      chemistry.masterSetField(init_list.getInitialGrid());
      if (run_params.use_ai_surrogate) {
-        /* Incorporate ai surrogate from R */
+        // Load default function implementations 
        R.parseEvalQ(ai_surrogate_r_library);
        /* Use dht species for model input and output */
        R["ai_surrogate_species"] =
@ -558,23 +685,53 @@ int main(int argc, char *argv[]) {
        const std::string ai_surrogate_input_script =
            init_list.getChemistryInit().ai_surrogate_input_script;
-        MSG("AI: sourcing user-provided script");
+        MSG("AI: Sourcing user-provided script");
        R.parseEvalQ(ai_surrogate_input_script);
        if (!Rcpp::as<bool>(R.parseEval("exists(\"model_file_path\")"))) {
          throw std::runtime_error("AI surrogate input script must contain a value for model_file_path");
        }
-        MSG("AI: initialize AI model");
+        /* AI surrogate training and inference parameters. (Can be set by declaring a 
-        R.parseEval("model <- initiate_model()");
+        variable of the same name in one of the the R input scripts)*/
-        R.parseEval("gpu_info()");
+        run_params.training_data_size = init_list.getDiffusionInit().n_rows *
                                        init_list.getDiffusionInit().n_cols; // Default value is number of cells in field
        if (Rcpp::as<bool>(R.parseEval("exists(\"batch_size\")"))) {
          run_params.batch_size = R["batch_size"];
        }
        if (Rcpp::as<bool>(R.parseEval("exists(\"training_epochs\")"))) {
          run_params.training_epochs = R["training_epochs"];
        }
        if (Rcpp::as<bool>(R.parseEval("exists(\"training_data_size\")"))) {
          run_params.training_data_size = R["training_data_size"];
        }
        if (Rcpp::as<bool>(R.parseEval("exists(\"use_Keras_predictions\")"))) {
          run_params.use_Keras_predictions = R["use_Keras_predictions"];
        }
        if (Rcpp::as<bool>(R.parseEval("exists(\"disable_training\")"))) {
          run_params.disable_training = R["disable_training"];
        }
        if (Rcpp::as<bool>(R.parseEval("exists(\"save_model_path\")"))) {
          run_params.save_model_path = Rcpp::as<std::string>(R["save_model_path"]);
          MSG("AI: Model will be saved as \"" + run_params.save_model_path + "\"");
        }
        if (Rcpp::as<bool>(R.parseEval("exists(\"use_k_means_clustering\")"))) {
          run_params.use_k_means_clustering = R["use_k_means_clustering"];
          MSG("K-Means clustering will be used for the AI surrogate")
        }
        if (Rcpp::as<bool>(R.parseEval("exists(\"train_only_invalid\")"))) {
          run_params.train_only_invalid = R["train_only_invalid"];
        }
        if (!Rcpp::as<bool>(R.parseEval("exists(\"model_reactive_file_path\")"))) {
          R.parseEval("model_reactive_file_path <- model_file_path");
        }
        MSG("AI: Initialize Python for the AI surrogate functions");
        std::string python_keras_file = std::string(SRC_DIR) +
          "/src/Chemistry/SurrogateModels/AI_Python_functions/keras_AI_surrogate.py";
        Python_Keras_setup(python_keras_file, run_params.cuda_src_dir);
      }
      MSG("Init done on process with rank " + std::to_string(MY_RANK));
      // MPI_Barrier(MPI_COMM_WORLD);
      DiffusionModule diffusion(init_list.getDiffusionInit(),
                                init_list.getInitialGrid());
      chemistry.masterSetField(init_list.getInitialGrid());
      Rcpp::List profiling = RunMasterLoop(R, run_params, diffusion, chemistry);
      MSG("finished simulation loop");
--- a/src/poet.hpp.in
+++ b/src/poet.hpp.in
@ -29,11 +29,13 @@
 #include <Rcpp.h>
 #define SRC_DIR "@CMAKE_SOURCE_DIR@"
 #define CUDA_SRC_DIR "@CUDA_TOOLKIT_ROOT_DIR@"
 static const char *poet_version = "@POET_VERSION@";
 // using the Raw string literal to avoid escaping the quotes
 static const inline std::string kin_r_library = R"(@R_KIN_LIB@)";
 static const inline std::string init_r_library = R"(@R_INIT_LIB@)";
 static const inline std::string ai_surrogate_r_library =
    R"(@R_AI_SURROGATE_LIB@)";
@ -67,5 +69,15 @@ struct RuntimeParameters {
  static constexpr std::uint32_t INTERP_BUCKET_ENTRIES_DEFAULT = 20;
  std::uint32_t interp_bucket_entries;
-  bool use_ai_surrogate = false;
+  /*AI surriogate configuration*/
  bool use_ai_surrogate = false; // Can be set with command line flag ---ai-surrogate
  bool disable_training = false; // Can be set in the R input script
  bool use_k_means_clustering = false; // Can be set in the R input script
  bool use_Keras_predictions = false; // Can be set in the R input script
  bool train_only_invalid = false; // Can be set in the R input script
  int batch_size = 2560; // default value determined in test on the UP Turing cluster
  int training_epochs = 20;; // Can be set in the R input script
  int training_data_size; // Can be set in the R input script
  std::string save_model_path = ""; // Can be set in the R input script
  std::string cuda_src_dir = CUDA_SRC_DIR; // From CMake
 };
Author	SHA1	Message	Date
straile	81723f81f8	feat: manipulate ai_surrogate_species	2024-11-02 17:08:22 +01:00
straile	4d5a7aadfb	test: remove species for ai	2024-11-01 19:56:00 +01:00
straile	b32927cff0	refactor: be more consistent with int and size_t	2024-10-31 19:56:14 +01:00
straile	9091117e67	docs: add k-means user script variables to readme	2024-10-30 16:51:20 +01:00
straile	8062e7528b	docs: add k-means user script variables to readme	2024-10-30 16:48:00 +01:00
straile	bf5501867b	fix: typo in function stub for compilation without -DUSE_AI_SURROGATE	2024-10-27 22:25:14 +01:00
straile	062cdb5256	feat: K-Means option fully implemented	2024-10-27 18:12:32 +01:00
straile	361b34d11d	feat: input script option to use k means, fill training data buffer accordingly	2024-10-26 13:33:01 +02:00
straile	b4d093d205	feat: cluster labels in training data buffer	2024-10-25 18:58:31 +02:00
straile	51b3608b68	feat: C++ kmeans	2024-10-25 13:20:01 +02:00
straile	2f0b84bb3e	fix: lock mutex before signal to end program	2024-10-23 18:25:32 +02:00
straile	4d254250e1	fix: set training wait predicate with buffer threshold check	2024-10-20 12:38:58 +02:00
straile	997ae32092	refactor: pre/postprocess as separate functions	2024-10-19 18:44:44 +02:00
straile	f746a566cc	timekeeping & a lot else	2024-10-19 18:29:41 +02:00
straile	a1c954df43	refactor: get training data targets directly from state_C	2024-10-19 14:43:31 +02:00
straile	29858bb6d5	fix: mutex around 'start_training' when filling training buffer	2024-10-18 17:51:38 +02:00
straile	50f820dc94	docs: CUDA found message	2024-10-17 10:51:49 +02:00
straile	db36a99462	fix: cuda not required	2024-10-17 10:21:45 +02:00
straile	589773731a	feat: ai surrogate inference time keeping	2024-10-16 15:59:04 +02:00
straile	7ae203c7dc	feat: disable training via input script	2024-10-15 15:35:27 +02:00
straile	7925534b5e	fix: Python_Finalize() no longer potentially blocking	2024-10-15 13:07:28 +02:00
straile	b127fbe7b3	fix: tenacious double free error at end of program	2024-10-15 12:27:41 +02:00
straile	a4a1eedcac	fWF	2024-10-15 11:36:12 +02:00
straile	8691370abb	Eigen model works	2024-10-15 11:17:30 +02:00
straile	e99a114bc3	revert all	2024-10-14 20:28:57 +02:00
straile	c323705f34	what is the problem?	2024-10-14 20:03:14 +02:00
straile	74cd827c68	what is the problem?	2024-10-14 19:58:34 +02:00
hans	5bfb95c2fc	feat: training data buffer size defaults to field size	2024-10-12 17:58:51 +02:00
hans	a289fc7790	feat: option to save current ai model after training	2024-10-12 16:04:24 +02:00
straile	0017a20e82	docs: update descriptions of ai surrogate options and parameters	2024-10-11 12:34:43 +02:00
straile	f7d3a7ea65	fix: function stubs for compilation without AI surrogate	2024-10-11 12:12:38 +02:00
straile	84c86a85f5	Feat: AI Surrogate inference (with Keras or C++) and parallel training	2024-10-11 11:25:00 +02:00
straile	c2535b03a7	Feat: Conditional waiting and training data buffer	2024-10-10 20:11:52 +02:00
straile	0bf51d0f02	Fix: Eigen double free	2024-10-10 14:17:28 +02:00
straile	3464ada1f1	Docs: Description of Buid flag	2024-10-09 17:33:12 +02:00
straile	d839ae4d5e	Validate predictions	2024-10-09 17:23:14 +02:00
straile	e2d96ca9b6	Feat: Build flag for AI surrogate	2024-10-09 15:25:03 +02:00
straile	a0fe891f99	Eigen inference but slow	2024-10-07 22:01:29 +02:00
straile	394e7caa49	Python prediction	2024-10-07 13:40:43 +02:00
straile	c0456e2f14	Removing AI surrogate R lib	2024-10-07 10:01:37 +02:00
straile	b4b4d76c74	Include Python.h	2024-10-07 09:50:42 +02:00