poet/README.md
2024-05-06 10:09:28 +00:00

POET

POET is a coupled reactive transport simulator implementing a parallel architecture and a fast, original MPI-based Distributed Hash Table.

POET's Coupling Scheme

Parsed code documentation

A parsed version of POET's documentation can be found at GitLab Pages.

External Libraries

The following external header library is shipped with POET:

Installation

Requirements

To compile POET, the following software must be installed:

  • C/C++ compiler (tested with GCC)
  • MPI-Implementation (tested with OpenMPI and MVAPICH)
  • R language and environment
  • CMake 3.9+
  • Eigen3 3.4+ (required by tug)
  • optional: doxygen with dot bindings for documentation

The following R libraries must then be installed; their own dependencies are pulled in automatically:

Compiling source code

Makefiles are generated with CMake. You should be able to generate them by running:

mkdir build && cd build
cmake ..

This creates the directory build, processes the CMake files, and generates Makefiles from them. You can now run make to start the build process.

If everything went well, you'll find the executable at build/app/poet. However, it is recommended to install the POET project structure to a desired CMAKE_INSTALL_PREFIX with make install.

During the generation of Makefiles, various options can be specified via cmake -D <option>=<value> [...]. Currently, there are the following available options:

  • POET_DHT_Debug=boolean - toggles the output of detailed statistics about DHT usage. Defaults to OFF.
  • POET_ENABLE_TESTING=boolean - enables a small set of unit tests (more to come). Defaults to OFF.
  • POET_PHT_ADDITIONAL_INFO=boolean - enables counting of accesses to each PHT bucket. Use with caution, as this slows things down significantly. Defaults to OFF.
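
As a minimal sketch of how these options combine on one command line, the following shell snippet assembles a cmake invocation from the three switches listed above. The helper name poet_cmake_cmd and its argument order are illustrative assumptions and not part of POET; the function only prints the command so that it can be inspected before it is run.

```shell
#!/bin/sh
# Hypothetical helper (not part of POET): print a cmake configure command
# that sets the three documented POET options. Nothing is executed.
# Usage: poet_cmake_cmd [dht_debug] [testing] [pht_info]   (each ON or OFF)
poet_cmake_cmd() {
    dht_debug=${1:-OFF}   # POET_DHT_Debug, documented default OFF
    testing=${2:-OFF}     # POET_ENABLE_TESTING, documented default OFF
    pht_info=${3:-OFF}    # POET_PHT_ADDITIONAL_INFO, documented default OFF
    printf 'cmake -DPOET_DHT_Debug=%s -DPOET_ENABLE_TESTING=%s -DPOET_PHT_ADDITIONAL_INFO=%s ..\n' \
        "$dht_debug" "$testing" "$pht_info"
}

# Example: enable DHT statistics and the unit tests, keep PHT counters off.
poet_cmake_cmd ON ON OFF
```

Run from inside the build directory, the printed line can be pasted as-is; calling the helper without arguments yields the documented defaults.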

Example: Build from scratch

Assuming that only the C/C++ compiler, MPI libraries, R runtime environment and CMake have been installed, POET can be installed as follows:

# start R environment
$ R

# install R dependencies
> install.packages(c("Rcpp", "RInside"))
> q(save="no")

# cd into POET project root
$ cd <POET_dir>

# Build process
$ mkdir build && cd build
$ cmake -DCMAKE_INSTALL_PREFIX=/home/<user>/poet ..
$ make -j<max_numprocs>
$ make install

This will install a POET project structure into /home/<user>/poet, which is referred to as <POET_INSTALL_DIR> from here on. With this version of POET, we do not recommend installing into system hierarchies such as /usr/local/.

The corresponding directory tree would look like this:

poet
├── bin
│   ├── poet
│   └── poet_init
└── share
    └── poet
        ├── barite
        │   ├── barite_200.rds
        │   ├── barite_200_rt.R
        │   ├── barite_het.rds
        │   └── barite_het_rt.R
        ├── dolo
        │   ├── dolo_inner_large.rds
        │   ├── dolo_inner_large_rt.R
        │   ├── dolo_interp.rds
        │   └── dolo_interp_rt.R
        └── surfex
            ├── PoetEGU_surfex_500.rds
            └── PoetEGU_surfex_500_rt.R

With the installation of POET, two executables are provided:

  • poet - the main executable to run simulations
  • poet_init - a preprocessor to generate input files for POET from R scripts

Preprocessed benchmarks can be found in the share/poet directory, each with a matching runtime setup. These files, and how to create them, are covered in the sections below.

Running

Run POET with mpirun ./poet [OPTIONS] <RUNFILE> <SIMFILE> <OUTPUT_DIRECTORY>, where:

  • OPTIONS - POET options (explained below)
  • RUNFILE - runtime parameters, given as an R script
  • SIMFILE - simulation input prepared by poet_init
  • OUTPUT_DIRECTORY - path where all POET output should be stored

POET options

The following parameters can be set:

| Option | Value | Description |
|---|---|---|
| --work-package-size= | 1-n | size of work packages (defaults to 5) |
| -P, --progress | | shows a progress bar |
| --dht | | enables DHT usage (defaults to OFF) |
| --dht-strategy= | 0-1 | changes the DHT strategy. NOT IMPLEMENTED YET (defaults to 0) |
| --dht-size= | 1-n | size of the DHT per process, in megabytes (defaults to 1000 MByte) |
| --dht-snaps= | 0-2 | disables or enables storage of DHT snapshots |
| --dht-file= | <SNAPSHOT> | initializes the DHT with the given snapshot file |
| --interp-size | 1-n | size of the PHT (interpolation) per process, in megabytes |
| --interp-bucket-entries | 1-n | maximum number of entries to store in one PHT bucket |
| --interp-min | 1-n | number of entries in a PHT bucket needed to start interpolation |

Additions to dht-snaps

The following values can be set:

  • 0 = snapshots are disabled
  • 1 = stores a single snapshot at the end of the simulation, named <OUTPUT_DIRECTORY>.dht
  • 2 = stores a snapshot at the end and after each iteration; the iteration snapshots are stored in <DIRECTORY>/iter<n>.dht

Example: Running from scratch

We will continue the above example and start a simulation with barite_het, whose simulation files can be found in <POET_INSTALL_DIR>/share/poet/barite/barite_het*. Transport is modeled as heterogeneous diffusion. The setup is a small 2D grid (2x5) and simulates 50 time steps with a step size of 100 seconds each. To start the simulation with 4 processes, cd into the bin directory of your installation, <POET_INSTALL_DIR>/bin, and run:

cp ../share/poet/barite/barite_het* .
mpirun -n 4 ./poet barite_het_rt.R barite_het.rds output

Once the simulation has finished, all data generated by POET can be found in the directory output.

You might want to use the DHT to cache previously simulated data and reuse it in later time steps. To activate the DHT, append --dht to the POET options. To additionally produce a DHT snapshot after each iteration, append the --dht-snaps=<value> option. The resulting call would look like this:

mpirun -n 4 ./poet --dht --dht-snaps=2 barite_het_rt.R barite_het.rds output
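
The two calls above differ only in the DHT flags, so a small wrapper can make that difference explicit. The following is a minimal sketch and not part of POET: the function name poet_run_cmd and its argument order are assumptions made for illustration. It only prints the mpirun command line and rejects --dht-snaps values outside the documented range 0-2, so the call can be checked before anything is launched.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of POET): print the mpirun command for a run.
# Usage: poet_run_cmd <nprocs> <snaps> <runfile> <simfile> <outdir>
#   snaps = ""      -> no DHT flags are added
#   snaps = 0, 1, 2 -> adds --dht and --dht-snaps=<snaps>
poet_run_cmd() {
    nprocs=$1; snaps=$2; runfile=$3; simfile=$4; outdir=$5
    opts=""
    case $snaps in
        "")    ;;                                   # DHT stays off (the default)
        0|1|2) opts="--dht --dht-snaps=$snaps " ;;  # documented snapshot modes
        *)     echo "invalid --dht-snaps value: $snaps" >&2; return 1 ;;
    esac
    printf 'mpirun -n %s ./poet %s%s %s %s\n' \
        "$nprocs" "$opts" "$runfile" "$simfile" "$outdir"
}

# Prints the DHT-enabled call shown above.
poet_run_cmd 4 2 barite_het_rt.R barite_het.rds output
```

Passing an empty string as the second argument prints the plain call without DHT, which makes it easy to compare runs with and without caching.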

Defining a model

To provide a model to POET, you need to set up an R script, which poet_init then uses to generate the simulation input. The required parameters are documented in the Wiki. We try to keep that document up to date. However, if you encounter missing information or need help, please get in touch with us via the issue tracker or email.

poet_init can be used as follows:

./poet_init [-o, --output output_file] [-s, --setwd]  <script.R>

where:

  • output - name of the output file (defaults to the input file name with the extension .rds)
  • setwd - set the working directory to the directory of the input file (e.g. to allow relative paths in the input script). However, the output file will be stored in the directory from which poet_init was called.
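
Taken together, a model passes through two stages: poet_init turns the R script into a simulation input, and poet consumes that input along with a runtime script. The sketch below only prints the two commands in order. The file names models/my_model.R and my_model_rt.R, the helper name poet_workflow, and the process count are hypothetical; passing -o explicitly avoids relying on the default output name.

```shell
#!/bin/sh
# Hypothetical helper (not part of POET): print the preprocess-then-run sequence.
# Usage: poet_workflow <model_script.R> <runtime_script.R> <nprocs> <outdir>
poet_workflow() {
    script=$1; runfile=$2; nprocs=$3; outdir=$4
    # Name the simulation input explicitly instead of relying on the default.
    simfile="$(basename "$script" .R).rds"
    # Stage 1: generate the simulation input; -s lets the script use relative paths.
    printf './poet_init -o %s -s %s\n' "$simfile" "$script"
    # Stage 2: run the simulation on the generated input.
    printf 'mpirun -n %s ./poet %s %s %s\n' "$nprocs" "$runfile" "$simfile" "$outdir"
}

# Example with placeholder file names.
poet_workflow models/my_model.R my_model_rt.R 4 output
```

Because the output file is written to the directory from which poet_init is called, the second command can refer to the generated .rds file by its bare name.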

About the usage of MPI_Wtime()

The implemented time measurement functions use MPI_Wtime(). Some important information from the Open MPI man page:

For example, on platforms that support it, the clock_gettime() function will be used to obtain a monotonic clock value with whatever precision is supported on that platform (e.g., nanoseconds).