JAMS includes a preprocessing option that allows a model to be modified automatically prior to its execution. Like a standard JAMS model, the so-called JAMS preprocessors are organized as a set of components. However, while the components of a JAMS model serve to process data and simulate environmental or anthropogenic processes, the components of a preprocessor apply automatic modifications to the model data and structure itself, i.e. they can be seen as a meta-model. Preprocessors are basically normal JAMS components which are run in a preprocessing step. Since each preprocessor is related to a specific JAMS model, the preprocessor definition is part of that JAMS model.
While preprocessors can be used for all kinds of model modifications, their most common and primary use case is the automated parallelization of a model. Figure 1 shows an example of such a preprocessor. Here, JAMS applies the ConcurrentContextProcessor (marked by “1”) to the HRULoop (marked by “2”). It will basically do two things:
- The overall set of HRUs is split into n subsets, where n is the number of the computer’s processor cores. These subsets are chosen such that they are spatially independent, i.e. there are no interactions between HRUs from two different subsets. This “segmentation” is done by a separate class which is given as a parameter, EntityPartitioner in this case (marked by “3”). EntityPartitioner will do two things:
- It generates all independent HRU subsets by identifying all HRUs that drain to a reach and then collecting their respective sub-basins, i.e. all upstream HRUs.
- These subsets are then joined into n groups of roughly equal size (see the partitioning sketch after this list).
Another segmentation component is SubbasinPartitioner, which can be applied to multi-routing models and makes use of the “subbasin” attribute generated for each HRU by WebHRU.
- Once the n HRU subsets have been created, the modification of the model structure starts. The ConcurrentContextProcessor creates n copies of the HRULoop together with all of their inner components. Each of these n HRULoops is assigned one of the n HRU subsets created in the previous step. In addition, the n HRULoops are placed into a new “parent” context which runs its children in parallel. During the simulation, these n HRULoops therefore run at the same time within the current time step; once all of them have finished, the model moves on to the next time step (see the second sketch after this list). Since the duration of a time step is determined by the longest-running HRULoop, the HRU subsets need to be as equal in size as possible.
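The grouping step performed by the partitioner can be pictured with a small, self-contained sketch. This is not JAMS code; the class and method names (HruGrouping, partitionIntoGroups) and the integer HRU IDs are hypothetical. It assumes the spatially independent subsets have already been identified and greedily assigns each one to the currently smallest group, so that the n groups end up with roughly the same number of HRUs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: greedy balancing of independent HRU subsets
// into n groups of roughly equal total size (not actual JAMS code).
public class HruGrouping {

    /** Assigns each independent subset (e.g. one sub-basin) to the currently
     *  smallest group, so the n groups end up roughly equal in size. */
    static List<List<Integer>> partitionIntoGroups(List<List<Integer>> subsets, int n) {
        // Placing the largest subsets first makes the greedy balancing more even.
        List<List<Integer>> sorted = new ArrayList<>(subsets);
        sorted.sort(Comparator.comparingInt((List<Integer> s) -> s.size()).reversed());

        List<List<Integer>> groups = new ArrayList<>();
        // Priority queue ordered by current group size (smallest first).
        PriorityQueue<int[]> bySize = new PriorityQueue<>(Comparator.comparingInt(a -> a[1]));
        for (int i = 0; i < n; i++) {
            groups.add(new ArrayList<>());
            bySize.add(new int[] { i, 0 });
        }
        for (List<Integer> subset : sorted) {
            int[] smallest = bySize.poll();
            groups.get(smallest[0]).addAll(subset);
            smallest[1] += subset.size();
            bySize.add(smallest);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Each inner list stands for one spatially independent HRU subset
        // (all HRUs upstream of one reach); the IDs are made up.
        List<List<Integer>> subsets = List.of(
                List.of(1, 2, 3, 4, 5), List.of(6, 7), List.of(8, 9, 10),
                List.of(11), List.of(12, 13, 14, 15), List.of(16, 17, 18));
        partitionIntoGroups(subsets, 3).forEach(g ->
                System.out.println("group of " + g.size() + " HRUs: " + g));
    }
}
```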
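The second step, running the n copies of the HRULoop concurrently within each time step, can be pictured as follows. Again a hypothetical sketch rather than JAMS code: HruLoopCopy, the thread-pool handling and the placeholder workload are assumptions. The important point it illustrates is that all copies are started for the current time step and the model only advances once the last (i.e. slowest) copy has finished.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the parallel "parent" context: one worker per
// HRULoop copy, with a barrier at the end of every time step (not JAMS code).
public class ParallelHruLoops {

    /** Stands in for one duplicated HRULoop working on its own HRU group. */
    record HruLoopCopy(List<Integer> hruGroup) {
        void runTimeStep(int timeStep) {
            // In the real model, all components inside the loop would be
            // executed here for every HRU of this group.
            for (int hru : hruGroup) {
                Math.sqrt(hru * (double) timeStep); // placeholder workload
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        // One HRULoop copy per group; the groups would come from the partitioner.
        List<HruLoopCopy> loops = List.of(
                new HruLoopCopy(List.of(1, 2, 3)),
                new HruLoopCopy(List.of(4, 5)),
                new HruLoopCopy(List.of(6, 7, 8)));

        for (int t = 0; t < 365; t++) {
            final int timeStep = t;
            List<Callable<Void>> tasks = loops.stream()
                    .map(loop -> (Callable<Void>) () -> { loop.runTimeStep(timeStep); return null; })
                    .toList();
            // invokeAll blocks until every HRULoop copy has finished this
            // time step, so the slowest group determines the step duration.
            pool.invokeAll(tasks);
        }
        pool.shutdown();
    }
}
```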
During the duplication of the HRULoops that are run in parallel, a few additional things need to be taken care of. The SpatialWeightedSumAggregators, for example, sum up HRU values at a given time step. To avoid two or more parallel SpatialWeightedSumAggregators interfering by writing to the same target variable (thereby overwriting each other’s results), they have to be excluded from the parallel execution. These exclusions are defined by the “exclude_component” attribute (marked by “4”), which in this case also lists the routing components and the AreaFraction components.
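The interference mentioned above is the classic lost-update problem: when several threads add to the same target variable without coordination, some updates overwrite others. The following generic Java sketch (not JAMS code; the shared field and loop counts are made up) shows the effect with a plain shared sum; excluding such a component from the parallel part avoids exactly this situation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Demonstrates why a component that writes to a shared target variable
// must not run inside the parallel part without special care.
public class LostUpdateDemo {

    static double sharedSum = 0.0; // shared target variable, unsynchronized

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int t = 0; t < 4; t++) {
            pool.submit(() -> {
                for (int i = 0; i < 100_000; i++) {
                    sharedSum += 1.0; // read-modify-write is not atomic
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        // Typically prints less than the expected 400000.0 because
        // concurrent updates overwrite each other.
        System.out.println("sharedSum = " + sharedSum);
    }
}
```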
Once the parameters have been set for a specific model, this approach works for any region without manual interaction by the modeller. However, because only part of the model runs in parallel, the efficiency (i.e. the so-called Speedup) of the approach is not perfect. In a typical application, the simulation time will be about 50% of that of the standard model when using 4 processor cores. In this case, the Speedup is 2 while the ideal Speedup would be 4. To provide control over the number of processor cores used by the model, the percentage of available cores can be set via a scale_factor (marked by “5”; 1 means 100%).
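These numbers are consistent with Amdahl’s law, assuming the parallelized HRULoop scales ideally: a Speedup of 2 on 4 cores implies that roughly two thirds of the runtime is spent in the parallel part. The small sketch below works through that arithmetic; the parallel fraction p is derived from the figures in the text, not a measured value.

```java
// Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n), where p is the
// fraction of the runtime that runs in parallel (here the HRULoop).
public class SpeedupEstimate {
    public static void main(String[] args) {
        double p = 2.0 / 3.0; // parallel fraction implied by Speedup 2 on 4 cores
        for (int cores : new int[] { 1, 2, 4, 8, 16 }) {
            double speedup = 1.0 / ((1.0 - p) + p / cores);
            System.out.printf("%2d cores -> speedup %.2f%n", cores, speedup);
        }
    }
}
```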
Pre-processing of models can be switched off in the global JAMS configuration.