Parallel calculations

WEASEL supports native parallelization, i.e. it not only makes use of ORCA's own parallelization but can also coordinate several ORCA runs in parallel. Depending on the available hardware, it can exploit:

  • one or more CPU cores

  • single or multiple machines

  • queuing system (e.g. Slurm) or standalone

Specifying available resources

WEASEL usually determines the available resources, i.e. the number of cores and the available memory, from the default settings file (see here for details):

[HARDWARE]
Cores = 2
Memory = 2500

Those can be overwritten by command line arguments, e.g.

weasel structure.mol2 -cores 16         # use 16 cores
weasel structure.mol2 -mem 128          # use 128 MB per core
weasel structure.mol2 -mem-total 1024   # use 1024 MB in total

Note

By default WEASEL expects memory to be provided in MB. For the sake of simplicity, memory can also be specified using the unit suffixes B, K, M, G, and T, corresponding to byte, kilobyte, megabyte, gigabyte, and terabyte, e.g. weasel structure.mol2 -mem 128G. The unit suffixes are interpreted in powers of 1000, i.e. 1000M = 1G.
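
For instance, the following invocations all request the same amount of memory per core (the last one relies on the default unit of MB):

weasel structure.mol2 -mem 2G
weasel structure.mol2 -mem 2000M
weasel structure.mol2 -mem 2000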

If none of the above-mentioned options or keywords is used, WEASEL will eventually consume all physically available cores and memory.

Note

If WEASEL is submitted to a queuing system such as Slurm, the resources are retrieved from the queuing system. Currently, WEASEL supports all queuing systems supported by Open MPI, including Slurm, PBS/Torque, GridEngine, and several more.
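
For illustration, inside a Slurm allocation no explicit -cores or -mem arguments are needed; a submission along the following lines (a sketch using standard sbatch options) lets WEASEL pick up the limits from the scheduler:

sbatch -n 16 --mem-per-cpu=2000 --wrap "weasel structure.mol2"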

Hyper-threading

In general, WEASEL and some of the programs it uses, most notably ORCA and XTB, support hyper-threading. However, in our experience hyper-threading introduces a significant performance penalty; therefore, it is disabled by default. WEASEL will also issue a warning if it detects that hyper-threading is in use. Nevertheless, hyper-threading can be turned on using the following keyword in your settings file:

[HARDWARE]
# Use 'False' to turn it off.
Use_Hyperthreading = True

The RSH command

For the parallelization to work, the RSH_COMMAND option, i.e. the command by which WEASEL invokes itself on slave nodes, has to be set in the settings file:

[SOFTWARE]
# Set rsh command for parallel Weasel runs on clusters
RSH_COMMAND=ssh -x
# Set rsh connection timeout for parallel Weasel runs on clusters (unit: seconds) [default: 15]
RSH_TIMEOUT=15

By default WEASEL uses SSH to connect to remote machines. Depending on the underlying hardware setup, this option has to be adjusted accordingly.

Note

The RSH_COMMAND can also be set via an environment variable of the same name, which takes precedence over the respective option in the settings file (shown above).
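
For example, an override from the shell could look like this (the ssh options shown are standard OpenSSH flags; adjust them to your setup):

export RSH_COMMAND="ssh -x -o ConnectTimeout=10"
weasel structure.mol2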

Note

The RSH_TIMEOUT option determines how long WEASEL waits for a response from a remote process. Depending on the workflow and the type of calculation, WEASEL will retry or abort the current step. Note that the timeout only affects the data exchange between the remote process and the main process, not the duration of the calculation itself.
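
For example, on a cluster with a slow or congested interconnect the timeout could be raised in the settings file (the value of 60 seconds is an arbitrary illustration):

[SOFTWARE]
RSH_TIMEOUT=60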

Multiple nodes

For running calculations on more than one machine (e.g. on an HPC cluster), WEASEL provides a simple interface via hostfiles or, if installed, via a queuing system.

Note

WEASEL does not support calculations on a shared filesystem. Hence, the user has to start the calculation on the head node's local filesystem and ensure that the very same absolute path does not clash on any of the other participating nodes. However, WEASEL will try to take care of creating and cleaning up this directory on the other nodes.

Hostfile

The number of cores for multiple nodes can be supplied via the command line argument -hostfile:

weasel structure.mol2 -hostfile <HOSTFILE>

The format of this file follows a simplified version of the syntax used by Open MPI. Each line consists of exactly one hostname. The number of lines with the same hostname corresponds to the number of CPU cores used on that host. The order of the hostnames does not matter. Comments and blank lines are not allowed. For a setup with 3 cores on node1 and 2 cores on node2, a hostfile would look as follows:

node1
node1
node1
node2
node2

Note

The command line option -cores overrides the total number of cores obtained from the hostfile, but it may not exceed that number.
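
For example, with the hostfile above (5 cores in total), the calculation can be restricted to 4 of them:

weasel structure.mol2 -hostfile <HOSTFILE> -cores 4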

Queuing Systems: Submission script for Slurm

For convenience and as an example, the WEASEL package contains a minimal submission script for Slurm: subweasel. It accomplishes most of the tasks required for running a WEASEL job on a Slurm cluster (a rough sketch of the pattern follows the list below):

  • Supplying a suitable job script for the job allocation.

  • Creating a scratch directory on a local filesystem.

  • Copying the results back to the base working directory.
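
The following is a minimal sketch of what such a job script might do; the scratch location and the copy-back step are assumptions for illustration, not the actual contents of subweasel:

#!/bin/bash
#SBATCH -n 16
# Node-local scratch directory (location is an assumption):
SCRATCH="/tmp/weasel.$SLURM_JOB_ID"
mkdir -p "$SCRATCH"
cp "$SLURM_SUBMIT_DIR/structure.mol2" "$SCRATCH"
cd "$SCRATCH"
weasel structure.mol2
# Copy the results back to the submission directory, then clean up:
cp -r "$SCRATCH/." "$SLURM_SUBMIT_DIR"
rm -rf "$SCRATCH"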

Additional arguments to the sbatch call can be supplied via -s. For example, allocating 16 cores and running WEASEL with 128 MB per core:

subweasel -s "--mem-per-cpu=128 -n 16" -- structure.mol2

See subweasel -h for a complete list of options, as well as sbatch -h.

Note

subweasel expects all input files (e.g. structure.mol2) to be on a shared filesystem.