.. _parallel:

Parallel calculations
=====================

WEASEL supports native parallelization, i.e. it not only makes use of ORCA's
own parallelization, but it can also coordinate several ORCA runs in parallel.
Depending on the available hardware it exploits:

- one or more CPU cores
- single or multiple machines
- a queuing system (e.g. Slurm) or standalone operation

Specifying available resources
------------------------------

WEASEL usually determines the available resources, i.e. the number of cores
and the available memory, from the default settings file (see :ref:`here`
for details)::

    [HARDWARE]
    Cores = 2
    Memory = 2500

These can be overridden by command line arguments, e.g.

.. prompt:: bash $

    weasel structure.mol2 -cores 16       # use 16 cores
    weasel structure.mol2 -mem 128        # use 128 MB per core
    weasel structure.mol2 -mem-total 1024 # use 1024 MB in total

.. note::

    By default WEASEL expects memory to be provided in MB. For the sake of
    simplicity, memory can also be specified using unit suffixes: B, K, M, G,
    and T, corresponding to byte, kilobyte, megabyte, gigabyte, and terabyte,
    e.g. ``weasel structure.mol2 -mem 128G``. The unit suffixes are
    interpreted in powers of 1000, i.e. 1000M = 1G.

If none of the above-mentioned options or keywords is used, WEASEL will
eventually consume all physically available cores and memory.

.. note::

    If WEASEL is submitted to a queuing system such as Slurm, the resources
    are retrieved from the queuing system. Currently, WEASEL supports all
    queuing systems supported by Open MPI, including Slurm, PBS/Torque,
    GridEngine, and several more.

Hyper-threading
...............

In general, WEASEL and some of the programs it uses, most notably ORCA and
XTB, support hyper-threading. However, in our experience hyper-threading
introduces a significant performance penalty; it is therefore disabled by
default. WEASEL will also issue a warning if it detects that hyper-threading
is in use. Nevertheless, hyper-threading can be turned on using the following
keyword in your settings file::

    [HARDWARE]
    # Use 'False' to turn it off.
    Use_Hyperthreading = True

The RSH command
...............

For the parallelization to work, the :code:`RSH_COMMAND` option, i.e. the
command by which WEASEL invokes itself on slave nodes, has to be set in the
settings files::

    [SOFTWARE]
    # Set rsh command for parallel WEASEL runs on clusters
    RSH_COMMAND=ssh -x
    # Set rsh connection timeout for parallel WEASEL runs on clusters
    # (unit: seconds) [default: 15]
    RSH_TIMEOUT=15

By default WEASEL uses SSH to connect to remote machines. Depending on the
underlying hardware setup, this option has to be adjusted accordingly.

.. note::

    The :code:`RSH_COMMAND` option can also be set via an environment
    variable of the same name, which takes precedence over the respective
    option in the settings files (shown above).
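For example, to override the settings-file value for a single run,
``RSH_COMMAND`` can be exported in the invoking shell. A minimal sketch (the
``-o BatchMode=yes`` flag is only an illustration of a common non-interactive
SSH setup, not a WEASEL requirement):

.. prompt:: bash $

    # Takes precedence over RSH_COMMAND in the settings files.
    export RSH_COMMAND="ssh -x -o BatchMode=yes"
    weasel structure.mol2 -cores 16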
.. note::

    The :code:`RSH_TIMEOUT` option determines how long WEASEL waits for a
    response from a remote process. Depending on the workflow and the type of
    calculation, WEASEL will either retry or abort the current step. Note
    that the timeout only affects the data exchange between the remote
    process and the main process, not the duration of the calculation itself.

Multiple nodes
--------------

For running calculations on more than one machine (e.g. on an HPC cluster),
WEASEL provides a simple interface via *hostfiles* or, if installed, via a
queuing system.

.. note::

    WEASEL does not support calculations on a shared filesystem. Hence, the
    user has to start the calculation on the headnode's local filesystem and
    ensure that the very same absolute path does not clash with existing data
    on any of the other participating nodes. However, WEASEL will try to take
    care of creating and cleaning up this directory on the other nodes.

Hostfile
........

The hostnames and the number of cores for multiple nodes can be supplied via
the command line argument ``-hostfile``:

.. prompt:: bash $

    weasel structure.mol2 -hostfile <hostfile>

The format of this file follows a simplified version of the syntax used by
`Open MPI <https://www.open-mpi.org/>`_. Each line consists of exactly one
hostname. The number of lines with the same hostname corresponds to the
number of CPU cores used on that host. The order of the hostnames does not
matter. Comments and blank lines are not allowed. For a setup with 3 cores on
``node1`` and 2 cores on ``node2``, a hostfile would look as follows:

.. code-block::
    :linenos:

    node1
    node1
    node1
    node2
    node2

.. note::

    The command line option ``-cores`` overrides the total number of cores
    obtained from the *hostfile*, but it may not exceed it.

Queuing Systems: Submission script for Slurm
............................................

For convenience and as an example, the WEASEL package contains a minimal
submission script for Slurm: ``subweasel``. It accomplishes most of the tasks
required for running a WEASEL job on a Slurm cluster:

- Supplying a suitable job script for the job allocation.
- Creating a scratch directory on a local filesystem.
- Copying the results back to the base directory of the working directory.

Additional arguments to the ``sbatch`` call can be supplied via ``-s``. For
example, allocating 16 cores and running WEASEL with 128 MB per core:

.. prompt:: bash $

    subweasel -s "--mem-per-cpu=128 -n 16" -- structure.mol2

See ``subweasel -h`` for a complete list of options, as well as ``sbatch -h``.

.. note::

    ``subweasel`` expects all input files (e.g. ``structure.mol2``) to be on
    a shared filesystem.
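If the automatic resource detection does not cover your queuing system, a
hostfile in the format described above can also be generated by hand from
within a Slurm allocation. A minimal sketch, assuming one task per allocated
core (``weasel.hosts`` is a placeholder filename):

.. prompt:: bash $

    # One hostname per allocated task, i.e. per core with one CPU per task.
    srun hostname | sort > weasel.hosts
    weasel structure.mol2 -hostfile weasel.hosts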