Orchestration using Recipes

With Recipes, Bee can attach several WEASEL runs (Workflows) to a Molecule, where some Workflows start from Conformers generated by earlier Workflows of the same Molecule.

So far, two Recipe types are implemented:

$ bee recipe types -w
aiqu-large      ai.qu Recipe for large molecules
    aiqu_confsearch
    aiqu_nms
    aiqu_sp_water
    aiqu_sp_acetonitrile
    aiqu_sp_chloroform
    aiqu_sp_benzene
    aiqu_sp_excitedstate
aiqu-small      ai.qu Recipe for small molecules
    aiqu_confsearch2
    aiqu_nms2
    aiqu_sp_water
    aiqu_sp_acetonitrile
    aiqu_sp_chloroform
    aiqu_sp_benzene
    aiqu_sp_excitedstate

Bulk adding Molecules

To add Molecules to the database via the command-line client, use the following command:

$ bee mol add -h
Usage: bee molecule add [OPTIONS] [MOLECULE]

  Bulk adding molecule(s) to database

  For batch input (stdin or file) use the following format (tab separated)

  <SMILES>    <MULTIPLICITY>  <TAGS>  <IDENTIFIERS>   <RECIPES>

  COCl     1  foo,bar     id_1,id3,id4    aiqu-large,aiqu-small

Options:
  -c, --stdin                     Read molecules from stdin
  -f, --file FILE                 Read molecules from file
  -i, --identifier TEXT           Identifiers
  -m, --multiplicity INTEGER      Multiplicity  [default: 1]
  -r, --recipe [aiqu-large|aiqu-small]
                                  Recipes for Molecule  [default: aiqu-large]
  -t, --tag TEXT                  Tags
  -h, --help                      Show this message and exit.

For testing, the command can add a single Molecule given by its SMILES string as an argument, but it usually reads its input in bulk from stdin or from a tab-separated file. For each Molecule, it also creates a Recipe in the database.
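The batch format described in the help text can be prepared with standard shell tools. The following is a minimal sketch, not an official workflow: the file name molecules.tsv, the second molecule, and its tags and identifiers are made-up placeholders, and the awk check is only a local sanity test of the five-column layout. The bee call at the end uses the --file option listed above and is skipped if the client is not installed.

```shell
# Build a two-line, tab-separated batch file. Column order follows the help
# text: SMILES, MULTIPLICITY, TAGS, IDENTIFIERS, RECIPES.
# The file name and the second entry are illustrative placeholders.
batch_file=molecules.tsv
printf '%s\t%s\t%s\t%s\t%s\n' \
    'COCl' 1 'foo,bar' 'id_1,id3,id4' 'aiqu-large,aiqu-small' \
    'CCO'  1 'foo'     'id_5'         'aiqu-small' > "$batch_file"

# Sanity check: every line must have exactly five tab-separated fields.
awk -F'\t' 'NF != 5 { printf "line %d: %d fields\n", NR, NF; bad = 1 }
            END { exit bad }' "$batch_file"

# Hand the file to bee only if the client is available on this machine.
if command -v bee >/dev/null 2>&1; then
    bee mol add --file "$batch_file"
fi
```

Writing the file with printf rather than typing it by hand avoids editors that silently turn tabs into spaces, which would break the column layout.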

Here is an example run:

Running a Recipe

The CLI command bee recipe overview gives a summary of a Recipe's Workflows along with their states:

$ bee recipe overview 2
Recipe 1 (aiqu-large) of Molecule 2: READY
    aiqu_confsearch          [    ]: READY      Conformers: 2
    aiqu_nms                 [    ]: PENDING    Conformers:
    aiqu_sp_water            [    ]: PENDING    Conformers:
    aiqu_sp_acetonitrile     [    ]: PENDING    Conformers:
    aiqu_sp_chloroform       [    ]: PENDING    Conformers:
    aiqu_sp_benzene          [    ]: PENDING    Conformers:
    aiqu_sp_excitedstate     [    ]: PENDING    Conformers:
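For scripting, the overview can be condensed into a per-state count. The sketch below assumes the output layout shown above, where each Workflow line contains "]: " followed by an upper-case state. The function name count_workflow_states is hypothetical and not part of bee.

```shell
# Count Workflows per state in `bee recipe overview` output read from stdin.
# Assumes the layout shown above; the function name is illustrative only.
count_workflow_states() {
    awk '
        # Workflow lines contain "]: STATE"; the Recipe header line does not.
        match($0, /\]: [A-Z]+/) {
            state = substr($0, RSTART + 3, RLENGTH - 3)
            count[state]++
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Usage (requires bee): bee recipe overview 2 | count_workflow_states
```

Applied to the overview shown above, this would report six PENDING Workflows and one READY Workflow.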

The command bee recipe run executes all WEASEL Workflows serially, resolving any dependencies:

Filtering Recipes by state

The API supports filtering Recipes by state, e.g.:

$ bee recipe list --state FINISHED
|  id |    type    | job_id |  state   | tags | molecule_id |        workflow_ids       |
+-----+------------+--------+----------+------+-------------+---------------------------+
| 101 | aiqu-large |        | FINISHED |      |     101     |    2, 3, 4, 6, 7, 8, 9    |
| 102 | aiqu-large |        | FINISHED |      |     102     | 5, 10, 11, 12, 13, 14, 15 |

This way, one can easily obtain the ids of the Recipes that are ready to run:

$ bee recipe list --ids --state READY
1
2
3
...
98
99

and queue them with a scheduler, e.g. with Slurm:

$ for id in $(bee recipe list --ids --state READY); do
>    sbatch bee recipe run --project-dir $PWD $id
> done
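Before submitting many jobs, it can help to preview the submissions. The following sketch wraps the loop in a hypothetical helper, queue_recipes, which reads Recipe ids from stdin and skips anything that is not a plain integer. It passes the command through sbatch's --wrap option, which is one way to submit a single command line without a batch script. The DRY_RUN variable is an assumption of this sketch, not a bee or Slurm feature.

```shell
# Submit one Slurm job per Recipe id read from stdin.
# With DRY_RUN=1 the sbatch commands are only printed, not executed.
# queue_recipes and DRY_RUN are illustrative names, not part of bee.
queue_recipes() {
    while read -r id; do
        # Skip blank lines and anything that is not a plain integer id.
        case "$id" in
            ''|*[!0-9]*) continue ;;
        esac
        cmd="bee recipe run --project-dir $PWD $id"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "sbatch --wrap \"$cmd\""
        else
            sbatch --wrap "$cmd"
        fi
    done
}

# Usage (requires bee and Slurm):
#   bee recipe list --ids --state READY | DRY_RUN=1 queue_recipes
```

Running it once with DRY_RUN=1 shows exactly which jobs would be queued. Dropping the variable submits them.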