Summary file

In WEASEL, all important calculated results of a simulation are collected and stored in a summary file located in the mainjob directory. This summary file is formatted as a tab-separated values (tsv) file, where each line represents a result, and the details of the simulation job are stored as tags associated with each result.

Let's consider the example of a simple water simulation discussed in the previous chapter. In this case, the summary file will contain the results of three specific calculations: preoptimization (PreOpt), optimization (Opt), and DFT single point energy (SP_DFT).

The summary file is structured such that each line corresponds to one specific property from a calculation, and the calculated properties and corresponding meta information are organized in tab-separated columns. The first line contains a timestamp unique to the WEASEL job, as well as the full command line call. The second line contains the column headers, and the calculated properties are listed from the third line onward.

2023-05-01T14:32:25.597453 -- Summary file for job: /usr/local/bin/weasel h2o.xyz
Energy [kcal/mol] / Value       Type    Calculation type        Method  Basis set       Solvent Charge  Multiplicity    Further tags
-3190.904223    SP-Energy       PreOpt  XTB     None    ALPB(H2O)       0       1
-47959.702289   SP-Energy       Opt     r2SCAN-3c       None    CPCM(Water)     0       1
-47968.012939   SP-Energy       SP_DFT  wB97X-V def2-TZVP       CPCM(Water)     0       1
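The first line of the example can be split into its timestamp and command line parts. The following is a minimal sketch, assuming the separator text " -- Summary file for job: " is the same in every summary file (this is inferred from the single example shown here) and that the timestamp is in ISO 8601 format. The function name `parse_first_line` is hypothetical and not part of WEASEL.

```python
from datetime import datetime

# First line copied from the example summary file.
first_line = (
    "2023-05-01T14:32:25.597453 -- Summary file for job: "
    "/usr/local/bin/weasel h2o.xyz"
)

# Assumption: this separator is inferred from the one example line.
SEPARATOR = " -- Summary file for job: "


def parse_first_line(line):
    """Split the first summary line into (timestamp, command line)."""
    stamp, sep, command = line.strip().partition(SEPARATOR)
    if not sep:
        raise ValueError("unexpected first line: " + line)
    return datetime.fromisoformat(stamp), command


timestamp, command = parse_first_line(first_line)
print(timestamp.date(), command)
```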

Note

The summary file is mainly meant to present the most important properties in an easily machine-readable format. We therefore chose the tsv format, as it allows large amounts of data to be stored in a well-structured way while remaining human-readable.

Important

To allow extensions of the summary file format in the future, custom scripts/tools that read the summary file should access columns by the headers defined in the second line rather than by the position (index) of the column.
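As an illustration of header-based access, the sketch below reads the water example with Python's standard `csv` module. It skips the first line, lets `csv.DictReader` take the column names from the second line, and looks up values by header name. The sample text is embedded as a string so that the example is self-contained. The helper name `read_summary` is hypothetical.

```python
import csv
import io

# Content of the water example; columns are separated by tabs.
SUMMARY_TEXT = (
    "2023-05-01T14:32:25.597453 -- Summary file for job: "
    "/usr/local/bin/weasel h2o.xyz\n"
    "Energy [kcal/mol] / Value\tType\tCalculation type\tMethod\tBasis set"
    "\tSolvent\tCharge\tMultiplicity\tFurther tags\n"
    "-3190.904223\tSP-Energy\tPreOpt\tXTB\tNone\tALPB(H2O)\t0\t1\n"
    "-47959.702289\tSP-Energy\tOpt\tr2SCAN-3c\tNone\tCPCM(Water)\t0\t1\n"
    "-47968.012939\tSP-Energy\tSP_DFT\twB97X-V\tdef2-TZVP\tCPCM(Water)\t0\t1\n"
)


def read_summary(stream):
    """Return the data rows as dicts keyed by the headers in line 2."""
    next(stream)  # skip line 1 (timestamp and command line call)
    reader = csv.DictReader(stream, delimiter="\t")
    return list(reader)


rows = read_summary(io.StringIO(SUMMARY_TEXT))

# Look up columns by header name, never by position.
energies = {
    row["Calculation type"]: float(row["Energy [kcal/mol] / Value"])
    for row in rows
}
print(energies["SP_DFT"])
```

For a real job, the `io.StringIO` object would be replaced by the opened summary file. Rows without an entry in the last column get `None` for "Further tags", so a script written this way should keep working if further columns are added later.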

In the tutorials that follow, the summary file will be presented in a fixed format and the first line will be omitted to improve readability.

Energy [kcal/mol] / Value  Type        Calculation type  Method      Basis set   Solvent     Charge   Multiplicity Further tags
-3190.904223               SP-Energy   PreOpt            XTB         None        ALPB(H2O)   0        1
-47959.702289              SP-Energy   Opt               r2SCAN-3c   None        CPCM(Water) 0        1
-47968.012939              SP-Energy   SP_DFT            wB97X-V     def2-TZVP   CPCM(Water) 0        1

This modified format used in the tutorials retains the essence of the original summary file format, but is adapted to enhance the clarity of the concepts discussed below.

Note

A copy of any reporting file, i.e. report, summary, warning and error files, can be written to a second path during runtime using the argument -reportdir /second/path. This can be useful if many jobs are carried out on a remote cluster.

We will look at the different summary outputs for individual workflows later. But for now, let us look briefly at an example of a command line call for the known case of water to see what an extended summary file might look like. In the given example, first, the frequencies and corresponding free energy values are calculated using the -freq option. Following that, an energy recalculation is performed at the DLPNO-CCSD(T) level using the -spwf option. In this case, the molecule being studied is once more water:

$ weasel water.xyz -freq -spwf

This calculation results in a summary file containing the following information:

Energy [kcal/mol] / Value  Type           Calculation type  Method         Basis set   Solvent     Charge  Multiplicity Further tags
-3190.904223               SP-Energy      PreOpt            XTB            None        ALPB(H2O)   0       1
-47959.702289              SP-Energy      Opt               r2SCAN-3c      None        CPCM(Water) 0       1
 15.679222                 SP-Energy      H_correction      r2SCAN-3c      None        CPCM(Water) 0       1
-13.450377                 SP-Energy      -TS_correction    r2SCAN-3c      None        CPCM(Water) 0       1
 2.228845                  SP-Energy      G_correction      r2SCAN-3c      None        CPCM(Water) 0       1
-47968.012940              SP-Energy      SP_DFT            wB97X-V        def2-TZVP   CPCM(Water) 0       1
-47965.784095              Gibbs Energy   G_DFT             wB97X-V        def2-TZVP   CPCM(Water) 0       1
-47902.494771              SP-Energy      SP_WF             DLPNO-CCSD(T)  def2-TZVP   CPCM(Water) 0       1
-47900.265926              Gibbs Energy   G_WF              DLPNO-CCSD(T)  def2-TZVP   CPCM(Water) 0       1

Note

After a WEASEL job is completed, the entries are sorted block by block according to the energy or property in the first column. A block is defined by the meta information given in the columns following the first one.

Looking at the example, you will notice that the results of the preoptimization and the optimization are shown again. These results include not only the energy values, but also information about the chosen level of theory, the solvent, and the molecular multiplicity and charge.

Following these results are the results of the frequency calculation, labeled H_correction, -TS_correction and G_correction. These represent the enthalpy correction, the entropy term, and the resulting Gibbs free energy correction, respectively, based on the frequency calculation after the DFT optimization. Importantly, these values are calculated at the same level of theory as the optimization.
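The three correction values in the example are consistent with G_correction being the sum of H_correction and -TS_correction. The short check below uses the values from the summary above. The relation is read off from these numbers and the labels, and is not a statement about how WEASEL computes the values internally.

```python
import math

# Values in kcal/mol, copied from the r2SCAN-3c rows of the summary.
h_correction = 15.679222
minus_ts_correction = -13.450377  # the label "-TS" already carries the sign
g_correction = 2.228845

# Because the entropy term is stored with its sign, the terms are added.
computed = h_correction + minus_ts_correction

# Compare within the six printed decimals.
assert math.isclose(computed, g_correction, rel_tol=0.0, abs_tol=1e-6)
print(f"G_correction = {computed:.6f} kcal/mol")
```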

Note

Frequency calculations are always performed at the same level of theory as the optimization (Opt).

Next, the result of the SP_DFT calculation (i.e. a single point calculation at the DFT level) is listed, along with the corresponding Gibbs free energy (G_DFT), which includes the Gibbs free energy correction derived from the frequency calculation. Finally, the results of the DLPNO-CCSD(T) calculation are listed: the energy at that level (SP_WF) and the corresponding Gibbs free energy.
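In the example, each Gibbs energy entry equals the single point energy at that level plus the G_correction from the frequency calculation. The sketch below verifies this for both levels using the values from the summary. The dictionary keys "DFT" and "WF" are just labels chosen for this illustration.

```python
import math

# Gibbs free energy correction (r2SCAN-3c), in kcal/mol.
g_correction = 2.228845

# (single point energy, reported Gibbs energy) in kcal/mol, from the summary.
levels = {
    "DFT": (-47968.012940, -47965.784095),  # wB97X-V / def2-TZVP
    "WF": (-47902.494771, -47900.265926),   # DLPNO-CCSD(T) / def2-TZVP
}

# Add the same correction to the single point energy of each level.
gibbs = {name: sp + g_correction for name, (sp, _) in levels.items()}

for name, (_, reported) in levels.items():
    # Compare within the six printed decimals.
    assert math.isclose(gibbs[name], reported, rel_tol=0.0, abs_tol=1e-6)
    print(f"{name}: {gibbs[name]:.6f} kcal/mol")
```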