GROMACS: Difference between revisions
Latest revision as of 22:07, 24 October 2024
General
- GROMACS home site: http://www.gromacs.org/
GROMACS is a versatile package to perform molecular dynamics,
i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.
Using GROMACS on ARC
Researchers using GROMACS on ARC are expected to be generally familiar with its capabilities, input file types and their formats and the use of checkpoint files to restart simulations.
Like other calculations on ARC systems, GROMACS is run by submitting an appropriate script for batch scheduling using the sbatch command. For more information about submitting jobs, see the Running jobs article.
GROMACS modules
Currently there are several software modules on ARC that provide different versions of GROMACS. The versions differ in the release date as well as the CPU architecture the software is compiled for.
You can see them using the module command:
$ module avail gromacs
----------- /global/software/Modules/3.2.10/modulefiles ---------
gromacs/2016.3-gnu  gromacs/2018.0-gnu  gromacs/2019.6-legacy  gromacs/2019.6-skylake
gromacs/2020.6      gromacs/2022.6      gromacs/2023.4         gromacs/2024.3
The names of the modules give hints on the specific version of GROMACS they provide access to.
- The gnu suffix indicates that those versions have been compiled with the GNU GCC compiler.
Module specific information
- Version 2024.3 was built for OpenMP (gmx) and OpenMP + MPI (gmx_mpi) programming models.
- GPU support is provided by CUDA 12.1.1.
- GCC 9.4.0 was used for this build. Version 9 is the minimum requirement to build this version of GROMACS.
- Use gmx --version and gmx_mpi --version for more details.
- Version 2023.4 was built for OpenMP (gmx) and OpenMP + MPI (gmx_mpi) programming models.
- Due to technical limitations only the OpenMP + MPI build has GPU support via CUDA 12.1.1.
- GCC 9.4.0 was used for this build. Version 9 is the minimum requirement to build this version of GROMACS.
- Version 2022.6 was built for OpenMP (gmx) and OpenMP + MPI (gmx_mpi) programming models.
- GPU support is provided by CUDA 12.1.1.
- GCC 8.5.0 was used for this build.
- Version 2020.6 was built for OpenMP (gmx) and OpenMP + MPI (gmx_mpi) programming models.
- Due to technical limitations there is no GPU support for these builds.
- GCC 8.3.1 was used for this build.
- Version 2019.6 was compiled for two different CPU kinds, the old kind, legacy, and the new kind, skylake.
- The legacy module should be used on compute nodes older than 2019, and the skylake module is for nodes from 2019 onward.
- GCC 7.3.0 was used for this build.
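The choice between the two 2019.6 modules can be sketched as a small shell helper. This is only an illustration: the function name module_for_partition is hypothetical, and it assumes that the legacy partitions are exactly the ones named on this page (parallel, cpu2013, lattice) and that every other partition has the newer nodes.

```shell
#!/bin/bash
# Sketch: pick a GROMACS 2019.6 module from a partition name.
# Assumption: only parallel, cpu2013 and lattice are legacy partitions;
# anything else is treated as a 2019-or-newer (skylake) partition.
module_for_partition() {
    case "$1" in
        parallel|cpu2013|lattice) echo "gromacs/2019.6-legacy" ;;
        *)                        echo "gromacs/2019.6-skylake" ;;
    esac
}

# Inside a job Slurm sets SLURM_JOB_PARTITION; outside a job the name is empty
# and the helper falls back to the skylake module.
echo "Suggested module: $(module_for_partition "${SLURM_JOB_PARTITION:-}")"
```

In a job script the result could be passed to module load instead of hard-coding the module name.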
Loading Gromacs modules
A module has to be loaded before GROMACS can be used on ARC. Like this:
$ gmx --version
bash: gmx: command not found...

$ module load gromacs/2019.6-legacy

$ gmx --version
:-) GROMACS - gmx, 2019.6 (-:
GROMACS is written by:
Emile Apol Rossen Apostolov Paul Bauer Herman J.C. Berendsen
.....
.....
GROMACS version: 2019.6
Precision: single
Memory model: 64 bit
MPI library: none
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: SSE4.1
FFT library: fftw-3.3.7-sse2-avx
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: hwloc-1.11.8
Tracing support: disabled
C compiler: /global/software/gcc/gcc-7.3.0/bin/gcc GNU 7.3.0
C compiler flags: -msse4.1 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler: /global/software/gcc/gcc-7.3.0/bin/g++ GNU 7.3.0
C++ compiler flags: -msse4.1 -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler: /global/software/cuda/cuda-10.0.130/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2018 NVIDIA Corporation;Built on Sat_Aug_25_21:08:01_CDT_2018;Cuda compilation tools, release 10.0, V10.0.130
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;; ;-msse4.1;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver: 9.10
CUDA runtime: N/A
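A script can guard against the "command not found" situation shown above before it tries to run anything. The following is a minimal sketch; the helper name have_command is hypothetical, and it relies only on the standard shell built-in command -v.

```shell
#!/bin/bash
# Return success if the named command can be found on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command gmx; then
    echo "gmx found at: $(command -v gmx)"
else
    echo "gmx not found; load a GROMACS module first, e.g. module load gromacs/2019.6-legacy"
fi
```

The same check works for gmx_mpi in scripts that use the MPI build.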
Running a GROMACS Job
To run your simulation on the ARC cluster you need: (1) a set of GROMACS input files and (2) a SLURM job script .slurm.
Place your input files for a simulation into a separate directory and prepare an appropriate job script for it.
$ ls -l
-rw-r--r-- 1 drozmano drozmano 7224622 Jan 29  2014 bilayer.gro
-rw-r--r-- 1 drozmano drozmano    2567 May 21 12:14 bilayer.mdp
-rw-r--r-- 1 drozmano drozmano      87 Apr 29  2016 bilayer.top
drwxr-xr-x 1 drozmano drozmano     200 May 20 16:27 ff
-rw-r--r-- 1 drozmano drozmano     504 May 20 16:30 ff.top
-rwxr-xr-x 1 drozmano drozmano    1171 May 21 13:45 job.slurm
Here:
- bilayer.top -- contains the topology of the system.
- bilayer.gro -- contains the initial configuration (positions of atoms) of the system.
- bilayer.mdp -- contains parameters of the simulation run (GROMACS settings).
- ff -- a directory containing an external custom force field (models). Not required if only standard models are used.
- ff.top -- a topology file that includes required models from available force fields.
- This file is included by the bilayer.top file.
- job.slurm -- a SLURM job script that is used to submit this calculation to the cluster.
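Before submitting, it can help to confirm that the files listed above are actually in the run directory. The sketch below is one possible way to do that; missing_inputs is a hypothetical helper name, and the file names are the ones from this example.

```shell
#!/bin/bash
# Print the name of every listed file that is missing from directory $1.
# Returns non-zero if at least one file is missing.
missing_inputs() {
    local dir="$1"; shift
    local missing=0 f
    for f in "$@"; do
        if [ ! -e "$dir/$f" ]; then
            echo "$f"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check the bilayer inputs in the current directory.
if missing=$(missing_inputs . bilayer.gro bilayer.mdp bilayer.top job.slurm); then
    echo "All input files are present."
else
    echo "Missing input files:" $missing
fi
```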
On ARC, you can get an example with the files shown above with the commands:
$ tar xvf /global/software/gromacs/tests-2019/bilayer.tar.bz2
$ cd bilayer
$ ls -l
....
At this point you can submit your job to ARC's scheduler (SLURM).
$ sbatch job.slurm
Submitted batch job 5570681
You can check the status of the job using the job ID from the confirmation message above.
$ squeue -j 5570681
JOBID    USER      STATE    TIME_LIMIT  TIME  NODES  TASKS  CPUS  MIN_MEMORY  NODELIST
5570681  drozmano  RUNNING  5:00:00     0:15  1      1      40    8G          fc1
The squeue command output may look different in your case depending on your settings.
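If the submission is scripted, the job ID can be pulled out of the confirmation message rather than copied by hand. This is a minimal sketch that assumes the message has the "Submitted batch job N" form shown above; parse_job_id is a hypothetical helper name.

```shell
#!/bin/bash
# Extract the numeric job ID from sbatch's confirmation message.
# Prints nothing if the text does not look like a confirmation.
parse_job_id() {
    echo "$1" | awk '/^Submitted batch job/ {print $4}'
}

msg="Submitted batch job 5570681"    # sample message copied from this page
jobid=$(parse_job_id "$msg")
echo "Check the job with: squeue -j $jobid"
```

As an alternative, sbatch has a --parsable option that prints the job ID without the surrounding text.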
After the job is over and it no longer shows in the squeue output, we can check the results.
$ ls -l
-rw-r--r-- 1 drozmano drozmano 7224622 Jan 29  2014 bilayer.gro
-rw-r--r-- 1 drozmano drozmano    2567 May 21 12:14 bilayer.mdp
-rw-r--r-- 1 drozmano drozmano      87 Apr 29  2016 bilayer.top
-rw-r--r-- 1 drozmano drozmano 2594772 May 21 14:37 bilayer.tpr
-rw-r--r-- 1 drozmano drozmano 7224622 May 21 14:37 confout.gro
-rw-r--r-- 1 drozmano drozmano    1772 May 21 14:37 ener.edr
drwxr-xr-x 1 drozmano drozmano     200 May 20 16:27 ff
-rw-r--r-- 1 drozmano drozmano     504 May 20 16:30 ff.top
-rwxr-xr-x 1 drozmano drozmano    1172 May 21 14:36 job.slurm
-rw-r--r-- 1 drozmano drozmano   82051 May 21 14:37 md.log
-rw-r--r-- 1 drozmano drozmano   10949 May 21 14:37 mdout.mdp
-rw-r--r-- 1 drozmano drozmano   13162 May 21 14:37 slurm-5570681.out
-rw-r--r-- 1 drozmano drozmano 2514832 May 21 14:37 state.cpt
-rw-r--r-- 1 drozmano drozmano 4309444 May 21 14:37 traj_comp.xtc
The new files here are:
- bilayer.tpr -- a binary input file that is generated by gmx grompp before the actual simulation.
- This file can be generated before submitting the job, if needed. Here it is created in the job script.
- confout.gro -- a configuration file containing the final atomic positions at the end of the simulation.
- ener.edr -- a file with energy data. Can be used for later analysis.
- md.log -- the main log file for the simulation.
- mdout.mdp -- a file containing all simulation parameters as used by GROMACS. Based on bilayer.mdp.
- state.cpt -- a binary checkpoint file containing the system state at the end of the simulation.
- This file can be used to continue simulations further.
- traj_comp.xtc -- a trajectory file containing atomic positions at some time points during the simulation.
- slurm-5570681.out -- the captured output that would otherwise be printed on screen during the simulation.
- Done by SLURM for you. The number in the name is the job ID of the job.
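The checkpoint file state.cpt in the list above is what mdrun reads when a run is continued, via its -cpi option. The sketch below only builds and prints the command line; mdrun_command is a hypothetical helper name and nothing is executed. If the run has already reached its final step, the number of steps has to be extended first (for example with gmx convert-tpr), which is not shown here.

```shell
#!/bin/bash
# Build the mdrun command line for a run input file ($1).
# If a checkpoint file ($2, default state.cpt) exists, continue from it with -cpi.
mdrun_command() {
    local tpr="$1" cpt="${2:-state.cpt}"
    if [ -e "$cpt" ]; then
        echo "gmx mdrun -v -s $tpr -cpi $cpt"
    else
        echo "gmx mdrun -v -s $tpr"
    fi
}

# Print the command only; it is up to the job script to run it.
mdrun_command bilayer.tpr
```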
If something is not working the way you expected, then the slurm-5570681.out file is the first place you should examine.
The md.log file is the next thing to look into.
If everything is as expected, then your computation is done and you can use the results. Success!
You may want to check the output file and the main log anyway:
# Press "q" to exit the text viewer.
$ less slurm-5570681.out
....
$ less md.log
....
The job script
If you have input files as in the example above, and you use your own computer, then to run the simulation you have to generate a binary input file for the mdrun GROMACS command first.
$ gmx grompp -v -f bilayer.mdp -c bilayer.gro -p bilayer.top -o bilayer.tpr
:-) GROMACS - gmx grompp, 2019.6 (-:
GROMACS is written by:
Emile Apol Rossen Apostolov Paul Bauer Herman J.C. Berendsen
.....
.....
Checking consistency between energy and charge groups...
Calculating fourier grid dimensions for X Y Z
Using a fourier grid of 112x112x52, spacing 0.116 0.116 0.112
Estimate for the relative computational load of the PME mesh part: 0.13
This run will generate roughly 14 Mb of data
writing run input file...
There were 2 notes

$ ls -l
-rw-r--r-- 1 drozmano drozmano 7224622 Jan 29  2014 bilayer.gro
-rw-r--r-- 1 drozmano drozmano    2567 May 21 12:14 bilayer.mdp
-rw-r--r-- 1 drozmano drozmano      87 Apr 29  2016 bilayer.top
-rw-r--r-- 1 drozmano drozmano 2594772 May 21 15:06 bilayer.tpr
drwxr-xr-x 1 drozmano drozmano     200 May 20 16:27 ff
-rw-r--r-- 1 drozmano drozmano     504 May 20 16:30 ff.top
At this point you can run the simulation like this:
$ gmx mdrun -v -s bilayer.tpr
:-) GROMACS - gmx mdrun, 2019.6 (-:
GROMACS is written by:
Emile Apol Rossen Apostolov Paul Bauer Herman J.C. Berendsen
......
... lots-of-output ...
......
Writing final coordinates.
step 10000, remaining wall clock time: 0 s
               Core t (s)   Wall t (s)        (%)
       Time:     1557.342       38.939     3999.4
                 (ns/day)    (hour/ns)
Performance:       22.191        1.082

GROMACS reminds you: "In ..."
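The ns/day figure on the "Performance:" line above is the number usually compared between runs. A minimal sketch for extracting it from saved output (for example md.log or the slurm-*.out file) follows; performance_ns_per_day is a hypothetical helper name, and it assumes the line has the two-column form shown above.

```shell
#!/bin/bash
# Print the ns/day value from text that contains a GROMACS "Performance:" line.
# Reads the files given as arguments, or standard input if none are given.
performance_ns_per_day() {
    awk '$1 == "Performance:" {print $2}' "$@"
}

# Example using the two lines from the output above.
printf '%s\n' "(ns/day) (hour/ns)" "Performance: 22.191 1.082" | performance_ns_per_day
```

With a real run this could be called as performance_ns_per_day md.log.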
To run this calculation on a cluster via the SLURM scheduling system you have to provide a job script that does two things:
- Provides steps that run the simulation, and
- Requests all necessary computational resources that are needed for this simulation.
The steps to run the simulation you already know:
- Load a desired GROMACS module to activate the software.
- Generate the binary input, the .tpr file.
- Run the mdrun command on it.
The computational resources for the run include:
- the number of CPUs,
- the amount of memory (RAM), and
- the time sufficient to complete the computation.
Below, several examples of job scripts are given.
These scripts are suitable for use on ARC depending on
- what part of the cluster the job should run on and
- the number of CPUs the job is going to use.
Single node (modern) job script
This script is for jobs that use up to one full modern node (2019 and later). These nodes are in the list of default partitions and have 40 CPUs each.
This specific example requests 40 CPUs on 1 node for 5 hours.
It also requests 16GB of RAM on the node.
The simulation runs as 1 process of 40 threads.
job-single.slurm:
#!/bin/bash
# ================================================================
#SBATCH --job-name=gro_test
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=40
#SBATCH --mem=16GB
#SBATCH --time=0-05:00:00
# ================================================================
module purge
module load gromacs/2019.6-skylake
# ================================================================
echo "Starting at `date`."
echo "========================="
# Input files.
MDP=bilayer.mdp
TOP=bilayer.top
GRO=bilayer.gro
# Binary input file to generate.
TPR=bilayer.tpr
# Preprocess the input files.
gmx grompp -v -f $MDP -c $GRO -p $TOP -o $TPR
# Check if preprocessing has gone well.
if ! test -e "$TPR"; then
    echo "ERROR: Could not create a TPR file for the run. Aborting."
    exit 1
fi
# Run the simulation.
# Export the thread count so that gmx can see it.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
gmx mdrun -v -s $TPR
echo "========================="
echo "Done at `date`."
# ================================================================
Multi node (modern) job script
This script is for jobs that use several modern nodes (2019 and later). These nodes are in the list of default partitions and have 40 CPUs each.
This specific example requests 80 CPUs on 2 nodes (40 CPUs each) for 5 hours.
It also requests 16GB of RAM on each node. The simulation runs as 80 one-threaded processes.
job-multi.slurm:
#!/bin/bash
# ================================================================
#SBATCH --job-name=gro_test
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
#SBATCH --cpus-per-task=1
#SBATCH --mem=16GB
#SBATCH --time=0-05:00:00
# ================================================================
module purge
module load gromacs/2019.6-skylake
# ================================================================
echo "Starting at `date`."
echo "========================="
# Input files.
MDP=bilayer.mdp
TOP=bilayer.top
GRO=bilayer.gro
# Binary input file to generate.
TPR=bilayer.tpr
# Preprocess the input files.
gmx grompp -v -f $MDP -c $GRO -p $TOP -o $TPR
# Check if preprocessing has gone well.
if ! test -e "$TPR"; then
    echo "ERROR: Could not create a TPR file for the run. Aborting."
    exit 1
fi
# Run the simulation.
# Export the thread count so that every MPI process can see it.
export OMP_NUM_THREADS=1
mpiexec gmx_mpi mdrun -v -s $TPR
echo "========================="
echo "Done at `date`."
# ================================================================
Single node (legacy) job script
This script is for jobs that use up to one full legacy node (parallel, cpu2013, lattice partitions). The partition has to be specified in the resource request.
This specific example requests 12 CPUs on 1 node in the parallel legacy partition for 12 hours.
It also requests 23GB of RAM on the node (all available memory on this kind of node).
The simulation runs as 1 process of 12 threads.
job-single-legacy.slurm:
#!/bin/bash
# ================================================================
#SBATCH --job-name=gro_test
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=12
#SBATCH --mem=23GB
#SBATCH --time=0-12:00:00
#SBATCH --partition=parallel
# ================================================================
module purge
module load gromacs/2019.6-legacy
# ================================================================
echo "Starting at `date`."
echo "========================="
# Input files.
MDP=bilayer.mdp
TOP=bilayer.top
GRO=bilayer.gro
# Binary input file to generate.
TPR=bilayer.tpr
# Preprocess the input files.
gmx grompp -v -f $MDP -c $GRO -p $TOP -o $TPR
# Check if preprocessing has gone well.
if ! test -e "$TPR"; then
    echo "ERROR: Could not create a TPR file for the run. Aborting."
    exit 1
fi
# Run the simulation.
# Export the thread count so that gmx can see it.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
gmx mdrun -v -s $TPR
echo "========================="
echo "Done at `date`."
# ================================================================
Multi node (legacy) job script
This script is for jobs that use several legacy nodes (parallel, cpu2013, lattice partitions). The partition has to be specified in the resource request.
This specific example requests 48 CPUs on 4 nodes in the parallel legacy partition (12 CPUs each) for 5 hours.
It also requests 23GB of RAM on each node. The simulation runs as 48 one-threaded processes.
job-multi-legacy.slurm:
#!/bin/bash
# ================================================================
#SBATCH --job-name=gro_test
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=12
#SBATCH --cpus-per-task=1
#SBATCH --mem=23GB
#SBATCH --time=0-05:00:00
#SBATCH --partition=parallel
# ================================================================
module purge
module load gromacs/2019.6-legacy
# ================================================================
echo "Starting at `date`."
echo "========================="
# Input files.
MDP=bilayer.mdp
TOP=bilayer.top
GRO=bilayer.gro
# Binary input file to generate.
TPR=bilayer.tpr
# Preprocess the input files.
gmx grompp -v -f $MDP -c $GRO -p $TOP -o $TPR
# Check if preprocessing have gone well.
if ! test -e $TPR; then
echo "ERROR: Could not create a TPR file for the run. Aborting."
exit
fi
# Run the simulation.
OMP_NUM_THREADS=1
mpiexec gmx_mpi mdrun -v -s $TPR
echo "========================="
echo "Done at `date`."
# ================================================================
Tip
Sometimes it is useful to run the preprocessor to generate the binary input before submitting a job. In the example above there were two notes about the setup, and you have to make sure that you are ready to continue with the simulation despite them.
Also, if there are problems with your input you will be able to see them right away without going through job submission.
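When the preprocessor is run by hand, its summary line can be checked automatically. The sketch below counts the notes in saved grompp output; count_notes is a hypothetical helper name, and it assumes the output was captured to a file or pipe (for example by redirecting both output streams) and that the summary has the "There were N notes" form shown earlier on this page.

```shell
#!/bin/bash
# Print the number of notes reported in saved grompp output.
# Reads the files given as arguments, or standard input if none are given.
# Prints 0 when no "There were N notes" line is present.
count_notes() {
    awk '/^There (was|were) [0-9]+ notes?/ {n = $3} END {print n + 0}' "$@"
}

# Example with the summary line from the grompp output on this page.
echo "There were 2 notes" | count_notes
```

A non-zero count is a prompt to read the notes themselves before submitting the job.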
Misc
Performance
Performance measurements for GROMACS 2019.6 on the 2019 compute nodes using different parallelization options and number of CPUs.
The bilayer system of ~105000 atoms was simulated for 100000 steps.
Skylake partitions
----------------------------------------------------------------------
#CPUs  #Nodes  Processes   Threads  Wall Time  Performance  Efficiency
                per node  per proc        (s)     (ns/day)         (%)
----------------------------------------------------------------------
    1       1          1         1     1031.6         0.84       100.0
   10       1          1        10      119.0         7.26        86.4
   20       1          1        20       65.9        13.11        78.0
   40       1          1        40       37.6        22.97        68.4
   10       1         10         1      126.4         6.83        81.3
   20       1         20         1       78.0        11.08        66.0
   40       1         40         1       45.3        19.07        56.8
   40       1         20         2       48.3        17.90        53.3
   40       1         10         4       43.9        19.68        58.6
   36       1          6         6       49.4        17.50        57.9
   80       2         10         4       30.5        28.34        42.2
   80       2         20         2       32.6        26.53        39.4
   80       2         40         1       30.3        28.49        42.4
  120       3         40         1       20.8        41.51        41.2
  160       4         40         1       19.2        44.95        33.4
----------------------------------------------------------------------
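The Efficiency column above is consistent with the usual definition of parallel efficiency: the measured performance divided by the performance of the 1-CPU run scaled by the number of CPUs. The sketch below reproduces the column under that assumption; the function name efficiency is hypothetical.

```shell
#!/bin/bash
# Parallel efficiency in percent:
#   100 * perf_n / (n_cpus * perf_1)
# $1 = performance on 1 CPU (ns/day), $2 = performance on n CPUs, $3 = n
efficiency() {
    awk -v p1="$1" -v pn="$2" -v n="$3" \
        'BEGIN { printf "%.1f\n", 100 * pn / (n * p1) }'
}

# The 40-CPU, single-process row of the table: prints 68.4
efficiency 0.84 22.97 40
```

The same call can be used to fill in efficiency values for your own measurements.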
Observations:
- If you want to run the job on a single node, then use 1 process with as many threads as the number of CPUs you request.
- If you need more than 1 node, then run 1-threaded MPI processes for each CPU you request.
- Going beyond 3 nodes may not be computationally efficient on ARC.
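The first two observations can be turned into a rule for choosing the SLURM layout options. The following sketch prints the options only; suggest_layout is a hypothetical helper name, and the rule simply restates the observations above for this benchmark, so other systems may behave differently.

```shell
#!/bin/bash
# Suggest SLURM layout options following the observations above.
#   $1 = number of nodes, $2 = number of CPUs to use on each node
suggest_layout() {
    local nodes="$1" cpus="$2"
    if [ "$nodes" -le 1 ]; then
        # Single node: one process with one thread per CPU.
        echo "--nodes=1 --ntasks-per-node=1 --cpus-per-task=$cpus"
    else
        # Several nodes: one single-threaded MPI process per CPU.
        echo "--nodes=$nodes --ntasks-per-node=$cpus --cpus-per-task=1"
    fi
}

suggest_layout 1 40
suggest_layout 2 40
```

The two example calls give the same layouts as the single node and multi node (modern) job scripts on this page.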
gpu-v100 partition:
-----------------------------------------------------------------------------
#CPUs  #GPUs  #Nodes  Processes   Threads  Wall Time  Performance  Efficiency
                       per node  per proc        (s)     (ns/day)         (%)
-----------------------------------------------------------------------------
    1      0       1          1         1     1031.6         0.84       100.0
    4      1       1          1         4       73.8        11.71
   10      1       1          1        10       39.4        21.92
   20      1       1          1        20      123.3         7.01          ??
   20      1       1          1        20       33.0        26.22          ??
   40      1       1          1        40       35.0        24.68
   40      2       1          1        40       34.4        25.10
   16      2       1          2         8      164.1         5.27
   40      2       1         10         4       37.2        23.23
   80      2       2         10         2       34.1        25.34
-----------------------------------------------------------------------------
Legacy partitions
The parallel partition:
-------------------------------------------------------------------------------
#CPUs  #Nodes  Processes  Threads   Wall Time  Performance  Efficiency
               per node   per proc        (s)     (ns/day)         (%)
-------------------------------------------------------------------------------
    1       1          1         1     2732.2         0.32       100.0
    6       1          1         6      518.4         1.67
   12       1          1        12      270.0         3.20
   12       1         12         1      283.8         3.04
   24       2         12         1      139.5         6.20
   36       3         12         1       93.5         9.25
   48       4         12         1       71.0        12.17
   72       6         12         1       55.9        15.46
   96       8         12         1       45.1        19.16
  120      10         12         1       40.8        21.16
  144      12         12         1       35.0        24.70
  196      16         12         1       31.1        27.75
-------------------------------------------------------------------------------
The lattice partition:
-------------------------------------------------------------------------------
#CPUs  #Nodes  Processes  Threads   Wall Time  Performance  Efficiency
               per node   per proc        (s)     (ns/day)         (%)
-------------------------------------------------------------------------------
    1       1          1         1     2732.2         0.32       100.0
    8       1          8         1
-------------------------------------------------------------------------------
Selected GROMACS commands
gmx
SYNOPSIS

gmx [-[no]h] [-[no]quiet] [-[no]version] [-[no]copyright] [-nice <int>]
    [-[no]backup]

OPTIONS

Other options:

 -[no]h          (no)   Print help and quit
 -[no]quiet      (no)   Do not print common startup info or quotes
 -[no]version    (no)   Print extended version information and quit
 -[no]copyright  (yes)  Print copyright information on startup
 -nice <int>     (19)   Set the nicelevel (default depends on command)
 -[no]backup     (yes)  Write backups if output files exist

Additional help is available on the following topics:
    commands    List of available commands
    selections  Selection syntax and usage
To access the help, use 'gmx help <topic>'.
For help on a command, use 'gmx help <command>'.
gmx grompp
Preprocess input files.
$ gmx help grompp

SYNOPSIS

gmx grompp [-f [<.mdp>]] [-c [<.gro/.g96/...>]] [-r [<.gro/.g96/...>]]
           [-rb [<.gro/.g96/...>]] [-n [<.ndx>]] [-p [<.top>]]
           [-t [<.trr/.cpt/...>]] [-e [<.edr>]] [-ref [<.trr/.cpt/...>]]
           [-po [<.mdp>]] [-pp [<.top>]] [-o [<.tpr>]] [-imd [<.gro>]]
           [-[no]v] [-time <real>] [-[no]rmvsbds] [-maxwarn <int>]
           [-[no]zero] [-[no]renum]

OPTIONS

Options to specify input files:

 -f [<.mdp>] (grompp.mdp)  grompp input file with MD parameters
 -c [<.gro/.g96/...>] (conf.gro)  Structure file: gro g96 pdb brk ent esp tpr
 -r [<.gro/.g96/...>] (restraint.gro) (Opt.)  Structure file: gro g96 pdb brk ent esp tpr
 -rb [<.gro/.g96/...>] (restraint.gro) (Opt.)  Structure file: gro g96 pdb brk ent esp tpr
 -n [<.ndx>] (index.ndx) (Opt.)  Index file
 -p [<.top>] (topol.top)  Topology file
 -t [<.trr/.cpt/...>] (traj.trr) (Opt.)  Full precision trajectory: trr cpt tng
 -e [<.edr>] (ener.edr) (Opt.)  Energy file

Options to specify input/output files:

 -ref [<.trr/.cpt/...>] (rotref.trr) (Opt.)  Full precision trajectory: trr cpt tng

Options to specify output files:

 -po [<.mdp>] (mdout.mdp)  grompp input file with MD parameters
 -pp [<.top>] (processed.top) (Opt.)  Topology file
 -o [<.tpr>] (topol.tpr)  Portable xdr run input file
 -imd [<.gro>] (imdgroup.gro) (Opt.)  Coordinate file in Gromos-87 format

Other options:

 -[no]v (no)  Be loud and noisy
 -time <real> (-1)  Take frame at or first after this time.
 -[no]rmvsbds (yes)  Remove constant bonded interactions with virtual sites
 -maxwarn <int> (0)  Number of allowed warnings during input processing.
     Not for normal use and may generate unstable systems
 -[no]zero (no)  Set parameters for bonded interactions without defaults to
     zero instead of generating an error
 -[no]renum (yes)  Renumber atomtypes and minimize number of atomtypes
gmx mdrun
gmx mdrun is the main computational chemistry engine within GROMACS. It performs Molecular Dynamics simulations, but it can also perform Stochastic Dynamics, Energy Minimization, test particle insertion or (re)calculation of energies. Normal mode analysis is another option.
SYNOPSIS

gmx mdrun [-s [<.tpr>]] [-cpi [<.cpt>]] [-table [<.xvg>]] [-tablep [<.xvg>]]
          [-tableb [<.xvg> [...]]] [-rerun [<.xtc/.trr/...>]] [-ei [<.edi>]]
          [-multidir [<dir> [...]]] [-awh [<.xvg>]] [-membed [<.dat>]]
          [-mp [<.top>]] [-mn [<.ndx>]] [-o [<.trr/.cpt/...>]]
          [-x [<.xtc/.tng>]] [-cpo [<.cpt>]] [-c [<.gro/.g96/...>]]
          [-e [<.edr>]] [-g [<.log>]] [-dhdl [<.xvg>]] [-field [<.xvg>]]
          [-tpi [<.xvg>]] [-tpid [<.xvg>]] [-eo [<.xvg>]] [-devout [<.xvg>]]
          [-runav [<.xvg>]] [-px [<.xvg>]] [-pf [<.xvg>]] [-ro [<.xvg>]]
          [-ra [<.log>]] [-rs [<.log>]] [-rt [<.log>]] [-mtx [<.mtx>]]
          [-if [<.xvg>]] [-swap [<.xvg>]] [-deffnm <string>] [-xvg <enum>]
          [-dd <vector>] [-ddorder <enum>] [-npme <int>] [-nt <int>]
          [-ntmpi <int>] [-ntomp <int>] [-ntomp_pme <int>] [-pin <enum>]
          [-pinoffset <int>] [-pinstride <int>] [-gpu_id <string>]
          [-gputasks <string>] [-[no]ddcheck] [-rdd <real>] [-rcon <real>]
          [-dlb <enum>] [-dds <real>] [-gcom <int>] [-nb <enum>]
          [-nstlist <int>] [-[no]tunepme] [-pme <enum>] [-pmefft <enum>]
          [-bonded <enum>] [-[no]v] [-pforce <real>] [-[no]reprod]
          [-cpt <real>] [-[no]cpnum] [-[no]append] [-nsteps <int>]
          [-maxh <real>] [-replex <int>] [-nex <int>] [-reseed <int>]

OPTIONS

Options to specify input files:

 -s [<.tpr>] (topol.tpr)  Portable xdr run input file
 -cpi [<.cpt>] (state.cpt) (Opt.)  Checkpoint file
 -table [<.xvg>] (table.xvg) (Opt.)  xvgr/xmgr file
 -tablep [<.xvg>] (tablep.xvg) (Opt.)  xvgr/xmgr file
 -tableb [<.xvg> [...]] (table.xvg) (Opt.)  xvgr/xmgr file
 -rerun [<.xtc/.trr/...>] (rerun.xtc) (Opt.)  Trajectory: xtc trr cpt gro g96 pdb tng
 -ei [<.edi>] (sam.edi) (Opt.)  ED sampling input
 -multidir [<dir> [...]] (rundir) (Opt.)  Run directory
 -awh [<.xvg>] (awhinit.xvg) (Opt.)  xvgr/xmgr file
 -membed [<.dat>] (membed.dat) (Opt.)  Generic data file
 -mp [<.top>] (membed.top) (Opt.)  Topology file
 -mn [<.ndx>] (membed.ndx) (Opt.)  Index file

Options to specify output files:

 -o [<.trr/.cpt/...>] (traj.trr)  Full precision trajectory: trr cpt tng
 -x [<.xtc/.tng>] (traj_comp.xtc) (Opt.)  Compressed trajectory (tng format or portable xdr format)
 -cpo [<.cpt>] (state.cpt) (Opt.)  Checkpoint file
 -c [<.gro/.g96/...>] (confout.gro)  Structure file: gro g96 pdb brk ent esp
 -e [<.edr>] (ener.edr)  Energy file
 -g [<.log>] (md.log)  Log file
 -dhdl [<.xvg>] (dhdl.xvg) (Opt.)  xvgr/xmgr file
 -field [<.xvg>] (field.xvg) (Opt.)  xvgr/xmgr file
 -tpi [<.xvg>] (tpi.xvg) (Opt.)  xvgr/xmgr file
 -tpid [<.xvg>] (tpidist.xvg) (Opt.)  xvgr/xmgr file
 -eo [<.xvg>] (edsam.xvg) (Opt.)  xvgr/xmgr file
 -devout [<.xvg>] (deviatie.xvg) (Opt.)  xvgr/xmgr file
 -runav [<.xvg>] (runaver.xvg) (Opt.)  xvgr/xmgr file
 -px [<.xvg>] (pullx.xvg) (Opt.)  xvgr/xmgr file
 -pf [<.xvg>] (pullf.xvg) (Opt.)  xvgr/xmgr file
 -ro [<.xvg>] (rotation.xvg) (Opt.)  xvgr/xmgr file
 -ra [<.log>] (rotangles.log) (Opt.)  Log file
 -rs [<.log>] (rotslabs.log) (Opt.)  Log file
 -rt [<.log>] (rottorque.log) (Opt.)  Log file
 -mtx [<.mtx>] (nm.mtx) (Opt.)  Hessian matrix
 -if [<.xvg>] (imdforces.xvg) (Opt.)  xvgr/xmgr file
 -swap [<.xvg>] (swapions.xvg) (Opt.)  xvgr/xmgr file

Other options:

 -deffnm <string>  Set the default filename for all file options
 -xvg <enum> (xmgrace)  xvg plot formatting: xmgrace, xmgr, none
 -dd <vector> (0 0 0)  Domain decomposition grid, 0 is optimize
 -ddorder <enum> (interleave)  DD rank order: interleave, pp_pme, cartesian
 -npme <int> (-1)  Number of separate ranks to be used for PME, -1 is guess
 -nt <int> (0)  Total number of threads to start (0 is guess)
 -ntmpi <int> (0)  Number of thread-MPI ranks to start (0 is guess)
 -ntomp <int> (0)  Number of OpenMP threads per MPI rank to start (0 is guess)
 -ntomp_pme <int> (0)  Number of OpenMP threads per MPI rank to start (0 is -ntomp)
 -pin <enum> (auto)  Whether mdrun should try to set thread affinities: auto, on, off
 -pinoffset <int> (0)  The lowest logical core number to which mdrun should
     pin the first thread
 -pinstride <int> (0)  Pinning distance in logical cores for threads, use 0
     to minimize the number of threads per physical core
 -gpu_id <string>  List of unique GPU device IDs available to use
 -gputasks <string>  List of GPU device IDs, mapping each PP task on each
     node to a device
 -[no]ddcheck (yes)  Check for all bonded interactions with DD
 -rdd <real> (0)  The maximum distance for bonded interactions with DD (nm),
     0 is determine from initial coordinates
 -rcon <real> (0)  Maximum distance for P-LINCS (nm), 0 is estimate
 -dlb <enum> (auto)  Dynamic load balancing (with DD): auto, no, yes
 -dds <real> (0.8)  Fraction in (0,1) by whose reciprocal the initial DD cell
     size will be increased in order to provide a margin in which dynamic
     load balancing can act while preserving the minimum cell size.
 -gcom <int> (-1)  Global communication frequency
 -nb <enum> (auto)  Calculate non-bonded interactions on: auto, cpu, gpu
 -nstlist <int> (0)  Set nstlist when using a Verlet buffer tolerance (0 is guess)
 -[no]tunepme (yes)  Optimize PME load between PP/PME ranks or GPU/CPU
     (only with the Verlet cut-off scheme)
 -pme <enum> (auto)  Perform PME calculations on: auto, cpu, gpu
 -pmefft <enum> (auto)  Perform PME FFT calculations on: auto, cpu, gpu
 -bonded <enum> (auto)  Perform bonded calculations on: auto, cpu, gpu
 -[no]v (no)  Be loud and noisy
 -pforce <real> (-1)  Print all forces larger than this (kJ/mol nm)
 -[no]reprod (no)  Try to avoid optimizations that affect binary reproducibility
 -cpt <real> (15)  Checkpoint interval (minutes)
 -[no]cpnum (no)  Keep and number checkpoint files
 -[no]append (yes)  Append to previous output files when continuing from
     checkpoint instead of adding the simulation part number to all file names
 -nsteps <int> (-2)  Run this number of steps (-1 means infinite, -2 means
     use mdp option, smaller is invalid)
 -maxh <real> (-1)  Terminate after 0.99 times this time (hours)
 -replex <int> (0)  Attempt replica exchange periodically with this period (steps)
 -nex <int> (0)  Number of random exchanges to carry out each exchange
     interval (N^3 is one suggestion). -nex zero or not specified gives
     neighbor replica exchange.
 -reseed <int> (-1)  Seed for replica exchange, -1 is generate a seed
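As an illustration of how a few of these options combine, the sketch below assembles an mdrun command line that stops shortly before a time limit and continues from a checkpoint when one is present. It only prints the command. The function name mdrun_command, the file names and the 23.5 hour limit are assumptions made for this example; SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested.

```shell
#!/bin/bash
# Build an mdrun command line that continues from a checkpoint if it exists.
# Usage: mdrun_command <tpr_file> <checkpoint_file> <max_hours> <omp_threads>
mdrun_command() {
    local tpr="$1" cpt="$2" maxh="$3" threads="$4"
    # -cpo: checkpoint to write, -maxh: wall-time limit in hours,
    # -ntomp: OpenMP threads per rank (all listed in the help text above).
    local cmd="gmx mdrun -s ${tpr} -cpo ${cpt} -maxh ${maxh} -ntomp ${threads}"
    if [ -f "$cpt" ]; then
        # A checkpoint from a previous run exists: read it to continue.
        cmd="${cmd} -cpi ${cpt}"
    fi
    echo "$cmd"
}

# Example: placeholder file names, thread count taken from Slurm if set.
mdrun_command topol.tpr state.cpt 23.5 "${SLURM_CPUS_PER_TASK:-1}"
```

Choosing a -maxh value slightly below the requested job time leaves mdrun room to write its final checkpoint, so that the same job script can be resubmitted to continue the simulation.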
Support
Please send any questions regarding using GROMACS on ARC to support@hpc.ucalgary.ca.