GROMACS
General
- GROMACS home site: http://www.gromacs.org/
GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.
Using GROMACS on ARC
Researchers using GROMACS on ARC are expected to be generally familiar with its capabilities, its input file types and their formats, and the use of checkpoint files to restart simulations.
Like other calculations on ARC systems, GROMACS is run by submitting an appropriate script for batch scheduling using the sbatch command. For more information about submitting jobs, see the Running jobs article.
GROMACS modules
Currently there are several software modules on ARC that provide different versions of GROMACS. They differ in release date and in the CPU architecture the software was compiled for.
You can list them with the module command:
$ module avail gromacs

----------- /global/software/Modules/3.2.10/modulefiles ---------
gromacs/2016.3-gnu
gromacs/2018.0-gnu
gromacs/2019.6-nehalem
gromacs/2019.6-skylake
gromacs/5.0.7-gnu
The names of the modules give hints on the specific version of GROMACS they provide access to.
- The gnu suffix indicates that those versions have been compiled with the GNU GCC compiler.
  - In these specific cases, GCC 4.8.5.
- GROMACS 2019.6 was compiled using GCC 7.3.0 for two different CPU kinds: the older kind, nehalem, and the newer kind, skylake.
  - The nehalem module should be used on compute nodes from before 2019, and the skylake module is for nodes from 2019 and later.
- All of the GROMACS modules are built with support for GPU computations, although running on GPU nodes may not be practical because those resources are limited.
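The nehalem/skylake rule above could also be applied automatically inside a job. The sketch below is one possible approach, not an official ARC method: it assumes that the 2019 and newer nodes report the avx512f CPU flag in /proc/cpuinfo and that the older nodes do not. The function name pick_gromacs_module is made up for this illustration.

```shell
# Hypothetical helper: choose a GROMACS 2019.6 module from a string of CPU flags.
# Assumption: nodes from 2019 onward report "avx512f"; older nodes do not.
pick_gromacs_module() {
    case " $1 " in
        *" avx512f "*) echo "gromacs/2019.6-skylake" ;;
        *)             echo "gromacs/2019.6-nehalem" ;;
    esac
}

# On a Linux node, read the flags of the first CPU and suggest a module.
if [ -r /proc/cpuinfo ]; then
    cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    echo "Suggested module: $(pick_gromacs_module "$cpu_flags")"
fi
```

The check is only meaningful when it runs on the compute node itself, since a login node may have a different CPU. If the assumption about avx512f does not hold on your nodes, pick the module by hand as described above.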
A module has to be loaded before GROMACS can be used on ARC, for example:
$ gmx --version
bash: gmx: command not found...

$ module load gromacs/2019.6-nehalem

$ gmx --version
                     :-) GROMACS - gmx, 2019.6 (-:

GROMACS is written by:
Emile Apol    Rossen Apostolov    Paul Bauer    Herman J.C. Berendsen
.....
.....

GROMACS version:    2019.6
Precision:          single
Memory model:       64 bit
MPI library:        none
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support:        CUDA
SIMD instructions:  SSE4.1
FFT library:        fftw-3.3.7-sse2-avx
RDTSCP usage:       enabled
TNG support:        enabled
Hwloc support:      hwloc-1.11.8
Tracing support:    disabled
C compiler:         /global/software/gcc/gcc-7.3.0/bin/gcc GNU 7.3.0
C compiler flags:   -msse4.1 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler:       /global/software/gcc/gcc-7.3.0/bin/g++ GNU 7.3.0
C++ compiler flags: -msse4.1 -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler:      /global/software/cuda/cuda-10.0.130/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2018 NVIDIA Corporation;Built on Sat_Aug_25_21:08:01_CDT_2018;Cuda compilation tools, release 10.0, V10.0.130
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;; ;-msse4.1;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver:        9.10
CUDA runtime:       N/A
Running a GROMACS Job
To run your simulation on the ARC cluster you need (1) a set of GROMACS input files and (2) a SLURM job script (a .slurm file).
Place your input files for a simulation into a separate directory and prepare an appropriate job script for it.
$ ls -l
-rw-r--r-- 1 drozmano drozmano 7224622 Jan 29  2014 bilayer.gro
-rw-r--r-- 1 drozmano drozmano    2567 May 21 12:14 bilayer.mdp
-rw-r--r-- 1 drozmano drozmano      87 Apr 29  2016 bilayer.top
drwxr-xr-x 1 drozmano drozmano     200 May 20 16:27 ff
-rw-r--r-- 1 drozmano drozmano     504 May 20 16:30 ff.top
-rwxr-xr-x 1 drozmano drozmano    1171 May 21 13:45 job.slurm
Here:
- bilayer.top -- contains the topology of the system.
- bilayer.gro -- contains the initial configuration (positions of atoms) of the system.
- bilayer.mdp -- contains parameters of the simulation run (GROMACS settings).
- ff -- a directory containing an external custom force field (models). Not required if only standard models are used.
- ff.top -- a topology file that includes required models from available force fields.
  - This file is included by the bilayer.top file.
- job.slurm -- a SLURM job script that is used to submit this calculation to the cluster.
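The contents of job.slurm are not shown on this page. As a rough guide only, a script consistent with the squeue output later in this section (one node, one task, 40 CPUs, 8 GB of memory, a 5 hour limit) might look like the sketch below. The resource values, the choice of the skylake module, and the file name job_sketch.slurm are assumptions made for illustration; the script actually used for this example may differ.

```shell
# Write an example job script to job_sketch.slurm (the name is chosen so that
# an existing job.slurm is not overwritten). All values are illustrative.
cat > job_sketch.slurm <<'EOF'
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=40
#SBATCH --mem=8G
#SBATCH --time=05:00:00

# Assumption: the job runs on a node from 2019 or later.
module load gromacs/2019.6-skylake

# Pre-process the inputs into a binary run input file.
gmx grompp -f bilayer.mdp -c bilayer.gro -p bilayer.top -o bilayer.tpr

# Run the simulation with one thread-MPI rank and one OpenMP thread per CPU.
gmx mdrun -s bilayer.tpr -ntmpi 1 -ntomp "$SLURM_CPUS_PER_TASK"
EOF
```

With only -s given, gmx mdrun writes its output under the default file names (md.log, ener.edr, state.cpt, confout.gro, traj_comp.xtc).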
At this point you can submit your job to ARC's scheduler (SLURM).
$ sbatch job.slurm
Submitted batch job 5570681
You can check the status of the job using the job ID from the confirmation message above.
$ squeue -j 5570681
JOBID    USER      STATE    TIME_LIMIT  TIME  NODES  TASKS  CPUS  MIN_MEMORY  NODELIST
5570681  drozmano  RUNNING  5:00:00     0:15  1      1      40    8G          fc1
The squeue command output may look different in your case, depending on your settings.
After the job is over and no longer appears in the squeue output, you can check the results.
$ ls -l
-rw-r--r-- 1 drozmano drozmano 7224622 Jan 29  2014 bilayer.gro
-rw-r--r-- 1 drozmano drozmano    2567 May 21 12:14 bilayer.mdp
-rw-r--r-- 1 drozmano drozmano      87 Apr 29  2016 bilayer.top
-rw-r--r-- 1 drozmano drozmano 2594772 May 21 14:37 bilayer.tpr
-rw-r--r-- 1 drozmano drozmano 7224622 May 21 14:37 confout.gro
-rw-r--r-- 1 drozmano drozmano    1772 May 21 14:37 ener.edr
drwxr-xr-x 1 drozmano drozmano     200 May 20 16:27 ff
-rw-r--r-- 1 drozmano drozmano     504 May 20 16:30 ff.top
-rwxr-xr-x 1 drozmano drozmano    1172 May 21 14:36 job.slurm
-rw-r--r-- 1 drozmano drozmano   82051 May 21 14:37 md.log
-rw-r--r-- 1 drozmano drozmano   10949 May 21 14:37 mdout.mdp
-rw-r--r-- 1 drozmano drozmano   13162 May 21 14:37 slurm-5570681.out
-rw-r--r-- 1 drozmano drozmano 2514832 May 21 14:37 state.cpt
-rw-r--r-- 1 drozmano drozmano 4309444 May 21 14:37 traj_comp.xtc
The new files here are:
- bilayer.tpr -- a binary input file that is generated by the gmx grompp command before the actual simulation.
  - This file can be generated before submitting the job, if needed. Here it is created in the job script.
- confout.gro -- a configuration file containing the final atomic positions at the end of the simulation.
- ener.edr -- a file with energy data. Can be used for later analysis.
- md.log -- the main log file for the simulation.
- mdout.mdp -- a file containing all simulation parameters as used by GROMACS. Based on bilayer.mdp.
- state.cpt -- a binary checkpoint file containing the system state at the end of the simulation.
  - This file can be used to continue simulations further.
- traj_comp.xtc -- a trajectory file containing atomic positions at some time points during the simulation.
- slurm-5570681.out -- the captured output that would otherwise be printed on screen during the simulation.
  - Done by SLURM for you. The number in the name is the job ID of the job.
If something is not working the way you expected, the slurm-5570681.out file is the first place you should examine. The md.log file is the next thing to look into.
If everything is as expected, then your computation is done and you can use the results. Success!
You may want to check the output file and the main log anyway:
# Press "q" to exit the text viewer.
$ less slurm-5570681.out
$ less md.log
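The checkpoint file state.cpt described above can be used to continue a run, for example one that was stopped by the job time limit before reaching the number of steps set in bilayer.mdp. The following is a minimal sketch using the -s and -cpi options of gmx mdrun. The wrapper function continue_run is hypothetical and only adds a check that the checkpoint file exists.

```shell
# Hypothetical wrapper: continue a simulation from a checkpoint file.
# Arguments: run input file (.tpr), checkpoint file (defaults to state.cpt).
continue_run() {
    tpr="$1"
    cpt="${2:-state.cpt}"
    if [ ! -f "$cpt" ]; then
        echo "Checkpoint file '$cpt' not found; nothing to continue." >&2
        return 1
    fi
    # By default mdrun appends to the existing output files (-append).
    gmx mdrun -s "$tpr" -cpi "$cpt"
}

# Example call, to be placed in a job script after loading a GROMACS module:
# continue_run bilayer.tpr state.cpt
```

A run that has already completed all of its steps will not advance further this way; extending such a run requires changing the run input first.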
Misc
Performance
Performance measurements for GROMACS 2019.6 on the 2019 compute nodes using different parallelization options and number of CPUs.
The bilayer512 system of ~105000 atoms was simulated for 100000 steps.
----------------------------------------------------------------------
#CPUs   Nodes   Processes   Threads     Wall Time   Performance
                per node    per proc       (s)       (ns/day)
----------------------------------------------------------------------
    1       1           1          1      1031.6         0.84
   10       1           1         10       119.0         7.26
   20       1           1         20        65.9        13.11
   40       1           1         40        37.6        22.97

   10       1          10          1       126.4         6.83
   20       1          20          1        78.0        11.08
   40       1          40          1        45.3        19.07

   40       1          20          2        48.3        17.90
   40       1          10          4        43.9        19.68
   36       1           6          6        49.4        17.50

   80       2          10          4        30.5        28.34
   80       2          20          2        32.6        26.53
   80       2          40          1        30.3        28.49

  120       3          40          1        20.8        41.51
  160       4          40          1        19.2        44.95
----------------------------------------------------------------------
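One way to read the table is in terms of parallel efficiency, that is, the measured performance divided by what perfect scaling from the 1-CPU run would give. The helper below is a small illustrative sketch (the function name efficiency is made up here); it only does arithmetic on numbers taken from the table.

```shell
# Parallel efficiency (%) relative to a baseline run.
# Arguments: performance (ns/day), number of CPUs, 1-CPU performance (ns/day).
efficiency() {
    awk -v perf="$1" -v ncpu="$2" -v base="$3" \
        'BEGIN { printf "%.1f\n", 100 * perf / (base * ncpu) }'
}

efficiency 22.97 40 0.84     # 1 node, 1 process, 40 threads
efficiency 44.95 160 0.84    # 4 nodes, 40 processes per node
```

For this system the single-node 40-thread run comes out at about 68% efficiency and the 4-node run at about 33%, so adding nodes shortens the wall time but uses the CPUs less effectively.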
Selected GROMACS commands
gmx
SYNOPSIS

gmx [-[no]h] [-[no]quiet] [-[no]version] [-[no]copyright] [-nice <int>]
    [-[no]backup]

OPTIONS

Other options:

 -[no]h (no)
     Print help and quit
 -[no]quiet (no)
     Do not print common startup info or quotes
 -[no]version (no)
     Print extended version information and quit
 -[no]copyright (yes)
     Print copyright information on startup
 -nice <int> (19)
     Set the nicelevel (default depends on command)
 -[no]backup (yes)
     Write backups if output files exist

Additional help is available on the following topics:
    commands    List of available commands
    selections  Selection syntax and usage
To access the help, use 'gmx help <topic>'.
For help on a command, use 'gmx help <command>'.
gmx grompp
Preprocess input files.
$ gmx help grompp

SYNOPSIS

gmx grompp [-f [<.mdp>]] [-c [<.gro/.g96/...>]] [-r [<.gro/.g96/...>]]
           [-rb [<.gro/.g96/...>]] [-n [<.ndx>]] [-p [<.top>]]
           [-t [<.trr/.cpt/...>]] [-e [<.edr>]] [-ref [<.trr/.cpt/...>]]
           [-po [<.mdp>]] [-pp [<.top>]] [-o [<.tpr>]] [-imd [<.gro>]]
           [-[no]v] [-time <real>] [-[no]rmvsbds] [-maxwarn <int>]
           [-[no]zero] [-[no]renum]

OPTIONS

Options to specify input files:

 -f [<.mdp>] (grompp.mdp)
     grompp input file with MD parameters
 -c [<.gro/.g96/...>] (conf.gro)
     Structure file: gro g96 pdb brk ent esp tpr
 -r [<.gro/.g96/...>] (restraint.gro) (Opt.)
     Structure file: gro g96 pdb brk ent esp tpr
 -rb [<.gro/.g96/...>] (restraint.gro) (Opt.)
     Structure file: gro g96 pdb brk ent esp tpr
 -n [<.ndx>] (index.ndx) (Opt.)
     Index file
 -p [<.top>] (topol.top)
     Topology file
 -t [<.trr/.cpt/...>] (traj.trr) (Opt.)
     Full precision trajectory: trr cpt tng
 -e [<.edr>] (ener.edr) (Opt.)
     Energy file

Options to specify input/output files:

 -ref [<.trr/.cpt/...>] (rotref.trr) (Opt.)
     Full precision trajectory: trr cpt tng

Options to specify output files:

 -po [<.mdp>] (mdout.mdp)
     grompp input file with MD parameters
 -pp [<.top>] (processed.top) (Opt.)
     Topology file
 -o [<.tpr>] (topol.tpr)
     Portable xdr run input file
 -imd [<.gro>] (imdgroup.gro) (Opt.)
     Coordinate file in Gromos-87 format

Other options:

 -[no]v (no)
     Be loud and noisy
 -time <real> (-1)
     Take frame at or first after this time.
 -[no]rmvsbds (yes)
     Remove constant bonded interactions with virtual sites
 -maxwarn <int> (0)
     Number of allowed warnings during input processing. Not for normal
     use and may generate unstable systems
 -[no]zero (no)
     Set parameters for bonded interactions without defaults to zero
     instead of generating an error
 -[no]renum (yes)
     Renumber atomtypes and minimize number of atomtypes
gmx mdrun
gmx mdrun is the main computational chemistry engine within GROMACS. It performs Molecular Dynamics simulations, but it can also perform Stochastic Dynamics, Energy Minimization, test particle insertion or (re)calculation of energies. Normal mode analysis is another option.
SYNOPSIS

gmx mdrun [-s [<.tpr>]] [-cpi [<.cpt>]] [-table [<.xvg>]]
          [-tablep [<.xvg>]] [-tableb [<.xvg> [...]]]
          [-rerun [<.xtc/.trr/...>]] [-ei [<.edi>]]
          [-multidir [<dir> [...]]] [-awh [<.xvg>]] [-membed [<.dat>]]
          [-mp [<.top>]] [-mn [<.ndx>]] [-o [<.trr/.cpt/...>]]
          [-x [<.xtc/.tng>]] [-cpo [<.cpt>]] [-c [<.gro/.g96/...>]]
          [-e [<.edr>]] [-g [<.log>]] [-dhdl [<.xvg>]] [-field [<.xvg>]]
          [-tpi [<.xvg>]] [-tpid [<.xvg>]] [-eo [<.xvg>]]
          [-devout [<.xvg>]] [-runav [<.xvg>]] [-px [<.xvg>]]
          [-pf [<.xvg>]] [-ro [<.xvg>]] [-ra [<.log>]] [-rs [<.log>]]
          [-rt [<.log>]] [-mtx [<.mtx>]] [-if [<.xvg>]] [-swap [<.xvg>]]
          [-deffnm <string>] [-xvg <enum>] [-dd <vector>]
          [-ddorder <enum>] [-npme <int>] [-nt <int>] [-ntmpi <int>]
          [-ntomp <int>] [-ntomp_pme <int>] [-pin <enum>]
          [-pinoffset <int>] [-pinstride <int>] [-gpu_id <string>]
          [-gputasks <string>] [-[no]ddcheck] [-rdd <real>] [-rcon <real>]
          [-dlb <enum>] [-dds <real>] [-gcom <int>] [-nb <enum>]
          [-nstlist <int>] [-[no]tunepme] [-pme <enum>] [-pmefft <enum>]
          [-bonded <enum>] [-[no]v] [-pforce <real>] [-[no]reprod]
          [-cpt <real>] [-[no]cpnum] [-[no]append] [-nsteps <int>]
          [-maxh <real>] [-replex <int>] [-nex <int>] [-reseed <int>]

OPTIONS

Options to specify input files:

 -s [<.tpr>] (topol.tpr)
     Portable xdr run input file
 -cpi [<.cpt>] (state.cpt) (Opt.)
     Checkpoint file
 -table [<.xvg>] (table.xvg) (Opt.)
     xvgr/xmgr file
 -tablep [<.xvg>] (tablep.xvg) (Opt.)
     xvgr/xmgr file
 -tableb [<.xvg> [...]] (table.xvg) (Opt.)
     xvgr/xmgr file
 -rerun [<.xtc/.trr/...>] (rerun.xtc) (Opt.)
     Trajectory: xtc trr cpt gro g96 pdb tng
 -ei [<.edi>] (sam.edi) (Opt.)
     ED sampling input
 -multidir [<dir> [...]] (rundir) (Opt.)
     Run directory
 -awh [<.xvg>] (awhinit.xvg) (Opt.)
     xvgr/xmgr file
 -membed [<.dat>] (membed.dat) (Opt.)
     Generic data file
 -mp [<.top>] (membed.top) (Opt.)
     Topology file
 -mn [<.ndx>] (membed.ndx) (Opt.)
     Index file

Options to specify output files:

 -o [<.trr/.cpt/...>] (traj.trr)
     Full precision trajectory: trr cpt tng
 -x [<.xtc/.tng>] (traj_comp.xtc) (Opt.)
     Compressed trajectory (tng format or portable xdr format)
 -cpo [<.cpt>] (state.cpt) (Opt.)
     Checkpoint file
 -c [<.gro/.g96/...>] (confout.gro)
     Structure file: gro g96 pdb brk ent esp
 -e [<.edr>] (ener.edr)
     Energy file
 -g [<.log>] (md.log)
     Log file
 -dhdl [<.xvg>] (dhdl.xvg) (Opt.)
     xvgr/xmgr file
 -field [<.xvg>] (field.xvg) (Opt.)
     xvgr/xmgr file
 -tpi [<.xvg>] (tpi.xvg) (Opt.)
     xvgr/xmgr file
 -tpid [<.xvg>] (tpidist.xvg) (Opt.)
     xvgr/xmgr file
 -eo [<.xvg>] (edsam.xvg) (Opt.)
     xvgr/xmgr file
 -devout [<.xvg>] (deviatie.xvg) (Opt.)
     xvgr/xmgr file
 -runav [<.xvg>] (runaver.xvg) (Opt.)
     xvgr/xmgr file
 -px [<.xvg>] (pullx.xvg) (Opt.)
     xvgr/xmgr file
 -pf [<.xvg>] (pullf.xvg) (Opt.)
     xvgr/xmgr file
 -ro [<.xvg>] (rotation.xvg) (Opt.)
     xvgr/xmgr file
 -ra [<.log>] (rotangles.log) (Opt.)
     Log file
 -rs [<.log>] (rotslabs.log) (Opt.)
     Log file
 -rt [<.log>] (rottorque.log) (Opt.)
     Log file
 -mtx [<.mtx>] (nm.mtx) (Opt.)
     Hessian matrix
 -if [<.xvg>] (imdforces.xvg) (Opt.)
     xvgr/xmgr file
 -swap [<.xvg>] (swapions.xvg) (Opt.)
     xvgr/xmgr file

Other options:

 -deffnm <string>
     Set the default filename for all file options
 -xvg <enum> (xmgrace)
     xvg plot formatting: xmgrace, xmgr, none
 -dd <vector> (0 0 0)
     Domain decomposition grid, 0 is optimize
 -ddorder <enum> (interleave)
     DD rank order: interleave, pp_pme, cartesian
 -npme <int> (-1)
     Number of separate ranks to be used for PME, -1 is guess
 -nt <int> (0)
     Total number of threads to start (0 is guess)
 -ntmpi <int> (0)
     Number of thread-MPI ranks to start (0 is guess)
 -ntomp <int> (0)
     Number of OpenMP threads per MPI rank to start (0 is guess)
 -ntomp_pme <int> (0)
     Number of OpenMP threads per MPI rank to start (0 is -ntomp)
 -pin <enum> (auto)
     Whether mdrun should try to set thread affinities: auto, on, off
 -pinoffset <int> (0)
     The lowest logical core number to which mdrun should pin the first
     thread
 -pinstride <int> (0)
     Pinning distance in logical cores for threads, use 0 to minimize the
     number of threads per physical core
 -gpu_id <string>
     List of unique GPU device IDs available to use
 -gputasks <string>
     List of GPU device IDs, mapping each PP task on each node to a device
 -[no]ddcheck (yes)
     Check for all bonded interactions with DD
 -rdd <real> (0)
     The maximum distance for bonded interactions with DD (nm), 0 is
     determine from initial coordinates
 -rcon <real> (0)
     Maximum distance for P-LINCS (nm), 0 is estimate
 -dlb <enum> (auto)
     Dynamic load balancing (with DD): auto, no, yes
 -dds <real> (0.8)
     Fraction in (0,1) by whose reciprocal the initial DD cell size will
     be increased in order to provide a margin in which dynamic load
     balancing can act while preserving the minimum cell size.
 -gcom <int> (-1)
     Global communication frequency
 -nb <enum> (auto)
     Calculate non-bonded interactions on: auto, cpu, gpu
 -nstlist <int> (0)
     Set nstlist when using a Verlet buffer tolerance (0 is guess)
 -[no]tunepme (yes)
     Optimize PME load between PP/PME ranks or GPU/CPU (only with the
     Verlet cut-off scheme)
 -pme <enum> (auto)
     Perform PME calculations on: auto, cpu, gpu
 -pmefft <enum> (auto)
     Perform PME FFT calculations on: auto, cpu, gpu
 -bonded <enum> (auto)
     Perform bonded calculations on: auto, cpu, gpu
 -[no]v (no)
     Be loud and noisy
 -pforce <real> (-1)
     Print all forces larger than this (kJ/mol nm)
 -[no]reprod (no)
     Try to avoid optimizations that affect binary reproducibility
 -cpt <real> (15)
     Checkpoint interval (minutes)
 -[no]cpnum (no)
     Keep and number checkpoint files
 -[no]append (yes)
     Append to previous output files when continuing from checkpoint
     instead of adding the simulation part number to all file names
 -nsteps <int> (-2)
     Run this number of steps (-1 means infinite, -2 means use mdp option,
     smaller is invalid)
 -maxh <real> (-1)
     Terminate after 0.99 times this time (hours)
 -replex <int> (0)
     Attempt replica exchange periodically with this period (steps)
 -nex <int> (0)
     Number of random exchanges to carry out each exchange interval (N^3
     is one suggestion). -nex zero or not specified gives neighbor replica
     exchange.
 -reseed <int> (-1)
     Seed for replica exchange, -1 is generate a seed
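The -maxh option listed above takes a time in hours, while SLURM time limits are usually written as H:MM:SS. The sketch below shows one way to convert between the two so that mdrun stops, and writes its final checkpoint, shortly before the job limit. The helper name hms_to_hours is hypothetical, and the code assumes a limit with exactly three colon-separated fields (no days part).

```shell
# Hypothetical helper: convert an H:MM:SS time limit to hours for -maxh.
# Assumption: exactly three colon-separated fields, e.g. 5:00:00.
hms_to_hours() {
    echo "$1" | awk -F: '{ printf "%.2f\n", $1 + $2 / 60 + $3 / 3600 }'
}

# Build (and here only print) an mdrun command line for a 5 hour job.
maxh=$(hms_to_hours "5:00:00")
echo "gmx mdrun -s bilayer.tpr -maxh $maxh"
```

Since other steps in the job (module loading, gmx grompp) also count against the limit, passing a somewhat smaller value than the full job time may be safer.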