GROMACS
General
- GROMACS home site: http://www.gromacs.org/
GROMACS is a versatile package to perform molecular dynamics,
i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.
Using GROMACS on ARC
Researchers using GROMACS on ARC are expected to be generally familiar with its capabilities, input file types and their formats, and the use of checkpoint files to restart simulations.
Like other calculations on ARC systems, GROMACS is run by submitting an appropriate script for batch scheduling using the sbatch command. For more information about submitting jobs, see the Running jobs article.
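For example, assuming a job script named gromacs-job.slurm (a placeholder name; preparing such a script is described below), the job is submitted and monitored with the standard SLURM commands:
<pre>
$ sbatch gromacs-job.slurm      # submit the job to the scheduler
$ squeue -u $USER               # check the state of your jobs in the queue
</pre>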
GROMACS modules
Currently there are several software modules on ARC that provide different versions of GROMACS. The versions differ in their release date as well as in the CPU architecture the software is compiled for.
You can see them using the module command:
<pre>
$ module avail gromacs
----------- /global/software/Modules/3.2.10/modulefiles ---------
gromacs/2016.3-gnu       gromacs/2018.0-gnu       gromacs/2019.6-nehalem
gromacs/2019.6-skylake   gromacs/5.0.7-gnu
</pre>
The module names indicate the specific version of GROMACS they provide access to.
- The gnu suffix indicates that those versions were compiled with the GNU GCC compiler (in these specific cases, GCC 4.8.5).
- GROMACS 2019.6 was compiled using GCC 7.3.0 for two different CPU generations: the older nehalem and the newer skylake. The nehalem module should be used on compute nodes older than 2019, and the skylake module on nodes from 2019 and up; a way to pick the matching module at run time is sketched after this list.
- All GROMACS versions provided by these modules have support for GPU computations, although running on the GPU nodes may not be practical due to limited GPU resources.
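If a job may land on either node generation, the matching module can be selected at run time. The snippet below is only a sketch, not an official ARC recipe; it keys off the avx512f CPU flag, which the Skylake processors report in /proc/cpuinfo and the older Nehalem processors do not:
<pre>
# Minimal sketch: choose the GROMACS build that matches the node's CPU.
if grep -q avx512f /proc/cpuinfo; then
    module load gromacs/2019.6-skylake
else
    module load gromacs/2019.6-nehalem
fi
</pre>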
A module has to be loaded before GROMACS can be used on ARC, like this:
<pre>
$ gmx --version
bash: gmx: command not found...

$ module load gromacs/2019.6-nehalem

$ gmx --version
                   :-)  GROMACS - gmx, 2019.6  (-:

                        GROMACS is written by:
     Emile Apol    Rossen Apostolov    Paul Bauer    Herman J.C. Berendsen
     .....
     .....
GROMACS version:    2019.6
Precision:          single
Memory model:       64 bit
MPI library:        none
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support:        CUDA
SIMD instructions:  SSE4.1
FFT library:        fftw-3.3.7-sse2-avx
RDTSCP usage:       enabled
TNG support:        enabled
Hwloc support:      hwloc-1.11.8
Tracing support:    disabled
C compiler:         /global/software/gcc/gcc-7.3.0/bin/gcc GNU 7.3.0
C compiler flags:   -msse4.1 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler:       /global/software/gcc/gcc-7.3.0/bin/g++ GNU 7.3.0
C++ compiler flags: -msse4.1 -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler:      /global/software/cuda/cuda-10.0.130/bin/nvcc nvcc: NVIDIA (R) Cuda compiler
                    driver;Copyright (c) 2005-2018 NVIDIA Corporation;Built on
                    Sat_Aug_25_21:08:01_CDT_2018;Cuda compilation tools, release 10.0, V10.0.130
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;
                    -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;
                    -gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;
                    -gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;
                    -gencode;arch=compute_75,code=compute_75;-use_fast_math;;;
                    ;-msse4.1;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver:        9.10
CUDA runtime:       N/A
</pre>
Running a GROMACS Job
To run your simulation on the ARC cluster you have to have: (1) a set of GROMACS input files and (2) a SLURM job script (a .slurm file).
Place your input files for a simulation into a separate directory and prepare an appropriate job script for it.
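A minimal job script could look like the sketch below. It is only an illustration, not an official template: the resource requests and the file names (gromacs-job.slurm, bilayer.tpr, the bilayer output prefix) are placeholders that have to be adapted to your own simulation and allocation.
<pre>
#!/bin/bash
# gromacs-job.slurm -- illustrative sketch only; adjust names and resources.
#SBATCH --job-name=gromacs-test
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=40
#SBATCH --mem=16G
#SBATCH --time=24:00:00

# Load the GROMACS module matching the node's CPU generation (see above).
module load gromacs/2019.6-skylake

# Run the simulation from the .tpr run input file prepared with gmx grompp,
# using one thread-MPI rank and all allocated cores as OpenMP threads.
gmx mdrun -ntmpi 1 -ntomp $SLURM_CPUS_PER_TASK -s bilayer.tpr -deffnm bilayer
</pre>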
Misc
Performance
Performance measurements for GROMACS 2019.6 on the 2019 compute nodes using different parallelization options and numbers of CPUs.
The bilayer512 system of ~105,000 atoms was simulated for 100,000 steps. The total number of CPUs in each row is the product of the number of nodes, processes per node, and threads per process (e.g. 2 nodes × 10 processes × 4 threads = 80 CPUs).
<pre>
----------------------------------------------------------------------
 #CPUs    Node    Processes    Threads    Wall Time    Performance
                  per node     per proc      (s)        (ns/day)
----------------------------------------------------------------------
   10      1          1          10         119.0         7.26
   20      1          1          20          65.9        13.11
   40      1          1          40          37.6        22.97
   10      1         10           1         126.4         6.83
   20      1         20           1          78.0        11.08
   40      1         40           1          45.3        19.07
   40      1         20           2          48.3        17.90
   40      1         10           4          43.9        19.68
   36      1          6           6          49.4        17.50
   80      2         10           4          30.5        28.34
   80      2         20           2          32.6        26.53
   80      2         40           1          30.3        28.49
  120      3         40           1          20.8        41.51
  160      4         40           1          19.2        44.95
----------------------------------------------------------------------
</pre>

Selected GROMACS commands

gmx

<pre>
SYNOPSIS

gmx [-[no]h] [-[no]quiet] [-[no]version] [-[no]copyright] [-nice <int>]
    [-[no]backup]

OPTIONS

Other options:

 -[no]h          (no)       Print help and quit
 -[no]quiet      (no)       Do not print common startup info or quotes
 -[no]version    (no)       Print extended version information and quit
 -[no]copyright  (yes)      Print copyright information on startup
 -nice <int>     (19)       Set the nicelevel (default depends on command)
 -[no]backup     (yes)      Write backups if output files exist

Additional help is available on the following topics:
    commands      List of available commands
    selections    Selection syntax and usage
To access the help, use 'gmx help <topic>'.
For help on a command, use 'gmx help <command>'.
</pre>
gmx grompp
Preprocesses the simulation input (MD parameters, structure and topology files) into a portable run input (.tpr) file for gmx mdrun.
<pre>
$ gmx help grompp

SYNOPSIS

gmx grompp [-f [<.mdp>]] [-c [<.gro/.g96/...>]] [-r [<.gro/.g96/...>]]
           [-rb [<.gro/.g96/...>]] [-n [<.ndx>]] [-p [<.top>]]
           [-t [<.trr/.cpt/...>]] [-e [<.edr>]] [-ref [<.trr/.cpt/...>]]
           [-po [<.mdp>]] [-pp [<.top>]] [-o [<.tpr>]] [-imd [<.gro>]]
           [-[no]v] [-time <real>] [-[no]rmvsbds] [-maxwarn <int>]
           [-[no]zero] [-[no]renum]

OPTIONS

Options to specify input files:

 -f   [<.mdp>]            (grompp.mdp)             grompp input file with MD parameters
 -c   [<.gro/.g96/...>]   (conf.gro)               Structure file: gro g96 pdb brk ent esp tpr
 -r   [<.gro/.g96/...>]   (restraint.gro)  (Opt.)  Structure file: gro g96 pdb brk ent esp tpr
 -rb  [<.gro/.g96/...>]   (restraint.gro)  (Opt.)  Structure file: gro g96 pdb brk ent esp tpr
 -n   [<.ndx>]            (index.ndx)      (Opt.)  Index file
 -p   [<.top>]            (topol.top)              Topology file
 -t   [<.trr/.cpt/...>]   (traj.trr)       (Opt.)  Full precision trajectory: trr cpt tng
 -e   [<.edr>]            (ener.edr)       (Opt.)  Energy file

Options to specify input/output files:

 -ref [<.trr/.cpt/...>]   (rotref.trr)     (Opt.)  Full precision trajectory: trr cpt tng

Options to specify output files:

 -po  [<.mdp>]            (mdout.mdp)              grompp input file with MD parameters
 -pp  [<.top>]            (processed.top)  (Opt.)  Topology file
 -o   [<.tpr>]            (topol.tpr)              Portable xdr run input file
 -imd [<.gro>]            (imdgroup.gro)   (Opt.)  Coordinate file in Gromos-87 format

Other options:

 -[no]v          (no)     Be loud and noisy
 -time <real>    (-1)     Take frame at or first after this time.
 -[no]rmvsbds    (yes)    Remove constant bonded interactions with virtual sites
 -maxwarn <int>  (0)      Number of allowed warnings during input processing.
                          Not for normal use and may generate unstable systems
 -[no]zero       (no)     Set parameters for bonded interactions without defaults
                          to zero instead of generating an error
 -[no]renum      (yes)    Renumber atomtypes and minimize number of atomtypes
</pre>
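A typical invocation, with placeholder file names, combines the MD parameter file, the starting structure and the topology into a run input file:
<pre>
$ gmx grompp -f md.mdp -c bilayer.gro -p topol.top -o bilayer.tpr
</pre>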
gmx mdrun
gmx mdrun is the main computational chemistry engine within GROMACS. It performs Molecular Dynamics simulations, but it can also perform Stochastic Dynamics, Energy Minimization, test particle insertion or (re)calculation of energies. Normal mode analysis is another option.
<pre>
SYNOPSIS

gmx mdrun [-s [<.tpr>]] [-cpi [<.cpt>]] [-table [<.xvg>]] [-tablep [<.xvg>]]
          [-tableb [<.xvg> [...]]] [-rerun [<.xtc/.trr/...>]] [-ei [<.edi>]]
          [-multidir [<dir> [...]]] [-awh [<.xvg>]] [-membed [<.dat>]]
          [-mp [<.top>]] [-mn [<.ndx>]] [-o [<.trr/.cpt/...>]] [-x [<.xtc/.tng>]]
          [-cpo [<.cpt>]] [-c [<.gro/.g96/...>]] [-e [<.edr>]] [-g [<.log>]]
          [-dhdl [<.xvg>]] [-field [<.xvg>]] [-tpi [<.xvg>]] [-tpid [<.xvg>]]
          [-eo [<.xvg>]] [-devout [<.xvg>]] [-runav [<.xvg>]] [-px [<.xvg>]]
          [-pf [<.xvg>]] [-ro [<.xvg>]] [-ra [<.log>]] [-rs [<.log>]]
          [-rt [<.log>]] [-mtx [<.mtx>]] [-if [<.xvg>]] [-swap [<.xvg>]]
          [-deffnm <string>] [-xvg <enum>] [-dd <vector>] [-ddorder <enum>]
          [-npme <int>] [-nt <int>] [-ntmpi <int>] [-ntomp <int>]
          [-ntomp_pme <int>] [-pin <enum>] [-pinoffset <int>] [-pinstride <int>]
          [-gpu_id <string>] [-gputasks <string>] [-[no]ddcheck] [-rdd <real>]
          [-rcon <real>] [-dlb <enum>] [-dds <real>] [-gcom <int>] [-nb <enum>]
          [-nstlist <int>] [-[no]tunepme] [-pme <enum>] [-pmefft <enum>]
          [-bonded <enum>] [-[no]v] [-pforce <real>] [-[no]reprod] [-cpt <real>]
          [-[no]cpnum] [-[no]append] [-nsteps <int>] [-maxh <real>]
          [-replex <int>] [-nex <int>] [-reseed <int>]

OPTIONS

Options to specify input files:

 -s        [<.tpr>]            (topol.tpr)              Portable xdr run input file
 -cpi      [<.cpt>]            (state.cpt)      (Opt.)  Checkpoint file
 -table    [<.xvg>]            (table.xvg)      (Opt.)  xvgr/xmgr file
 -tablep   [<.xvg>]            (tablep.xvg)     (Opt.)  xvgr/xmgr file
 -tableb   [<.xvg> [...]]      (table.xvg)      (Opt.)  xvgr/xmgr file
 -rerun    [<.xtc/.trr/...>]   (rerun.xtc)      (Opt.)  Trajectory: xtc trr cpt gro g96 pdb tng
 -ei       [<.edi>]            (sam.edi)        (Opt.)  ED sampling input
 -multidir [<dir> [...]]       (rundir)         (Opt.)  Run directory
 -awh      [<.xvg>]            (awhinit.xvg)    (Opt.)  xvgr/xmgr file
 -membed   [<.dat>]            (membed.dat)     (Opt.)  Generic data file
 -mp       [<.top>]            (membed.top)     (Opt.)  Topology file
 -mn       [<.ndx>]            (membed.ndx)     (Opt.)  Index file

Options to specify output files:

 -o        [<.trr/.cpt/...>]   (traj.trr)               Full precision trajectory: trr cpt tng
 -x        [<.xtc/.tng>]       (traj_comp.xtc)  (Opt.)  Compressed trajectory (tng format or
                                                        portable xdr format)
 -cpo      [<.cpt>]            (state.cpt)      (Opt.)  Checkpoint file
 -c        [<.gro/.g96/...>]   (confout.gro)            Structure file: gro g96 pdb brk ent esp
 -e        [<.edr>]            (ener.edr)               Energy file
 -g        [<.log>]            (md.log)                 Log file
 -dhdl     [<.xvg>]            (dhdl.xvg)       (Opt.)  xvgr/xmgr file
 -field    [<.xvg>]            (field.xvg)      (Opt.)  xvgr/xmgr file
 -tpi      [<.xvg>]            (tpi.xvg)        (Opt.)  xvgr/xmgr file
 -tpid     [<.xvg>]            (tpidist.xvg)    (Opt.)  xvgr/xmgr file
 -eo       [<.xvg>]            (edsam.xvg)      (Opt.)  xvgr/xmgr file
 -devout   [<.xvg>]            (deviatie.xvg)   (Opt.)  xvgr/xmgr file
 -runav    [<.xvg>]            (runaver.xvg)    (Opt.)  xvgr/xmgr file
 -px       [<.xvg>]            (pullx.xvg)      (Opt.)  xvgr/xmgr file
 -pf       [<.xvg>]            (pullf.xvg)      (Opt.)  xvgr/xmgr file
 -ro       [<.xvg>]            (rotation.xvg)   (Opt.)  xvgr/xmgr file
 -ra       [<.log>]            (rotangles.log)  (Opt.)  Log file
 -rs       [<.log>]            (rotslabs.log)   (Opt.)  Log file
 -rt       [<.log>]            (rottorque.log)  (Opt.)  Log file
 -mtx      [<.mtx>]            (nm.mtx)         (Opt.)  Hessian matrix
 -if       [<.xvg>]            (imdforces.xvg)  (Opt.)  xvgr/xmgr file
 -swap     [<.xvg>]            (swapions.xvg)   (Opt.)  xvgr/xmgr file

Other options:

 -deffnm <string>               Set the default filename for all file options
 -xvg <enum>       (xmgrace)    xvg plot formatting: xmgrace, xmgr, none
 -dd <vector>      (0 0 0)      Domain decomposition grid, 0 is optimize
 -ddorder <enum>   (interleave) DD rank order: interleave, pp_pme, cartesian
 -npme <int>       (-1)         Number of separate ranks to be used for PME, -1 is guess
 -nt <int>         (0)          Total number of threads to start (0 is guess)
 -ntmpi <int>      (0)          Number of thread-MPI ranks to start (0 is guess)
 -ntomp <int>      (0)          Number of OpenMP threads per MPI rank to start (0 is guess)
 -ntomp_pme <int>  (0)          Number of OpenMP threads per MPI rank to start (0 is -ntomp)
 -pin <enum>       (auto)       Whether mdrun should try to set thread affinities: auto, on, off
 -pinoffset <int>  (0)          The lowest logical core number to which mdrun should pin the
                                first thread
 -pinstride <int>  (0)          Pinning distance in logical cores for threads, use 0 to minimize
                                the number of threads per physical core
 -gpu_id <string>               List of unique GPU device IDs available to use
 -gputasks <string>             List of GPU device IDs, mapping each PP task on each node to a
                                device
 -[no]ddcheck      (yes)        Check for all bonded interactions with DD
 -rdd <real>       (0)          The maximum distance for bonded interactions with DD (nm), 0 is
                                determine from initial coordinates
 -rcon <real>      (0)          Maximum distance for P-LINCS (nm), 0 is estimate
 -dlb <enum>       (auto)       Dynamic load balancing (with DD): auto, no, yes
 -dds <real>       (0.8)        Fraction in (0,1) by whose reciprocal the initial DD cell size
                                will be increased in order to provide a margin in which dynamic
                                load balancing can act while preserving the minimum cell size.
 -gcom <int>       (-1)         Global communication frequency
 -nb <enum>        (auto)       Calculate non-bonded interactions on: auto, cpu, gpu
 -nstlist <int>    (0)          Set nstlist when using a Verlet buffer tolerance (0 is guess)
 -[no]tunepme      (yes)        Optimize PME load between PP/PME ranks or GPU/CPU (only with the
                                Verlet cut-off scheme)
 -pme <enum>       (auto)       Perform PME calculations on: auto, cpu, gpu
 -pmefft <enum>    (auto)       Perform PME FFT calculations on: auto, cpu, gpu
 -bonded <enum>    (auto)       Perform bonded calculations on: auto, cpu, gpu
 -[no]v            (no)         Be loud and noisy
 -pforce <real>    (-1)         Print all forces larger than this (kJ/mol nm)
 -[no]reprod       (no)         Try to avoid optimizations that affect binary reproducibility
 -cpt <real>       (15)         Checkpoint interval (minutes)
 -[no]cpnum        (no)         Keep and number checkpoint files
 -[no]append       (yes)        Append to previous output files when continuing from checkpoint
                                instead of adding the simulation part number to all file names
 -nsteps <int>     (-2)         Run this number of steps (-1 means infinite, -2 means use mdp
                                option, smaller is invalid)
 -maxh <real>      (-1)         Terminate after 0.99 times this time (hours)
 -replex <int>     (0)          Attempt replica exchange periodically with this period (steps)
 -nex <int>        (0)          Number of random exchanges to carry out each exchange interval
                                (N^3 is one suggestion). -nex zero or not specified gives neighbor
                                replica exchange.
 -reseed <int>     (-1)         Seed for replica exchange, -1 is generate a seed
</pre>
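As an illustration of the parallelization options used in the performance table above, on a single 40-core node the number of thread-MPI ranks and OpenMP threads per rank can be set explicitly (the .tpr file name and the -deffnm prefix are placeholders):
<pre>
# 10 thread-MPI ranks x 4 OpenMP threads each = 40 CPUs on one node
$ gmx mdrun -ntmpi 10 -ntomp 4 -s bilayer.tpr -deffnm bilayer

# a single rank running 40 OpenMP threads
$ gmx mdrun -ntmpi 1 -ntomp 40 -s bilayer.tpr -deffnm bilayer
</pre>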