Gaussian on ARC: Difference between revisions

Revision as of 20:21, 17 March 2021

Introduction

Gaussian is a commercial software package for electronic structure modelling. The University of Calgary has acquired a site license for the Linux source code for Gaussian 16 and the TCP Linda 9 software that allows for parallel execution of Gaussian 16 on multiple compute nodes.

We are also licensed for the Microsoft Windows version of the graphical pre- and post-processing program GaussView 6. Note, however, that we do not have a Linux version of the software, so GaussView cannot be run on ARC. If you use a Microsoft Windows desktop or laptop computer and have been granted access to the software after agreeing to the license conditions, GaussView 6 can be downloaded from ARC, as mentioned below.

Here we concentrate on using Gaussian 16 on ARC, but the software can also be installed on other Linux-based machines at the University of Calgary.

General information

  • g16 command line options:
http://gaussian.com/options/
  • Linda:
http://gaussian.com/lindaprod/
  • Google search on Gaussian input file format:
https://www.google.com/search?client=firefox-b-e&q=gaussian+input+file+format


  • Compute Canada Gaussian Errors article:
https://docs.computecanada.ca/wiki/Gaussian_error_messages

Licensing and access

Although the University of Calgary has a Gaussian 16 site license, access to the software is only made available to those researchers who confirm that they can abide by the conditions of the license agreement. The license agreement can be downloaded from

/global/software/gaussian/20190311_Gaussian_License_Updated-Calgary-G16_GVW6_Linda.pdf 

on ARC.

If you would like access to run Gaussian 16 on ARC or to download GaussView 6 for use on a Microsoft Windows computer located at the University of Calgary, please send an email to support@hpc.ucalgary.ca with a subject line of the form: Gaussian access request (your_ARC_user_name). The body of the email should include a copy of the statement:

    ------------------------------------------
I have read the license agreement
20190311_Gaussian_License_Updated-Calgary-G16_GVW6_Linda.pdf
in its entirety and agree to abide by the conditions set forth in that document.
These include, in part, that:

  - I will not use the Gaussian software to compete with Gaussian Inc. or
    provide assistance to its competitors.

  - I will not copy the Gaussian 16 or Linda software, nor make it
    available to anyone else.

  - I will only copy the GaussViewW Version 6 software to a computer
    under my control and will remove it when I leave the University of
    Calgary.  I will not make the GaussView software available to anyone
    else.

  - I will acknowledge Gaussian Inc., as described in section 10 of the
    agreement, in publications based on results obtained from using the
    Gaussian software.

  - I will notify Research Computing Services if there is any change
    that would void the agreement, such as leaving the University of
    Calgary or collaborating with a Gaussian competitor.

Signed,
   Your typed signature
--------------------------------------------------

After your email has been received and approved, your user name will be added to the g16 group on ARC, which is used to control access to the directory containing the software.

Look under /global/software/gaussian .
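One way to confirm that the access request has gone through is to check whether your account is in the g16 group. The sketch below assumes the standard id -nG command is available; the helper name in_group_list is made up for illustration and is not part of ARC.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if group $1 appears in the space-separated list $2.
in_group_list() {
    local wanted="$1" list="$2" g
    for g in $list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Check the current account against the g16 group.
if in_group_list g16 "$(id -nG)"; then
    echo "g16 access: yes"
else
    echo "g16 access: not yet"
fi
```

Group changes usually take effect only in new login sessions, so log out and back in before checking.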

Installing GaussView 6.0 for Windows

The licensing terms for the GaussView 6.0 software require that it be installed only on computers owned and controlled by the University of Calgary. If you have a Windows laptop or workstation that is centrally managed by the UofC IT department, you can install GaussView on it yourself using the Software Centre on the computer. Look for GaussView 6.0 in the Software Centre.

Using Gaussian 16 on ARC

Running Gaussian batch jobs

Researchers using Gaussian on ARC are expected to be generally familiar with Gaussian capabilities, input file format and the use of checkpoint files.

Like other calculations on ARC systems, Gaussian is run by submitting an appropriate script for batch scheduling using the sbatch command. For more information about submitting jobs, see the Running jobs article.

Sample scripts for running Gaussian 16 on ARC will be supplied under /global/software/gaussian once testing of the software is complete.

Gaussian 16 modules

Currently there are two software modules on ARC that provide Gaussian 16. You can see them using the module avail command:

$ module avail
...

$ module avail Gaussian

-------------- /global/software/Modules/3.2.10/modulefiles ----------------
Gaussian16/b01-nehalem 
Gaussian16/b01-skylake

There are two kinds of compute nodes in ARC. The newer nodes, with Intel Skylake CPUs, are in the cpu2019, apophis-bf, razi-bf, and pawson-bf partitions. The older legacy nodes, with Intel Nehalem CPUs, are in the lattice and parallel partitions.

Gaussian on ARC was compiled separately for these two types of Intel CPUs to provide maximum performance on each of them. So, the Gaussian16/b01-nehalem module has to be loaded when submitting a job to the legacy partitions, and the Gaussian16/b01-skylake module should be used when the job is sent to the newer partitions. When the partition is not specified, the job goes to the default partitions, which have the newer CPUs.
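The partition-to-module rule can be sketched as a small shell function. This is only an illustration of the mapping described above: the function name gaussian_module_for_partition is hypothetical, and partitions not named in the text are deliberately rejected rather than guessed.

```shell
#!/bin/bash
# Hypothetical helper: print the Gaussian 16 module matching a partition name.
gaussian_module_for_partition() {
    case "$1" in
        lattice|parallel)
            # Legacy Nehalem nodes.
            echo "Gaussian16/b01-nehalem" ;;
        cpu2019|apophis-bf|razi-bf|pawson-bf|"")
            # Skylake nodes; an empty name stands for the default partitions.
            echo "Gaussian16/b01-skylake" ;;
        *)
            echo "unknown partition: $1" >&2
            return 1 ;;
    esac
}

# SLURM sets SLURM_JOB_PARTITION inside a job; outside a job it is empty here.
echo "Module to load: $(gaussian_module_for_partition "${SLURM_JOB_PARTITION:-}")"
```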


Once the module is loaded it provides access to the g16 executable program.

$ module load Gaussian16/b01-nehalem

The module, however, does not need to be loaded on the login node. It has to be loaded on the compute node that is going to work on your computation, so the best place to load the module is the job script for your computation.

Running a Gaussian Job

To run your computation on the ARC cluster you need two files: (1) a Gaussian input file, .com, and (2) a SLURM job script, .slurm. Put or prepare the files in a directory dedicated to the computation, dimer for example:

$ cd dimer

$ ls -l
-rw-r--r-- 1 drozmano drozmano    429 May  5 13:29 dimer.com
-rw-r--r-- 1 drozmano drozmano   1452 May  5 13:24 dimer.slurm

To submit the job, simply run

$ sbatch dimer.slurm
Submitted batch job 5527893

The number printed out during submission is the job ID of your job, which can be used to monitor its state.

Like this:

$ squeue -j 5527893
JOBID      USER        STATE     PARTITION  TIME_LIMIT  TIME   NODES TASKS CPUS  MIN_MEMORY REASON      NODELIST
5527893    drozmano    RUNNING   apophis-bf 1:00:00     0:03   1     1     40    180G       None        fc6

The output of the squeue command may look different for you, depending on your settings.

You can also get more information about the state of the job with

$ arc.job-info 5527893
....

You will have to replace the number with the actual job ID of your job.
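To avoid copying the job ID by hand, it can be captured when the job is submitted. The sketch below parses the "Submitted batch job" message shown above; the helper name job_id_from_sbatch_output is hypothetical, and the demonstration uses the sample message rather than a real submission.

```shell
#!/bin/bash
# Hypothetical helper: extract the job ID from sbatch's "Submitted batch job N" line.
job_id_from_sbatch_output() {
    awk '/^Submitted batch job/ {print $4}'
}

# Demonstration on the sample message; in practice pipe sbatch into the helper:
#   jobid=$(sbatch dimer.slurm | job_id_from_sbatch_output)
#   squeue -j "$jobid"
jobid=$(echo "Submitted batch job 5527893" | job_id_from_sbatch_output)
echo "Job ID: $jobid"
```

sbatch also has a --parsable option that prints the job ID without the surrounding text, which may be simpler if its output format suits your setup.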

Input files

The Gaussian input file is a text file describing the geometry of your system as well as specifications of the computation you are going to perform. You have to consult the Gaussian manual and tutorials to create it.

Below is a geometry optimization run for a water dimer, dimer.com:

%Chk=dimer.chk
#p b3lyp/6-31+G(d,p) opt=(Z-Matrix) iop(1/7=30) int=ultrafine EmpiricalDispersion=GD3

water dimer, B3LYP-D3/6-31+G(d,p) opt tight, Cs, int=ultrafine

0 1
O1
H2  1  r2
H3  1  r3  2  a3
X4  2  1.0  1  90.0  3  180.0
O5  2  r5  4  a5  1  180.0
H6  5  r6  2  a6  4  d6
H7  5  r6  2  a6  4  -d6

r2=0.9732       
r3=0.9641      
r5=1.9128    
r6=0.9659   
a3=105.9    
a5=83.1         
a6=112.1       
d6=59.6        

The dimer.com input file provides instructions for the Gaussian16 software.

Now you need to create a file for the ARC cluster's job manager, SLURM, and explain

  1. what resources the computation needs and
  2. how to do it.

An example job script that can be used as a template is shown below.

The SLURM job script, dimer.slurm:

#! /bin/bash
# ========================================================================
#SBATCH --job-name=g16_test

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=40
#SBATCH --mem=180gb
#SBATCH --time=0-01:00:00

# You only need to specify the partition when you want to run the job
# on non-default partition, such as
# a legacy one: lattice, parallel, cpu2013
# or big memory partition: bigmem.
# with
#    #SBATCH --partition=parallel

# Note, that you have to adjust
# the memory request, #CPUs, and time accordingly.

# ========================================================================
# Skylake nodes advertise AVX-512 in /proc/cpuinfo; the legacy Nehalem nodes do not.
if grep -q "avx512" /proc/cpuinfo; then
        module load Gaussian16/b01-skylake
else
        module load Gaussian16/b01-nehalem
fi

NCPUS=$SLURM_CPUS_PER_TASK

export GAUSS_CDEF="0-`expr $NCPUS - 1`"
export GAUSS_MDEF="${SLURM_MEM_PER_NODE}MB"
export GAUSS_SCRDIR="/scratch/$SLURM_JOBID"

echo "=================================================="
echo "       g16 binary: `which g16`"
echo "Memory definition: $GAUSS_MDEF"
echo "   Number of CPUs: $NCPUS"
echo "  CPUs definition: $GAUSS_CDEF"
echo "Scratch directory: $GAUSS_SCRDIR"
echo "=================================================="

# ========================================================================
# Run Gaussian here

g16 dimer.com
# ========================================================================

Generally, job scripts do not have to be that long and elaborate. This specific script does extra work for you:

  1. Determines if the job runs on a newer or older node and loads the proper version of Gaussian for it.
  2. Sets the number of CPUs, amount of memory and the scratch directory for Gaussian based on the resource request from SLURM.
  3. Prints out the accepted settings for the run.

and then it runs Gaussian as the last command.
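The second step can be looked at in isolation. The sketch below reproduces how the script derives the GAUSS_CDEF and GAUSS_MDEF values, using fixed example numbers in place of the SLURM variables; the helper name cpu_range is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build the CPU list "0-(N-1)" that the script stores in GAUSS_CDEF.
cpu_range() {
    local ncpus="$1"
    echo "0-$((ncpus - 1))"
}

# Example values mirroring the request in the script: 40 CPUs and 180 GB.
# SLURM reports the per-node memory in megabytes, hence the "MB" suffix.
ncpus=40
mem_mb=$((180 * 1024))
echo "GAUSS_CDEF=$(cpu_range "$ncpus")"
echo "GAUSS_MDEF=${mem_mb}MB"
```

Deriving both values from the SLURM request means the resource numbers are written in one place only, the #SBATCH lines.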

Output files

The run will produce three more files: dimer.chk, dimer.log, and slurm-5527893.out.

$ ls -l
-rw-r--r-- 1 drozmano drozmano 983040 May  6 10:35 dimer.chk
-rw-r--r-- 1 drozmano drozmano    429 May  5 13:29 dimer.com
-rw-r--r-- 1 drozmano drozmano 119739 May  6 10:35 dimer.log
-rw-r--r-- 1 drozmano drozmano   1452 May  5 13:24 dimer.slurm
-rw-r--r-- 1 drozmano drozmano    280 May  6 10:34 slurm-5527893.out

The .chk file is the Gaussian checkpoint file that we requested in the input .com file. It can be used to restart or continue the computation. The Gaussian output goes into the .log file. Look into the .log file for your results.

The slurm-...out file captures what would be printed on screen while the job runs. When the job runs there is no screen, so the output is saved into this file instead. The number in the file name is the job ID, which is printed out when you submit the job with the sbatch command.
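A quick first check of the .log file is whether Gaussian finished cleanly. Gaussian writes a "Normal termination of Gaussian" line at the end of a successful run, so a minimal sketch of such a check could look like this; the helper name gaussian_finished_ok is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if a Gaussian log reports normal termination.
gaussian_finished_ok() {
    grep -q "Normal termination of Gaussian" "$1"
}

# Usage, assuming dimer.log is in the current directory:
if [ -f dimer.log ]; then
    if gaussian_finished_ok dimer.log; then
        echo "dimer.log: normal termination"
    else
        echo "dimer.log: no normal termination message found"
    fi
fi
```

A missing message does not say what went wrong; the end of the .log file and the slurm-...out file are the places to look next.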

Defining your own system

You can start by using these two example files on ARC to run your first job. If everything goes smoothly you can try defining your own system and changing the resource requests to match your specific case.

Running Gaussian across multiple nodes

Some types of computations can be spread over several compute nodes when using Gaussian 16. For this, Gaussian uses an add-on to the main Gaussian 16 program called Linda. Linda is a separate product and it has to be purchased and licensed separately. The ARC cluster does have the Linda add-on installed and available for use.

To spread a computation across several nodes, Linda logs into the additional nodes using SSH and starts additional instances of Gaussian 16 on them. The main instance of Gaussian 16 on the first node assumes control of the additional instances. While the additional instances provide extra computational capacity, the time the master instance needs to send out the work and receive the results back can be long enough to make the multi-node computation slower than a single-node one. You have to know that the method you are going to use can run well on several nodes, and test the performance benefit, before you commit to running a large number of multi-node jobs.

Gaussian computational options can be specified in any of 4 ways (in order of precedence):

  • as a Link 0 (%) command in the input .com file,
  • as a command line option for the g16 program,
  • as an environment variable,
  • and as a directive in the Default.Route file.

The examples below show how to control Linda runs using the command line options, so please make sure that you do not try to pass conflicting control information using the other methods at the same time. This may lead to unpredictable behaviour.
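Since settings in the input file take precedence over the command line, it can help to scan the .com file for Link 0 lines that set CPUs, memory, or Linda workers before submitting. The sketch below is one possible check; the helper name find_link0_overrides is hypothetical, and the list of keywords covers only the common Link 0 commands for these settings.

```shell
#!/bin/bash
# Hypothetical helper: list Link 0 lines in an input file that set CPUs, memory,
# or Linda workers, and so would compete with the -p, -m, and -w options of g16.
find_link0_overrides() {
    grep -iE '^[[:space:]]*%(nprocshared|nproc|cpu|mem|lindaworkers)=' "$1"
}

# Usage, assuming dimer.com is in the current directory:
if [ -f dimer.com ] && find_link0_overrides dimer.com; then
    echo "Remove the lines above or drop the matching g16 options." >&2
fi
```

The dimer.com input shown earlier contains only a %Chk line, so this check prints nothing for it.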

The job scripts below show how to set up jobs on the cpu2019, parallel, and lattice partitions on ARC. These partitions have a fast network interconnect that is suitable for multi-node computations.


The example computation from above, the dimer.com input, is simple and short, so it does not create enough computational demand to benefit from running on multiple nodes. The timing table below demonstrates that:

#Nodes   Total #CPUs     Time(s)        Comment
----------------------------------------------------------------------------------------------------
     1             1       29.4         A serial computation.
     1            10       11.5     
     1            20        9.9         The computation is already saturated with resources.
     1            40       32.0         Bringing more workers hurts the performance already.
     2            80       60.0         Adding more "remote" workers hurts the performance even more.
----------------------------------------------------------------------------------------------------

Clearly, one has to evaluate the benefits of using more resources before committing to a specific computational configuration.
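The same timings can be turned into speedup figures relative to the serial run, which makes the saturation easier to see. The sketch below uses only the numbers from the table above; the helper name speedup is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: speedup of a run relative to the serial time, two decimals.
speedup() {
    awk -v serial="$1" -v t="$2" 'BEGIN { printf "%.2f\n", serial / t }'
}

# Timings from the table above: total CPUs and wall time in seconds.
serial=29.4
while read -r cpus seconds; do
    echo "$cpus CPUs: speedup $(speedup "$serial" "$seconds")"
done <<'EOF'
1 29.4
10 11.5
20 9.9
40 32.0
80 60.0
EOF
```

The speedup peaks just under 3 at 20 CPUs and drops below 1 for the 40 and 80 CPU runs, so those runs are slower than the serial one.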


Skylake CPUs, cpu2019 partition

The nodes in the default partitions on ARC have 40 CPUs and 180GB of allocatable memory each. These are the best compute nodes to run on. The example job script below sets up a Gaussian16 job using 2 compute nodes with 40 CPUs on each, 80 CPUs in total.

job.slurm:

#! /bin/bash
#=========================================================================================
#SBATCH --job-name=g16-test

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=40
#SBATCH --mem=180gb
#SBATCH --time=0-01:00:00

#=========================================================================================
module load Gaussian16/b01-skylake

NCPUS=$SLURM_CPUS_PER_TASK
HOSTS=$(echo `scontrol show hostnames` | sed -e "s/\ /,/g")
MEM="${SLURM_MEM_PER_NODE}MB"

export GAUSS_SCRDIR="/scratch/$SLURM_JOBID"

echo "=================================================="
echo "       g16 binary: `which g16`"
echo "Memory definition: $MEM"
echo "    CPUs per node: $NCPUS"
echo "            Nodes: $HOSTS"
echo "Scratch directory: $GAUSS_SCRDIR"
echo "=================================================="

#=========================================================================================
INPUT="dimer.com"
CHKPT="dimer.chk"
LOG="dimer.log"

g16 -p="$NCPUS" -m=$MEM -w="$HOSTS" -y=$CHKPT < $INPUT >& $LOG
#=========================================================================================
# Clean up the scratch location.
# The :? guard aborts instead of expanding to "/*" if the variable is unset or empty.
rm -rf "${GAUSS_SCRDIR:?}"/*
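The HOSTS line in the script above turns the node list from scontrol show hostnames, one name per line, into the comma-separated form passed to the -w option. A minimal sketch of that step on its own, with placeholder host names and a hypothetical helper name join_hosts, is:

```shell
#!/bin/bash
# Hypothetical helper: join one-host-per-line input into a comma-separated list.
join_hosts() {
    paste -sd, -
}

# Inside a job the input would come from `scontrol show hostnames`;
# the host names below are placeholders.
hosts=$(printf '%s\n' fc1 fc2 | join_hosts)
echo "Linda workers: $hosts"
```

Using paste here is a substitute for the echo and sed combination in the script; both produce the same list.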

Nehalem CPUs, parallel partition

The parallel partition contains nodes that have 12 CPUs and 22GB of allocatable memory. These nodes have CPUs from 2011, and the CPU architecture is called Nehalem. These CPUs deliver about 50% of the performance of the cpu2019 CPUs. They also do not support features of the more modern CPUs, so Gaussian16 had to be compiled specifically for them. That is why jobs on this partition should use a different Gaussian16 module. The example job script below sets up a Gaussian16 job using 4 compute nodes with 12 CPUs on each, 48 CPUs in total.

job.slurm:

#! /bin/bash
#=========================================================================================
#SBATCH --job-name=g16-test

#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=12
#SBATCH --mem=20gb
#SBATCH --time=0-01:00:00

#SBATCH --partition=parallel

#=========================================================================================
module load Gaussian16/b01-nehalem

NCPUS=$SLURM_CPUS_PER_TASK
HOSTS=$(echo `scontrol show hostnames` | sed -e "s/\ /,/g")
MEM="${SLURM_MEM_PER_NODE}MB"

export GAUSS_SCRDIR="/scratch/$SLURM_JOBID"

echo "=================================================="
echo "       g16 binary: `which g16`"
echo "Memory definition: $MEM"
echo "    CPUs per node: $NCPUS"
echo "            Nodes: $HOSTS"
echo "Scratch directory: $GAUSS_SCRDIR"
echo "=================================================="

#=========================================================================================
INPUT="dimer.com"
CHKPT="dimer.chk"
LOG="dimer.log"

g16 -p="$NCPUS" -m=$MEM -w="$HOSTS" -y=$CHKPT < $INPUT >& $LOG
#=========================================================================================
# Clean up the scratch location.
# The :? guard aborts instead of expanding to "/*" if the variable is unset or empty.
rm -rf "${GAUSS_SCRDIR:?}"/*
#=========================================================================================

Nehalem CPUs, lattice partition

The lattice partition contains nodes that have 8 CPUs and 10GB of allocatable memory. These nodes have CPUs from 2010 and are similar to the CPUs in the parallel partition. The example job script below sets up a Gaussian16 job using 6 compute nodes with 8 CPUs on each, 48 CPUs in total.

job.slurm:

#! /bin/bash
#=========================================================================================
#SBATCH --job-name=g16-test

#SBATCH --nodes=6
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=11gb
#SBATCH --time=0-01:00:00

#SBATCH --partition=lattice

#=========================================================================================
module load Gaussian16/b01-nehalem

NCPUS=$SLURM_CPUS_PER_TASK
HOSTS=$(echo `scontrol show hostnames` | sed -e "s/\ /,/g")
MEM="${SLURM_MEM_PER_NODE}MB"

export GAUSS_SCRDIR="/scratch/$SLURM_JOBID"

echo "=================================================="
echo "       g16 binary: `which g16`"
echo "Memory definition: $MEM"
echo "    CPUs per node: $NCPUS"
echo "            Nodes: $HOSTS"
echo "Scratch directory: $GAUSS_SCRDIR"
echo "=================================================="

#=========================================================================================
INPUT="../dimer.com"
CHKPT="dimer.chk"

g16 -p="$NCPUS" -m=$MEM -w="$HOSTS" -y=$CHKPT $INPUT 
#=========================================================================================
# Clean up the scratch location.
# The :? guard aborts instead of expanding to "/*" if the variable is unset or empty.
rm -rf "${GAUSS_SCRDIR:?}"/*

Gaussian 16 and GPUs

The number of GPUs and GPU nodes on ARC is relatively small and the benefit of using GPUs for Gaussian 16 installed on ARC does not seem significant enough to justify the use of the limited resources. Thus, we neither recommend nor support Gaussian jobs using GPUs at this moment.

Support

Please send any questions regarding using Gaussian on ARC to support@hpc.ucalgary.ca.