Abaqus: Difference between revisions

Revision as of 21:12, 10 January 2023

Introduction

Abaqus (external link) is a commercial suite of programs for finite element analysis, including static, dynamic and thermal analysis of materials, with sophisticated options for handling contacts and nonlinear materials.

Typically, researchers will install Abaqus on their own computers to develop models in a graphical user interface (Abaqus CAE) and then run simulations that exceed their local hardware capabilities on ARC. Output from runs on ARC is then transferred back to the researchers' own computers for visualization.

The software can be downloaded, upon approval, from the Information Technologies Software Distribution web site. A student version, for Microsoft Windows computers only, with limitations on model size is available directly from Dassault Systèmes (external link).

Abaqus is available to all U of C researchers with an ARC account. However, because the number of licenses is limited, it is important to be thoroughly familiar with the licensing restrictions outlined in the next section.

Licensing considerations

For many years, Information Technologies has provided a limited number of license tokens for research and teaching versions of the Abaqus software, sometimes supplemented by contributions from researchers. The software contract is typically renewed annually in August. If you are interested in contributing to the pool of licenses, you can write to the IT Help Desk (itsupport@ucalgary.ca) and ask that your email be redirected to the IT software librarian.

The discussion that follows relates only to the research version of the software. Note that the conditions of use of the teaching licenses prohibit them from being used for research projects.

At the time of this writing in May 2020, there are only 83 research license tokens available. The number of tokens available at a given time can be seen by running the lmstat command on ARC:

/global/software/abaqus/2019_licensing_only/linux_a64/code/bin/lmstat -c 27001@abaqus.ucalgary.ca -a
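The lmstat output is verbose, so a small filter can reduce it to one number. The following is a minimal sketch, not an official ARC tool. It assumes the usual FlexNet summary line of the form "Users of <feature>: (Total of N licenses issued; Total of M licenses in use)" and assumes that the Abaqus token feature is named abaqus. Both are assumptions to check against the real output. The function name free_tokens is hypothetical.

```bash
#!/bin/bash

# Hypothetical helper: read "lmstat -a" style output on standard input and
# print the number of unused license tokens for one feature.
# ASSUMPTION: the summary line looks like
#   Users of <feature>:  (Total of N licenses issued;  Total of M licenses in use)
# ASSUMPTION: the feature name defaults to "abaqus".
free_tokens() {
    local feature="${1:-abaqus}"
    awk -v feat="${feature}:" '
        # Fields: $1=Users $2=of $3=<feature>: $6=issued count $11=in-use count
        $1 == "Users" && $2 == "of" && $3 == feat { print $6 - $11 }
    '
}
```

To use it, pipe the lmstat command shown above into free_tokens.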

The number of license tokens, t, used for a given job depends on the number of CPU cores, c, that Abaqus is asked to use, according to the formula t = integer part of (5 * c^0.422). This formula has been implemented on the Abaqus Token Calculator web page (external link). The table below shows some examples.

Cores Tokens
8 12
12 14
16 16
20 17
24 19
32 21
36 22
40 23
48 25
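The formula can also be evaluated on the command line. The following is a minimal sketch that uses awk for the floating-point arithmetic. The function name abaqus_tokens is hypothetical and is not part of Abaqus or ARC.

```bash
#!/bin/bash

# Hypothetical helper: print the number of license tokens needed for a
# given number of cores, using t = integer part of (5 * c^0.422).
abaqus_tokens() {
    local cores="$1"
    awk -v c="$cores" 'BEGIN { print int(5 * c^0.422) }'
}

# Reproduce a few rows of the table above.
for cores in 8 20 40; do
    echo "$cores cores: $(abaqus_tokens "$cores") tokens"
done
```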

Generally speaking, unless you have a small model to process, the more cores you use, the more efficiently the license tokens will be used. Similarly, using the fastest hardware available will provide the most value for a given number of license tokens. With those considerations in mind, using a full 40-core compute node, selected by specifying the cpu2019 partition in your batch job (see example scripts below), is preferred. However, as there is often a shortage of license tokens, you will likely have to use just part of a compute node.
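When tokens are scarce, the same formula can be run in reverse to find the largest core count that fits within the free tokens. The following is a minimal sketch. The function name max_cores_for_tokens and the default limit of 40 cores (the cpu2019 node size) are assumptions chosen for illustration.

```bash
#!/bin/bash

# Hypothetical helper: print the largest core count whose token cost,
# int(5 * c^0.422), does not exceed the number of free tokens.
# $1 = free tokens, $2 = cores per node (assumed default: 40).
# Prints 0 if even a single core would need more tokens than are free.
max_cores_for_tokens() {
    local free="$1" limit="${2:-40}"
    awk -v free="$free" -v limit="$limit" 'BEGIN {
        best = 0
        for (c = 1; c <= limit; c++)
            if (int(5 * c^0.422) <= free) best = c
        print best
    }'
}

# Example: with 17 free tokens, how many cores can a job request?
max_cores_for_tokens 17
```

The result is a candidate value for --cpus-per-task in a partial-node job.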

Current Limitations

  • Currently, Abaqus user subroutines do not work on ARC. The cause has not yet been identified.
  • Standard Abaqus should work within a single compute node.
  • Multi-node Abaqus computations are not supported and do not work. Such computations are not recommended due to licensing limitations, and there is no immediate plan to support multi-node Abaqus jobs on ARC.

Running Abaqus batch jobs on ARC

Researchers using Abaqus on ARC are expected to be generally familiar with Abaqus capabilities, input file format and the use of restart files.

Like other calculations on ARC systems, Abaqus is run by submitting an appropriate script for batch scheduling using the sbatch command. For more information about submitting jobs, see the ARC Running Jobs page.
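As an illustration of the submission step, the sketch below wraps sbatch in a small function that first checks that the Abaqus input file exists, so that a job does not wait in the queue only to fail immediately. The function name submit_abaqus and the SUBMIT_CMD variable are hypothetical and exist only in this sketch. sbatch is the only real command involved.

```bash
#!/bin/bash

# Hypothetical helper: verify the Abaqus input file, then submit the job script.
# $1 = Slurm job script, $2 = Abaqus input file (.inp) named in that script.
# SUBMIT_CMD is an assumption of this sketch: set it to "echo sbatch" for a
# dry run that only prints the command instead of submitting.
submit_abaqus() {
    local job_script="$1" input_file="$2"
    if [ ! -f "$input_file" ]; then
        echo "Input file not found: $input_file" >&2
        return 1
    fi
    ${SUBMIT_CMD:-sbatch} "$job_script"
}

# Usage (run from the directory that holds both files):
#   submit_abaqus abaqus_cpu2019.slurm t1-std.inp
```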

The scripts below can serve as a template for your own batch job scripts.

Full node example

When running on a full compute node, you have to request all the CPUs available on that node. For historical reasons the ARC cluster has different kinds of compute nodes. Compute nodes with similar hardware specifications are grouped into partitions, so each partition contains only nodes of one kind. Therefore, the number of CPUs, or tasks, that you need to request for a full node depends on the partition you choose.

You can find out the number of CPUs and the available memory per compute node with the arc.hardware command:

$ arc.hardware

Note that when using the cpu2019 partition (40-core nodes), a full-node Abaqus job will take 23 license tokens.

Job script abaqus_cpu2019.slurm:

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --cpus-per-task=1
#SBATCH --time=24:00:00
#SBATCH --mem=64gb
#SBATCH --partition=cpu2019

# 2019-02-12 DSP - Full-node Abaqus test on Arc (cpu2019 partition) for Abaqus 2019

# For the initial run, RECOVER should be empty.
# For subsequent restarts RECOVER="recover" - for Abaqus Explicit runs only.
RECOVER=""

# Specify the job name for the Abaqus input (with .inp suffix) 
ABAQUS_COMMAND_FILE=t1-std.inp 

# Strip off the .inp suffix to give the job name to use on the Abaqus command line. 
ABAQUS_JOB=$(basename "$ABAQUS_COMMAND_FILE" .inp)

# Override the default scratch location:
# /global/software/abaqus/2019/solver/linux_a64/SMA/site/custom_v6.env
SCRATCH=$PWD

# Specify the version of the software to use
ABAQUS_VERSION="2019"
ABAQUS_HOME="/global/software/abaqus"
ABAQUS=${ABAQUS_HOME}/Commands/abq2019

# --------------  Report some basic information about this run  --------------

echo "Running on host: " `hostname`
echo "Current working directory is `pwd`"
echo "Using $SCRATCH for ABAQUS temporary files"

echo "Node list: SLURM_JOB_NODELIST :"
echo "---------------------"
echo $SLURM_JOB_NODELIST
echo "---------------------"

CORES=${SLURM_NTASKS}

echo "Running on $CORES cores."

# Unsetting SLURM_GTIDS is said to be necessary to work with Abaqus' Platform MPI.
# That was not sufficient to get Abaqus to work on multiple nodes.
unset SLURM_GTIDS

# Try to avoid "Internal Error: Cannot initialize RDMA protocol"
export MPI_IC_ORDER=tcp

ENV_FILE=abaqus_v6.env

cat << EOF > ${ENV_FILE}
ask_delete = OFF
mp_file_system = (SHARED, SHARED)
EOF

NODE_LIST=$(scontrol show hostname ${SLURM_NODELIST} | sort -u)

mp_host_list="["
for host in ${NODE_LIST}; do
    mp_host_list="${mp_host_list}['$host', ${SLURM_CPUS_ON_NODE}],"
done

mp_host_list=$(echo "${mp_host_list}" | sed -e "s/,$/]/")

echo "mp_host_list=${mp_host_list}"  >> ${ENV_FILE}

$ABAQUS job=${ABAQUS_JOB} cpus=${CORES} scratch=${SCRATCH} ${RECOVER} interactive

echo "Finished at `date`"
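The mp_host_list loop in the script above is easier to follow in isolation. The following is a minimal sketch of the same construction as a standalone function, using bash suffix removal instead of sed. The function name build_mp_host_list and the node name cn001 are hypothetical.

```bash
#!/bin/bash

# Hypothetical helper: build an Abaqus mp_host_list value.
# $1 = CPUs per node, remaining arguments = host names.
build_mp_host_list() {
    local cpus="$1"
    shift
    local list="[" host
    for host in "$@"; do
        list="${list}['${host}', ${cpus}],"
    done
    # Drop the trailing comma (if any) and close the outer bracket.
    printf '%s\n' "${list%,}]"
}

# Example with a made-up node name; in a job the hosts would come from
# "scontrol show hostname" and the CPU count from SLURM_CPUS_ON_NODE.
echo "mp_host_list=$(build_mp_host_list 40 cn001)"
```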

Partial node example

Note that when running on less than a complete compute node, the default Abaqus parallel processing mode based on MPI (Message Passing Interface) does not work properly on ARC. Instead, use mp_mode=THREADS as shown in this example. The number of cores to use is specified by the --cpus-per-task parameter and you should also specify a non-zero value for the --mem (memory) parameter.

This script is available on ARC as /global/software/abaqus/scripts/abaqus_2019_partial_node.slurm.

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --time=24:00:00
#SBATCH --mem=20g
#SBATCH --partition=cpu2019

# 2019-02-12 DSP - Partial-node Abaqus test on Arc (cpu2019 partition) for Abaqus 2019

# When running on a partial node, use --cpus-per-task and specify --mem

# Note this is using the Abaqus configuration parameter mp_mode=THREADS

# For the initial run, RECOVER should be empty.
# For subsequent restarts RECOVER="recover" - for Abaqus Explicit runs only.
RECOVER=""

# Specify the job name for the Abaqus input (with .inp suffix) 
ABAQUS_COMMAND_FILE=t1-std.inp 

# Strip off the .inp suffix to give the job name to use on the Abaqus command line. 
ABAQUS_JOB=$(basename "$ABAQUS_COMMAND_FILE" .inp)

# Override the default scratch location:
# /global/software/abaqus/2019/solver/linux_a64/SMA/site/custom_v6.env
SCRATCH=$PWD

# Specify the version of the software to use
ABAQUS_VERSION="2019"
ABAQUS_HOME="/global/software/abaqus"
ABAQUS=${ABAQUS_HOME}/Commands/abq2019

# --------------  Report some basic information about this run  --------------

echo "Running on host: " `hostname`
echo "Current working directory is `pwd`"
echo "Using $SCRATCH for ABAQUS temporary files"

echo "Node list: SLURM_JOB_NODELIST :"
echo "---------------------"
echo $SLURM_JOB_NODELIST
echo "---------------------"

CORES=${SLURM_CPUS_PER_TASK}

echo "Running on $CORES cores."

# Unsetting SLURM_GTIDS is said to be necessary to work with Abaqus' Platform MPI.
# That was not sufficient to get Abaqus to work on multiple nodes.
unset SLURM_GTIDS

# Try to avoid "Internal Error: Cannot initialize RDMA protocol"
export MPI_IC_ORDER=tcp

ENV_FILE=abaqus_v6.env

cat << EOF > ${ENV_FILE}
ask_delete = OFF
mp_file_system = (SHARED, SHARED)
EOF

NODE_LIST=$(scontrol show hostname ${SLURM_NODELIST} | sort -u)

mp_host_list="["
for host in ${NODE_LIST}; do
    mp_host_list="${mp_host_list}['$host', ${SLURM_CPUS_ON_NODE}],"
done

mp_host_list=$(echo "${mp_host_list}" | sed -e "s/,$/]/")

echo "mp_host_list=${mp_host_list}"  >> ${ENV_FILE}

echo "mp_mode=THREADS" >> ${ENV_FILE}

$ABAQUS job=${ABAQUS_JOB} cpus=${CORES} scratch=${SCRATCH} ${RECOVER} interactive

echo "Finished at `date`"
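The two example scripts derive CORES from different Slurm variables: SLURM_NTASKS for the full-node job and SLURM_CPUS_PER_TASK for the partial-node job. The sketch below shows one way to compute the core count that would give the same values in both layouts, assuming each variable defaults to 1 when Slurm does not set it. The function name abaqus_cores is hypothetical, and the scripts on ARC do not do this.

```bash
#!/bin/bash

# Hypothetical helper: total cores for the Abaqus cpus= argument, computed
# as tasks x CPUs per task. Unset variables are treated as 1 (an assumption).
#   Full-node example:    40 tasks x 1 CPU each  = 40
#   Partial-node example:  1 task  x 20 CPUs     = 20
abaqus_cores() {
    local tasks="${SLURM_NTASKS:-1}"
    local cpus_per_task="${SLURM_CPUS_PER_TASK:-1}"
    echo $(( tasks * cpus_per_task ))
}

echo "Running on $(abaqus_cores) cores."
```

This only unifies the core count. The partial-node job still needs the mp_mode=THREADS line in its environment file.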

Support

Please send any questions regarding using Abaqus on ARC to support@hpc.ucalgary.ca.