ANSYS: Difference between revisions

From RCSWiki
Latest revision as of 17:35, 18 March 2024


Introduction

ANSYS (external link) is a commercial suite of programs for engineering simulation, including fluid dynamics (Fluent and CFX), structural analysis (ANSYS Mechanical) and electromagnetics/electronics software.

Typically, researchers will install ANSYS on their own computers to develop models in a graphical user interface and then run simulations that exceed their local hardware capabilities on ARC.

The software can be downloaded, upon approval, from the IT Software Distribution web site.

ANSYS is available to all U of C researchers with an ARC account, subject to the licensing restrictions outlined in the next section.

Licensing considerations

For many years, Information Technologies has provided a limited number of license tokens for ANSYS software, sometimes supplemented by contributions from researchers. The software contract is typically renewed annually in August. If you are interested in contributing to the pool of licenses, you can write to the IT Help Desk itsupport@ucalgary.ca and ask that your email be redirected to the IT software librarian.

The discussion that follows relates only to the research version of the software. Note that the conditions of use of the teaching licenses prohibit them from being used for research projects.

At the time of this writing in May 2020, there are 50 basic academic licenses and 512 extended "HPC" license tokens available (with 256 of the latter reserved for a specific research group who purchased their own licenses). The number of tokens available at a given time can be seen by running the following commands on ARC:

module load ansys/2019r2
lmutil lmstat -c 1055@ansyslic.ucalgary.ca -a

For ANSYS Fluent, each job on ARC uses one token of the software feature "aa_r" in the lmstat output. In addition, one token of the "aa_r_hpc" type is used for each core in excess of 16. So, for example, a job using a 40-core node from the cpu2019 partition will use one aa_r token and 24 aa_r_hpc tokens.
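The token rule above can be sketched as a small shell function. This is only an illustration of the arithmetic described in this section; the function name hpc_tokens is hypothetical and is not part of ANSYS or of the ARC software.

```shell
#!/bin/bash
# Estimate the aa_r_hpc tokens a Fluent job takes:
# one token per core beyond the first 16 cores of the job.
# hpc_tokens is a hypothetical helper name used only for this sketch.
hpc_tokens() {
    local cores=$1
    local included=16   # cores covered without aa_r_hpc tokens
    if (( cores > included )); then
        echo $(( cores - included ))
    else
        echo 0
    fi
}

hpc_tokens 40   # one 40-core cpu2019 node, prints 24
hpc_tokens 80   # two 40-core cpu2019 nodes, prints 64
```

Each job also takes one aa_r token in addition to the aa_r_hpc tokens computed here.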

Using the fastest hardware available will provide the most value for a given number of license tokens, so using the 40-core compute nodes, selected by specifying the cpu2019 partition in your batch job (see example scripts below), is preferred. However, if there is a shortage of license tokens, you may use just part of a compute node or compute nodes from the older legacy partitions, such as parallel.
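To check whether enough aa_r_hpc tokens are free before submitting a job, the lmstat output can be filtered. The following is a sketch only. It assumes that lmstat prints a summary line per feature of the form "Users of FEATURE: (Total of N licenses issued; Total of M licenses in use)", and the helper name free_tokens is hypothetical. Check the actual output on ARC before relying on it.

```shell
#!/bin/bash
# free_tokens FEATURE
# Reads lmstat-style text on stdin and prints (issued - in use) for FEATURE.
# Assumed line format (an assumption, verify against real lmstat output):
#   Users of aa_r_hpc:  (Total of 512 licenses issued;  Total of 100 licenses in use)
free_tokens() {
    local feature=$1
    awk -v f="${feature}:" '
        $1 == "Users" && $2 == "of" && $3 == f {
            # Field 6 is the issued count, field 11 is the in-use count.
            print $6 - $11
            found = 1
        }
        END { if (!found) exit 1 }   # non-zero status if the feature is absent
    '
}

# Made-up sample line, used here instead of a real license server query.
sample='Users of aa_r_hpc:  (Total of 512 licenses issued;  Total of 100 licenses in use)'
echo "$sample" | free_tokens aa_r_hpc   # prints 412

# On ARC, after loading an ANSYS module, the real query would be piped in:
#   lmutil lmstat -c 1055@ansyslic.ucalgary.ca -a | free_tokens aa_r_hpc
```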

ANSYS Fluent on ARC

Researchers using ANSYS on ARC are expected to be generally familiar with ANSYS capabilities, input file format and the use of restart files.

You can use

$ module avail ansys
--------------------- /global/software/Modules/4.6.0/modulefiles -------------------------------
ansys/2019r2  ansys/2020r2  ansys/2021r1  ansys/2024r1

to see the versions of the ANSYS software that have been installed on ARC.

Ansys versions:

  • ansys/2024r1 -- supported. Works. -mpi=intel -ic=ib
  • ansys/2021r1 -- Does not work due to technical problems.
  • ansys/2020r2 -- supported. Works. -mpi=openmpi -ic=eth
  • ansys/2019r2 -- supported. Works. -mpi=openmpi -ic=eth
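The version notes above can be encoded as a lookup so that a job script picks matching options for the module it loads. This is a sketch: it assumes the options shown next to each version are the Fluent command-line options that work with that version, and the function name fluent_mpi_opts is hypothetical.

```shell
#!/bin/bash
# fluent_mpi_opts MODULE
# Prints the MPI and interconnect options listed on this page for MODULE.
# Returns non-zero for versions that are not listed as working.
fluent_mpi_opts() {
    case "$1" in
        ansys/2024r1)
            echo "-mpi=intel -ic=ib" ;;
        ansys/2020r2|ansys/2019r2)
            echo "-mpi=openmpi -ic=eth" ;;
        ansys/2021r1)
            echo "ansys/2021r1 is listed as not working on ARC" >&2
            return 1 ;;
        *)
            echo "unknown ANSYS module: $1" >&2
            return 1 ;;
    esac
}

fluent_mpi_opts ansys/2024r1   # prints -mpi=intel -ic=ib
fluent_mpi_opts ansys/2019r2   # prints -mpi=openmpi -ic=eth
```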

Creating a Fluent input file

After preparing your model, at the point where you are ready to run a Fluent solver, you save the case and data files and transfer them to ARC. In addition to those files, to run your model on ARC you need an input file containing Fluent text interface commands to specify such parameters as the solver to use, the number of time steps, the frequency of output and other simulation controls.

Typically the main difficulty in getting started with Fluent on ARC is figuring out what text interface commands correspond to the graphical interface commands with which you may be more familiar from using a desktop version of Fluent. At the Fluent command prompt, if you just hit enter the available commands will be shown, similar to:

adapt/                  file/                   report/
define/                 mesh/                   solve/
display/                parallel/               surface/
exit                    plot/                   views/

Entering one of those commands and then pressing Enter again will list its sub-options:

> file

/file>
async-optimize?         read-case-data          start-journal
auto-save/              read-field-functions    start-transcript
binary-files?           read-journal            stop-journal
confirm-overwrite?      read-macros             stop-macro
define-macro            read-profile            stop-transcript
execute-macro           read-transient-table    transient-export/
export/                 set-batch-options       write-cleanup-script
import/                 show-configuration      write-field-functions
read-case               solution-files/         write-macros

So, for example, one can discover by exploring these menus that the commands to set how often data and case files are automatically saved during a long run are of the form:

/file/auto-save/data-frequency 1000
/file/auto-save/case-frequency if-case-is-modified

Here is an example of a complete text input file in which case and data files are read in, some parameters related to the storing of output are set, the solver is run, and data and case files are saved at the end of the run.

/file/read-case test.cas
/file/read-data test.dat

/file/confirm-overwrite no
/file/auto-save/data-frequency 1000
/file/auto-save/case-frequency if-case-is-modified
/file/auto-save/root-name test

/solve/dual-time-iterate
22200
150

/file/write-case test.%t.%i.cas
/file/write-data test.%t.%i.dat

Note that blank lines are significant for some commands.
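When several similar runs are needed, the input file above can be generated from a template. The following sketch reproduces the example file with the base name and the two numbers after dual-time-iterate as parameters. It assumes those two numbers are the number of time steps and the iterations per time step, and the function name write_fluent_input is hypothetical. The blank lines are kept exactly as in the example.

```shell
#!/bin/bash
# write_fluent_input BASE STEPS ITERS
# Prints a Fluent text-interface input file to stdout, following the
# example on this page. BASE is the case/data file name without extension.
write_fluent_input() {
    local base=$1 steps=$2 iters=$3
    cat <<EOF
/file/read-case ${base}.cas
/file/read-data ${base}.dat

/file/confirm-overwrite no
/file/auto-save/data-frequency 1000
/file/auto-save/case-frequency if-case-is-modified
/file/auto-save/root-name ${base}

/solve/dual-time-iterate
${steps}
${iters}

/file/write-case ${base}.%t.%i.cas
/file/write-data ${base}.%t.%i.dat
EOF
}

# Reproduce the example above; redirect to a file for a real run, e.g.
#   write_fluent_input test 22200 150 > test.in
write_fluent_input test 22200 150
```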

Running ANSYS Fluent batch jobs on ARC

Like other calculations on ARC systems, ANSYS software is run by submitting an appropriate script for batch scheduling using the sbatch command. For more information about submitting jobs, see the ARC Cluster Guide.

The scripts below can serve as a template for your own batch job scripts. Please note that different versions of ANSYS Fluent require different command-line options. Different options are also required when Fluent is run on different partitions of the ARC cluster. Typically, only the cpu2019, cpu2021, and parallel partitions are recommended for Fluent.

The following examples use the input files elbow3.in and elbow3.cas, which are available on ARC in the directory /global/software/ansys/scripts.

Licenses

Note that when using 40-CPU nodes of the cpu2019 partition, an ANSYS job that runs on N nodes will take (40 x N - 16) license tokens from the aa_r_hpc pool.

Requesting Resources for Fluent jobs

When requesting memory for the run, either request the amount you know the computation needs, or use the rule of thumb of 4 GB per CPU. With this rule, you would request 160 GB of RAM when requesting 40 CPUs for your job.
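The rule of thumb is simple arithmetic and can be sketched as follows. The function name mem_request_gb is hypothetical, and the result is only a starting point: it does not check how much memory the nodes of a given partition actually have.

```shell
#!/bin/bash
# mem_request_gb CPUS [GB_PER_CPU]
# Prints the memory request in GB using the per-CPU rule of thumb
# (4 GB per CPU unless a second argument is given).
mem_request_gb() {
    local cpus=$1
    local gb_per_cpu=${2:-4}
    echo $(( cpus * gb_per_cpu ))
}

# Build the matching Slurm directive for a 40-CPU job.
echo "#SBATCH --mem=$(mem_request_gb 40)GB"   # prints #SBATCH --mem=160GB
```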


Please always request full nodes for your jobs. This means that you have to request all CPUs on the nodes your job requests.

Example Job Scripts

The best ARC partitions for running ANSYS Fluent are cpu2019, cpu2021, and cpu2022, which have 40, 48, and 52 CPUs per node, respectively.

When running jobs on one of those partitions, the jobs should request all the CPUs on the requested nodes.

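Below are example job scripts to run Fluent 2024r1 on these partitions. These scripts can be adapted to use older versions of Fluent installed on ARC by changing the ANSYS module. Note that older versions may also need the different -mpi and -ic options listed under "Ansys versions" above.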

Job script for cpu2019 partition

fluent2024r1_cpu2019.slurm:

#!/bin/bash
#-------------------------------------------------------------------------------------------
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --cpus-per-task=1
#SBATCH --mem=128GB
#SBATCH --time=01:00:00
#SBATCH --partition=cpu2019,cpu2019-bf05

# -----------------------------------------------------------------------------------------------
module load ansys/2024r1

# Create a node list so that Fluent knows which nodes to use.
HOSTLIST=hostlist_${SLURM_JOB_ID}
scontrol show hostnames > $HOSTLIST

# -----------------------------------------------------------------------------------------------
INPUT=elbow3.in
fluent 3ddp -g -t${SLURM_NTASKS} -cnf=${HOSTLIST} -mpi=intel -pib -i $INPUT

Job script for cpu2021 partition

fluent2024r1_cpu2021.slurm:

#!/bin/bash
#-------------------------------------------------------------------------------------------
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=48
#SBATCH --cpus-per-task=1
#SBATCH --mem=128GB
#SBATCH --time=01:00:00
#SBATCH --partition=cpu2021,cpu2021-bf24

# -----------------------------------------------------------------------------------------------
module load ansys/2024r1

# Create a node list so that Fluent knows which nodes to use.
HOSTLIST=hostlist_${SLURM_JOB_ID}
scontrol show hostnames > $HOSTLIST

# -----------------------------------------------------------------------------------------------
INPUT=elbow3.in
fluent 3ddp -g -t${SLURM_NTASKS} -cnf=${HOSTLIST} -mpi=intel -pib -i $INPUT

Job script for cpu2022 partition

fluent2024r1_cpu2022.slurm:

#!/bin/bash
#-------------------------------------------------------------------------------------------
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=52
#SBATCH --cpus-per-task=1
#SBATCH --mem=128GB
#SBATCH --time=01:00:00
#SBATCH --partition=cpu2022,cpu2022-bf24

# -----------------------------------------------------------------------------------------------
module load ansys/2024r1

# Create a node list so that Fluent knows which nodes to use.
HOSTLIST=hostlist_${SLURM_JOB_ID}
scontrol show hostnames > $HOSTLIST

# -----------------------------------------------------------------------------------------------
INPUT=elbow3.in
fluent 3ddp -g -t${SLURM_NTASKS} -cnf=${HOSTLIST} -mpi=intel -pib -i $INPUT

Timings and Benchmarks

Fluent Timings

These timings are provided for reference purposes. They were obtained with an example case that runs for about 10 minutes on 2 full compute nodes of the cpu2019 partition.

2024r1 timings

Partition    #Nodes    #CPUs/Procs       Walltime          Fluent
---------------------------------------------------------------------------------------------------
cpu2019           1             40          15:00          2024r1
cpu2019           2             80           9:13          2024r1
cpu2019           3            120           7:22          2024r1
cpu2019           4            160           6:29          2024r1

cpu2021           1             48          13:10          2024r1
cpu2021           2             96           8:10          2024r1
cpu2021           3            144           6:35          2024r1

cpu2022           1             52          10:56          2024r1
cpu2022           2            104           6:56          2024r1
cpu2022           3            156           5:57          2024r1

cpu2023           1             64          10:25          2024r1
cpu2023           2            128           7:18          2024r1
cpu2023           3            192           6:36          2024r1
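To compare rows of the table, the walltimes can be converted to seconds and turned into a parallel efficiency, that is, the single-node time divided by the product of the multi-node time and the number of nodes. This is a sketch with hypothetical function names. It assumes the walltimes are in minutes:seconds and uses integer arithmetic, so the percentage is rounded down.

```shell
#!/bin/bash
# to_seconds MM:SS
# Converts a minutes:seconds walltime to seconds.
to_seconds() {
    local m=${1%%:*} s=${1##*:}
    # 10# forces base 10 so values such as 08 are not read as octal.
    echo $(( 10#$m * 60 + 10#$s ))
}

# efficiency_pct T1 TN NODES
# Parallel efficiency in percent: T1 / (TN * NODES), rounded down.
efficiency_pct() {
    local t1 tn
    t1=$(to_seconds "$1")
    tn=$(to_seconds "$2")
    echo $(( t1 * 100 / (tn * $3) ))
}

# cpu2019 rows from the table: 15:00 on 1 node, 9:13 on 2, 7:22 on 3.
efficiency_pct 15:00 9:13 2   # prints 81
efficiency_pct 15:00 7:22 3   # prints 67
```

Since license tokens scale with the number of cores, a falling efficiency means each additional token buys less speedup.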

Issues

Cleaning the system after Crashed Jobs

Update, 2021-01: After a major update in December 2020, the job scheduler, SLURM, should be deleting these leftover processes automatically, making the cleaning step unnecessary. This section is kept for historical and reference purposes.


There is a problem with Fluent contaminating the cluster with leftover processes when a multi-node job crashes.

The issue is that Fluent uses its own set of MPI libraries, which do not communicate with the job scheduler on ARC. Therefore, the processes that are spread across the nodes while an MPI job is running are not known to SLURM, and SLURM cannot take care of them when the job crashes. If the job finishes normally, those processes just terminate and there is no problem.

To help with this issue, Fluent creates a script in the working directory called cleanup-fluent-….sh that must be run to clean up the system when the job terminates abnormally. If the job finishes normally, this script is deleted automatically. If the job crashes, it never gets to the deletion step and the cleanup script stays in the working directory.

The name of the script is cleanup-fluent-node-12345.sh, where node is the node name and 12345 is a number related to the run.

So, if you

  • run multi-node Fluent jobs and
  • your job crashes and
  • you can see that script in the working directory of that job,

you have to execute that script immediately to clean up the system, like this:

$ ./cleanup-fluent-....sh

The cleanup script deletes itself when you run it, which prevents the system from being cleaned twice.
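If a working directory may hold several leftover cleanup scripts, they can be run in a loop. The following is a sketch that assumes the scripts follow the cleanup-fluent-node-12345.sh naming pattern described above. The function name run_fluent_cleanup is hypothetical.

```shell
#!/bin/bash
# run_fluent_cleanup [DIR]
# Runs every cleanup-fluent-*.sh script found in DIR (default: current
# directory). Each script is expected to delete itself when it runs.
run_fluent_cleanup() {
    local dir=${1:-.}
    local script count=0
    for script in "$dir"/cleanup-fluent-*.sh; do
        # If nothing matches, the unexpanded pattern is not a file: skip it.
        [ -e "$script" ] || continue
        echo "Running $script"
        bash "$script"
        count=$(( count + 1 ))
    done
    echo "Ran $count cleanup script(s)"
}

# Usage, from the working directory of the crashed job:
#   run_fluent_cleanup .
```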

Please make sure that you take care of the leftover Fluent processes after your jobs are done. This is a serious issue, as the leftover processes slow down other users' jobs and will never die on their own.

Support

Please send any questions regarding using ANSYS on ARC to support@hpc.ucalgary.ca.

Links

ARC Software pages