Sample Job Scripts: Difference between revisions
Ian.percel (talk | contribs) (Update of basic slurm job script examples for different types of parallelism) |
Ian.percel (talk | contribs) mNo edit summary |
Revision as of 06:19, 3 February 2021
This page catalogues some typical job scripts for different methods of parallelizing code. The examples are not exhaustive; more can be found on the pages for individual software packages. They do, however, cover the most important styles of computation on an HPC system: serial computing, shared memory parallelism, distributed memory parallelism, job-level parallelism, and GPU-accelerated computing.
Serial Computing
This script launches a single process on one CPU core, with 1 GB of memory and a time limit of 5 minutes.
serial_job.slurm:
#!/bin/bash
#SBATCH --time=0-0:5
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
~/anaconda3/bin/python -c "print(3+5)"
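Shared Memory Parallelism
This script starts one process with four CPU cores and sets the number of OpenMP threads to match.
For this to work, the OpenMP code arrayUpdate.c must be compiled with OpenMP support, and the executable must be named arrayUpdate so that it matches the name used in the script, e.g.
gcc -fopenmp -o arrayUpdate arrayUpdate.c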
sharedmemory_job.slurm:
#!/bin/bash
#SBATCH --time=0-1:5:0
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=1000M
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./arrayUpdate
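The thread-count line in the script above relies on $SLURM_CPUS_PER_TASK, which Slurm only sets when --cpus-per-task is requested. As a minimal sketch, assuming you also want the same lines to run outside of a job (for example in a quick test on a login node), a fallback of one thread can be added. The function name set_omp_threads is a hypothetical helper, not part of Slurm or OpenMP.

```shell
#!/bin/bash
# Sketch: choose the OpenMP thread count from the Slurm allocation.
# set_omp_threads is a hypothetical helper, not part of Slurm or OpenMP.
set_omp_threads() {
    # SLURM_CPUS_PER_TASK is only set when --cpus-per-task is requested;
    # fall back to a single thread otherwise.
    export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
}

set_omp_threads
echo "Running with ${OMP_NUM_THREADS} OpenMP thread(s)"
# ./arrayUpdate   # uncomment inside the real job script
```

Inside sharedmemory_job.slurm this would report four threads; run by hand with no allocation it reports one.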
Distributed Memory Parallelism
This example script launches four MPI processes, each with 1000 MB of memory.
distributedMemory_job.slurm:
#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4 # number of MPI processes should match total ntasks
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1000M # memory allocated for each cpu
mpiexec ./matrixProduct_program
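Because memory is requested per CPU here, the total for the job is the product of the number of tasks, the CPUs per task, and the memory per CPU. The arithmetic can be sketched as follows; total_mem_mb is a hypothetical helper introduced only for illustration.

```shell
#!/bin/bash
# Sketch: total memory implied by a --mem-per-cpu request.
# total_mem_mb is a hypothetical helper.
# Arguments: number of tasks, CPUs per task, memory per CPU in MB.
total_mem_mb() {
    local ntasks="$1" cpus_per_task="$2" mem_per_cpu_mb="$3"
    echo $(( ntasks * cpus_per_task * mem_per_cpu_mb ))
}

# Values from distributedMemory_job.slurm: 4 tasks, 1 CPU each, 1000 MB per CPU
echo "Total memory requested: $(total_mem_mb 4 1 1000) MB"
```

For the script above this gives 4000 MB in total on the single node.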
Job-Level Parallelism
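Also known as a task array, an array job is a way to submit a whole set of jobs with one command. The individual jobs in the array are distinguished by an environment variable, $SLURM_ARRAY_TASK_ID, which is set to a different value for each instance of the job. The following example creates 10 tasks, with values of $SLURM_ARRAY_TASK_ID ranging from 1 to 10.
jobParallel_job.slurm: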
#!/bin/bash
#SBATCH --time=1-0:0:0
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --array=1-10
./recon-all $SLURM_ARRAY_TASK_ID
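The script above passes the array index directly to the program. A common variation, sketched here under the assumption that the inputs are listed one per line in a text file, is to use the index to pick a line from that list. The file name subjects.txt and the function nth_line are hypothetical and not part of the original example.

```shell
#!/bin/bash
# Sketch: map an array index to one line of a list file.
# nth_line and subjects.txt are hypothetical names used for illustration.
nth_line() {
    local file="$1" index="$2"
    # sed -n 'Np' prints only line N of the file
    sed -n "${index}p" "$file"
}

# Inside an array job Slurm sets SLURM_ARRAY_TASK_ID; default to 1 elsewhere.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
if [ -f subjects.txt ]; then
    subject="$(nth_line subjects.txt "$task_id")"
    echo "Task ${task_id} would process: ${subject}"
    # ./recon-all "$subject"   # uncomment in the real job script
fi
```

With --array=1-10, the list file would need at least 10 lines; an index past the end of the file yields an empty string, so a real script may want to check for that before launching the program.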
GPU Accelerated Computing
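This script requests one GPU on a single node, along with one CPU core and 50000 MB of memory, and runs the Python script tensorflow_example.py.
gpu_job.slurm: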
#!/bin/bash
#SBATCH --time=12-0:0:0
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=50000M
#SBATCH --gres=gpu:1
python tensorflow_example.py
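Before starting a long GPU run it can be useful to confirm that a GPU is actually visible to the job. The following is a minimal sketch, assuming the cluster's Slurm configuration exports CUDA_VISIBLE_DEVICES as a comma-separated list of device indices for jobs that request --gres=gpu (this is common but depends on the site setup). The function count_visible_gpus is a hypothetical helper.

```shell
#!/bin/bash
# Sketch: count the GPUs visible to the job before starting the real work.
# count_visible_gpus is a hypothetical helper. It assumes CUDA_VISIBLE_DEVICES
# holds a comma-separated list of device indices (a site-dependent assumption).
count_visible_gpus() {
    local devices="${CUDA_VISIBLE_DEVICES:-}"
    if [ -z "$devices" ]; then
        echo 0
        return
    fi
    # Split the list on commas and count the resulting fields
    local IFS=','
    set -- $devices
    echo "$#"
}

n_gpus="$(count_visible_gpus)"
if [ "$n_gpus" -eq 0 ]; then
    echo "No GPU visible; check the --gres request" >&2
else
    echo "Found ${n_gpus} GPU(s)"
    # python tensorflow_example.py   # uncomment in the real job script
fi
```

This only inspects the environment variable; it does not query the driver or confirm that the device is usable.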