How to find nodenames allocated by SLURM for a job
Background
When a distributed job requests several compute nodes from SLURM, the program that performs the computations
needs to know the names of the nodes that SLURM allocated to the job in order to use them.
If the program is based on the MPI library, the computational processes are distributed to the nodes by the MPI program launcher, mpirun or mpiexec. These launchers therefore need to know the list of allocated nodes.
If the computational code (program) was built on ARC and compiled against the OpenMPI library provided on ARC (openmpi/4.1.1-gnu at the time of writing),
then the launcher is aware of ARC's SLURM scheduler and can obtain the list of nodes from SLURM automatically.
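For a program that cannot rely on such a launcher, one possible approach is to read the node list from the job's environment. SLURM sets the SLURM_JOB_NODELIST variable inside a job, usually in a compressed form such as cn[001-003,007]. The sketch below expands that form into individual names. The function name expand_nodelist and the node names are hypothetical, and the parser is deliberately simplified: it assumes at most one bracket group per entry. Inside a real job, running scontrol show hostnames "$SLURM_JOB_NODELIST" is the more robust way to get the same list.

```python
import os
import re


def expand_nodelist(nodelist):
    """Expand a compressed SLURM node list into individual node names.

    Simplified sketch: handles entries such as 'cn[001-003,007]' or 'gpu1',
    with at most one [...] group per comma-separated entry.
    """
    names = []
    # Split on commas that are outside of square brackets.
    for entry in re.findall(r"[^,\[]+(?:\[[^\]]*\])?[^,]*", nodelist):
        match = re.fullmatch(r"([^\[]*)\[([^\]]*)\](.*)", entry)
        if match is None:
            names.append(entry)  # plain node name, nothing to expand
            continue
        prefix, ranges, suffix = match.groups()
        for part in ranges.split(","):
            if "-" in part:
                first, last = part.split("-")
                width = len(first)  # keep zero padding, e.g. 001
                for i in range(int(first), int(last) + 1):
                    names.append(f"{prefix}{i:0{width}d}{suffix}")
            else:
                names.append(f"{prefix}{part}{suffix}")
    return names


if __name__ == "__main__":
    # Outside of a SLURM job the variable is unset and the list is empty.
    print(expand_nodelist(os.environ.get("SLURM_JOB_NODELIST", "")))
```

The resulting list can then be passed to whatever mechanism the program uses to start its workers on the allocated nodes.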