MATLAB: Difference between revisions
= Introduction =
[http://www.mathworks.com/ MATLAB] is a general-purpose high-level programming package for numerical work such as linear algebra, signal processing and other calculations involving matrices or vectors of data. Visualization tools are also included for presentation of results. The basic MATLAB package is extended through add-on components including SIMULINK, and the Image Processing, Optimization, Neural Network, Signal Processing, Statistics and Wavelet Toolboxes, among others.
The main purpose of this page is to show how to use MATLAB on the University of Calgary [[ ARC_Cluster_Guide |ARC (Advanced Research Computing) cluster]]. It is presumed that you already have an account on ARC and have read the material on reserving resources and [[Running jobs|running jobs]] with the Slurm job management system.
= <span id="license"></span>Ways to run MATLAB - License considerations =
At the University of Calgary, Information Technologies has purchased a MATLAB Total Academic Headcount license that allows installation and use of MATLAB on central clusters, such as ARC, as well as on personal workstations throughout the University. Potentially thousands of instances of MATLAB can be run simultaneously, each checking out a license from a central license server. An alternative is to compile MATLAB code into a standalone application. When such an application is run, it does not need to contact the server for a license. This allows researchers to run their calculations on compatible hardware, not necessarily at the University of Calgary, such as on [https://docs.computecanada.ca/wiki/Getting_started Compute Canada clusters (external link)].
For information about installing MATLAB on your own computer, see the Information Technologies Knowledge Base [https://ucalgary.service-now.com/it?id=kb_article&sys_id=787d708213fdbec08246f7b2e144b0d4 article on MATLAB].
= Running MATLAB from command line =
Before you begin using '''MATLAB''' on a cluster, you have to know how to run your '''MATLAB''' programs (codes) from the '''command line'''.
The '''MATLAB''' code you want to run, say <code>my_code.m</code>, must define a function with the same name and should end with the <code>quit; end</code> commands
<syntaxhighlight lang=matlab>
function my_code(arg1, arg2, arg3)
.....
.....
quit
end
</syntaxhighlight>
to make sure that MATLAB exits at the end of the computation.
You can run such code from the '''command line''' with a command:
$ matlab -batch "my_code(1, 2, 3)"
Note that the function is called by name, without the <code>.m</code> file extension.
If the function does not take any arguments, then this command will work too:
$ matlab -batch my_code
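When the function arguments come from shell variables, it can help to assemble the call string in one place. The following is only a sketch: the helper name <code>matlab_call</code> is hypothetical, and it assumes that any argument made up solely of digits, dots and minus signs is meant to be numeric, while everything else is passed to MATLAB as a single-quoted string.

```shell
#!/bin/bash
# Hypothetical helper (not part of MATLAB or Slurm): build the text of a
# MATLAB function call suitable for passing to "matlab -batch".
matlab_call() {
    func=$1
    shift
    args=""
    for arg in "$@"; do
        case "$arg" in
            ''|*[!0-9.-]*)
                # Contains something other than digits, dots or minus signs:
                # treat it as a string and wrap it in single quotes.
                args="${args}${args:+,}'${arg}'"
                ;;
            *)
                # Looks numeric: pass it through unquoted.
                args="${args}${args:+,}${arg}"
                ;;
        esac
    done
    echo "${func}(${args})"
}

# Print the command that would be run, without starting MATLAB.
echo "matlab -batch \"$(matlab_call my_code 1 2 3)\""
```

The echoed line can be inspected before replacing <code>echo</code> with the real <code>matlab</code> invocation.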
= Running MATLAB on the ARC cluster =
Although it is possible to run MATLAB interactively, the expectation is that most calculations with MATLAB will be completed by submitting a batch job script to the Slurm job scheduler with the sbatch command.
For many researchers, the main reason for using ARC for MATLAB-based calculations is to be able to run many instances at the same time. It is recommended in such cases that any parallel processing features be removed from the code and each instance of MATLAB be run on a single CPU core. It is also possible to run MATLAB on multiple cores in an attempt to speed up individual instances, but this generally results in a less efficient use of the cluster hardware. In the sections that follow, serial and then parallel processing examples are shown.
== Serial MATLAB example ==
For the purposes of illustration, suppose the following serial MATLAB code, in a file sawtooth.m, is to be run. If your code does not already have them, add a function statement at the beginning and matching end statement at the end as shown in the example. Other features of this example include calling a function with both numerical and string arguments, incorporating a Slurm environment variable into the MATLAB code and producing graphical output in a non-interactive environment.
<syntaxhighlight lang=matlab>
function sawtooth(nterms,nppcycle,ncycle,pngfilebase)
% MATLAB file example to approximate a sawtooth
% with a truncated Fourier expansion.
% nterms = number of terms in expansion.
% nppcycle = number of points per cycle.
% ncycle = number of complete cycles to plot.
% pngfilebase = base of file name for graph of results.
% 2020-05-14
np=nppcycle*ncycle;
fourbypi=4.0/pi;
y(1:np)=pi/2.0;
x(1:np)=linspace(-pi*ncycle,pi*ncycle,np);
for k=1:nterms
    twokm=2*k-1;
    y=y-fourbypi*cos(twokm*x)/twokm^2;
end
% Prepare output
% Construct the output file name from the base file name and number of terms
% Also append the Slurm JOBID to keep file names unique from run to run.
job=getenv('SLURM_JOB_ID')
pngfile=strcat(pngfilebase,'_',num2str(nterms),'_',job)
disp(['Writing file: ',pngfile,'.png'])
fig=figure;
plot(x,y);
print(fig,pngfile,'-dpng');
quit
end
</syntaxhighlight>
In preparation to run the <code>sawtooth.m</code> code, create a batch job script, <code>sawtooth.slurm</code>, of the form:
<syntaxhighlight lang=bash>
#!/bin/bash
# ==============================================================================
#SBATCH --time=03:00:00 # Adjust this to match the walltime of your job
#SBATCH --nodes=1 # For serial code, always specify just one node.
#SBATCH --ntasks=1 # For serial code, always specify just one task.
#SBATCH --cpus-per-task=1 # For serial code, always specify just one CPU per task.
#SBATCH --mem=4000m # Adjust to match total memory required, in MB.
# ==============================================================================
# Sample batch job script for running a MATLAB function with both numerical and string arguments
module load matlab/r2019b
# Use -singleCompThread below for serial MATLAB code:
# Function call: sawtooth(NTerms, NPPCycles, NCycle, PNGFileBase)
matlab -singleCompThread -batch "sawtooth(100,20,3,'sawtooth')"
# ==============================================================================
</syntaxhighlight>
Note that the above script uses the -batch option on the matlab command line. Starting with Release 2019a of MATLAB, the MathWorks web page on [https://www.mathworks.com/help/matlab/ref/matlablinux.html running MATLAB on Linux (external link)] recommends the -batch option for non-interactive use, in place of the similar -r option, which remains the recommended choice for interactive sessions.
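If the same job script has to work with several MATLAB modules, the choice between the two options could be made from the release name. This is a minimal sketch: the function name <code>matlab_run_option</code> is hypothetical, and it assumes release names of the form rYYYYa or rYYYYb, as in the module names used on this page.

```shell
#!/bin/bash
# Hypothetical helper: choose the non-interactive start-up option
# for a MATLAB release given as rYYYYa or rYYYYb (for example r2019b).
matlab_run_option() {
    release=$1
    year=${release#r}      # strip the leading "r"
    year=${year%[ab]}      # strip the trailing "a" or "b"
    if [ "$year" -ge 2019 ]; then
        echo "-batch"      # available from R2019a onward
    else
        echo "-r"          # older releases: the code must exit MATLAB itself
    fi
}

echo "r2019b uses: $(matlab_run_option r2019b)"
echo "r2018a uses: $(matlab_run_option r2018a)"
```

With -r, the function being run still needs its own <code>quit</code> statement, as in the examples on this page, because MATLAB does not exit automatically in that mode.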
To submit the job to be executed, run:
<pre>
sbatch sawtooth.slurm
</pre>
The job should produce three output files: Slurm script output, MATLAB command output and a PNG file, all tagged with the Slurm Job ID.
== Parallel MATLAB examples ==
MATLAB provides several ways of speeding up calculations through parallel processing. These include relying on internal parallelization in which multiple threads are used or by using explicit language features, such as parfor, to start up multiple workers on a compute node. Examples of both approaches are shown below. Using multiple compute nodes for a single MATLAB calculation, which depends on the MATLAB Parallel Server product, is not considered here as there has not been sufficient demand to configure that software on ARC.
For many researchers, submitting many independent serial jobs is a better approach to efficient parallelization than using parallel programming in MATLAB itself. An example of job-based parallelism is also shown below.
=== Thread-based parallel processing ===
First consider an example using multiple cores with MATLAB's built-in thread-based parallelization.
Suppose the following code to calculate eigenvalues of a number of random matrices is in a file eig_thread_test.m. Note the use of the maxNumCompThreads function to control the number of threads (one thread per CPU core). For some years now, MathWorks has marked that function as deprecated, but it still provides a useful limit to ensure that MATLAB doesn't use more cores than assigned by Slurm.
<syntaxhighlight lang=matlab>
function eig_thread_test(nthreads,matrix_size,nmatrices,results_file)
% Calculate the absolute value of the maximum eigenvalue for each of a number of matrices
% possibly using multiple threads.
% nthreads = number of computational threads to use.
% matrix_size = order of two-dimensional random matrix.
% nmatrices = number of matrices to process.
% results_file = name of file in which to save the maximum eigenvalues
% 2020-05-25
matlab_ncores=feature('numcores')
slurm_ncores_per_task=str2num(getenv('SLURM_CPUS_PER_TASK'))
if(isempty(slurm_ncores_per_task))
    slurm_ncores_per_task=1;
    disp('SLURM_CPUS_PER_TASK not set')
end
% Set number of computational threads to the minimum of nthreads and slurm_ncores_per_task
% Note Mathworks warns that the maxNumCompThreads function will
% be removed in future versions of MATLAB.
% Use only thread-based parallel processing
initial_matlab_max_ncores = maxNumCompThreads(min([nthreads,slurm_ncores_per_task]));
disp(['Using a maximum of ',num2str(maxNumCompThreads()),' computational threads.'])
tic
for i = 1:nmatrices
    e=eig(rand(matrix_size));
    eigenvalues(i) = max(abs(e));
end
toc
save(results_file,'eigenvalues','-ascii')
quit
end
</syntaxhighlight>
Here is a job script, eig_thread_test.slurm, that can be used to run the eig_thread_test.m code.
The number of threads used for the calculation is controlled by specifying the --cpus-per-task parameter that Slurm uses to control the number of CPU cores assigned to the job.
<syntaxhighlight lang=bash>
#!/bin/bash
#SBATCH --time=01:00:00 # Adjust this to match the walltime of your job
#SBATCH --nodes=1 # Always specify just one node.
#SBATCH --ntasks=1 # Specify just one task.
#SBATCH --cpus-per-task=8 # The number of threads to use
#SBATCH --mem=4000m # Adjust to match total memory required, in MB.
# Sample batch job script for running a MATLAB function to test thread-based parallel processing features.
# 2020-05-25
# Specify the name of the main MATLAB function to be run.
# This would normally be the same as the MATLAB source code file name (without a .m suffix).
MAIN="eig_thread_test"
# Define key parameters for the example calculation.
NTHREADS=${SLURM_CPUS_PER_TASK}
MATRIX_SIZE=10000
NMATRICES=10
RESULTS_FILE="maximum_eigenvalues_${SLURM_JOB_ID}.txt"
# Construct a complete function call to pass to MATLAB
# Note, string arguments should appear to MATLAB enclosed in single quotes
ARGS="($NTHREADS,$MATRIX_SIZE,$NMATRICES,'$RESULTS_FILE')"
MAIN_WITH_ARGS=${MAIN}${ARGS}
echo "Calling MATLAB function: ${MAIN_WITH_ARGS}"
echo "Starting run at $(date)"
echo "Running on compute node $(hostname)"
echo "Running from directory $(pwd)"
# Choose a version of MATLAB by loading a module:
module load matlab/r2020a
echo "Using MATLAB version: $(which matlab)"
matlab -batch "${MAIN_WITH_ARGS}" > ${MAIN}_${SLURM_JOB_ID}.out 2>&1
echo "Finished run at $(date)"
</syntaxhighlight>
The above job can be submitted with
<pre>
sbatch eig_thread_test.slurm
</pre>
If assigned to one of the modern partitions (as opposed to the older single, lattice or parallel partitions), the job took about 16 minutes, about 4 times faster than a comparable serial job. Using 8 cores to obtain only a factor-of-four speed-up is not an efficient use of ARC, but might be justified in some cases. Using multiple workers (as discussed in the next section) may sometimes be faster than using the same number of cores with thread-based parallelization.
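The efficiency remark can be made concrete with a short calculation. This sketch assumes the usual definition of parallel efficiency (serial time divided by the product of parallel time and core count); the 64-minute serial time is not a measured value but is inferred from the "about 4 times faster" statement, and the function name is hypothetical.

```shell
#!/bin/bash
# Parallel efficiency in percent: serial time / (parallel time * cores).
# Integer arithmetic only, so the result is truncated to a whole percent.
parallel_efficiency() {
    serial_minutes=$1
    parallel_minutes=$2
    cores=$3
    echo $(( (serial_minutes * 100) / (parallel_minutes * cores) ))
}

# 16 minutes on 8 cores is quoted in the text; 64 minutes for the serial
# run is an assumption derived from the factor-of-four speed-up.
echo "Efficiency: $(parallel_efficiency 64 16 8)%"
```

An efficiency of roughly 50% means that about half of the reserved core time did no useful work, which is why independent serial jobs are often the better choice.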
=== Explicit parallel processing using a pool of workers ===
Now consider the more complicated case of creating a pool of workers and using a parfor loop to explicitly parallelize a section of code. Each worker may use one or more cores through MATLAB's internal thread-based parallelization, as in the preceding example. Suppose the following code is in a file eig_parallel_test.m.
<syntaxhighlight lang=matlab>
function eig_parallel_test(nworkers,nthreads,matrix_size,nmatrices,results_file)
% Calculate the absolute value of the maximum eigenvalue for each of a number of matrices
% possibly using multiple threads and multiple MATLAB workers.
% nworkers = number of MATLAB workers to use.
% nthreads = number of threads per worker.
% matrix_size = order of two-dimensional random matrix.
% nmatrices = number of matrices to process.
% results_file = name of file in which to save the maximum eigenvalues
% 2020-05-25
matlab_ncores=feature('numcores')
slurm_ncores_per_task=str2num(getenv('SLURM_CPUS_PER_TASK'))
if(isempty(slurm_ncores_per_task))
    slurm_ncores_per_task=1;
    disp('SLURM_CPUS_PER_TASK not set')
end
% Set number of computational threads to the minimum of nthreads and slurm_ncores_per_task
% Note Mathworks warns that the maxNumCompThreads function will
% be removed in future versions of MATLAB.
% Testing based on remarks at
% https://www.mathworks.com/matlabcentral/answers/158192-maxnumcompthreads-hyperthreading-and-parpool
% shows the maxNumCompThreads has to be called inside the parfor loop.
tic
if ( nworkers > 1 )
    % Process with multiple workers
    % Check on properties of the local MATLAB cluster.
    % One can set properties such as c.NumThreads and c.NumWorkers
    parallel.defaultClusterProfile('local')
    c = parcluster()
    c.NumThreads=nthreads
    c.NumWorkers=nworkers
    % Create a pool of workers with the current cluster settings.
    % Note, testing without the nworkers argument showed a limit of 12 workers even if c.NumWorkers is defined.
    parpool(c,nworkers)
    ticBytes(gcp);
    parfor i = 1:nmatrices
        e=eig(rand(matrix_size));
        eigenvalues(i) = max(abs(e));
    end
    tocBytes(gcp)
    % Close down the pool.
    delete(gcp('nocreate'));
else
    % Use only thread-based parallel processing
    initial_matlab_max_ncores = maxNumCompThreads(min([nthreads,slurm_ncores_per_task]))
    for i = 1:nmatrices
        e=eig(rand(matrix_size));
        eigenvalues(i) = max(abs(e));
    end
end % nworkers test
toc
save(results_file,'eigenvalues','-ascii')
quit
end
</syntaxhighlight>
Of particular note in the preceding example is the section of lines (copied below) that creates a cluster object, c, and modifies the number of threads associated with this object (c.NumThreads=nthreads). In a similar way, one can modify the number of workers (c.NumWorkers=nworkers). Testing showed that if one then used the MATLAB gcp or parpool commands without arguments to create a pool of workers, at most 12 workers were created. However, it was found that by using parpool(c,nworkers), the requested number of workers would be started, even if nworkers > 12.
<syntaxhighlight lang=matlab>
parallel.defaultClusterProfile('local')
c = parcluster()
c.NumThreads=nthreads
c.NumWorkers=nworkers
parpool(c,nworkers)
</syntaxhighlight>
An example Slurm batch job script, eig_parallel_test.slurm, used to test the above code was:
<syntaxhighlight lang=bash>
#!/bin/bash
#SBATCH --time=03:00:00 # Adjust this to match the walltime of your job
#SBATCH --nodes=1 # Always specify just one node.
#SBATCH --ntasks=1 # Always specify just one task.
#SBATCH --cpus-per-task=8 # Choose --cpus-per-task to match the number of workers * threads per worker
#SBATCH --mem=10000m # Adjust to match total memory required, in MB.
# Sample batch job script for running a MATLAB function to test parallel processing features
# 2020-05-25
# Specify the name of the main MATLAB function to be run.
# This would normally be the same as the MATLAB source code file name (without a .m suffix).
MAIN="eig_parallel_test"
# Define key parameters for the example calculation.
NWORKERS=${SLURM_NTASKS}
NTHREADS=${SLURM_CPUS_PER_TASK}
MATRIX_SIZE=10000
NMATRICES=10
RESULTS_FILE="maximum_eigenvalues_${SLURM_JOB_ID}.txt"
# Construct a complete function call to pass to MATLAB
# Note, string arguments should appear to MATLAB enclosed in single quotes
ARGS="($NWORKERS,$NTHREADS,$MATRIX_SIZE,$NMATRICES,'$RESULTS_FILE')"
MAIN_WITH_ARGS=${MAIN}${ARGS}
echo "Calling MATLAB function: ${MAIN_WITH_ARGS}"
echo "Starting run at $(date)"
echo "Running on compute node $(hostname)"
echo "Running from directory $(pwd)"
# Choose a version of MATLAB by loading a module:
module load matlab/r2020a
echo "Using MATLAB version: $(which matlab)"
matlab -batch "${MAIN_WITH_ARGS}" > ${MAIN}_${SLURM_JOB_ID}.out 2>&1
echo "Finished run at $(date)"
</syntaxhighlight>
Note that the Slurm SBATCH parameter --cpus-per-task, the total number of cores to use, should be the product of the number of workers and the threads per worker. (It might be argued that one more core should be requested beyond the product of workers and threads, to use for the main MATLAB process, but for a fully parallelized code the workers do the great bulk of the calculation and the main MATLAB process uses relatively little CPU time.)
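A job script could check this relationship before starting MATLAB. The sketch below uses hypothetical function names, <code>required_cpus</code> and <code>check_cpus</code>, and assumes that SLURM_CPUS_PER_TASK should be treated as 1 when Slurm has not set it.

```shell
#!/bin/bash
# Hypothetical helpers: compare the cores reserved through --cpus-per-task
# with the product of workers and threads per worker.
required_cpus() {
    echo $(( $1 * $2 ))
}

check_cpus() {
    needed=$(required_cpus "$1" "$2")
    reserved=${SLURM_CPUS_PER_TASK:-1}   # assume 1 core if the variable is unset
    if [ "$reserved" -ne "$needed" ]; then
        echo "Mismatch: $1 workers x $2 threads need $needed cores, $reserved reserved"
        return 1
    fi
    echo "OK: $needed cores reserved"
}

echo "4 workers x 2 threads need $(required_cpus 4 2) cores"
```

Calling <code>check_cpus "$NWORKERS" "$NTHREADS" || exit 1</code> near the top of a job script would stop a mismatched job before it spends time starting MATLAB.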
Variations on the above code were tested with many combinations of workers and threads per worker. It was found that if many jobs were started in close succession, some of the jobs failed to start properly. The problem can be avoided by introducing a short delay between job submissions, as illustrated in the section on job arrays below.
Also note that in the above example, the number of workers is not well matched to the number of matrices. If there are 8 workers processing 10 matrices and each worker gets assigned one matrix, there are two left over. The total time for 10 matrices (or for any number of matrices from 9 to 16) would be about double that for 8 matrices. For example, in testing the above code, the time for processing 10 matrices with 8 workers with just one computational thread was 1050 seconds, whereas processing just 8 matrices with 8 workers took only 560 seconds.
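The doubling follows from a simple count of processing rounds. As a rough model, assuming every matrix takes about the same time and each worker handles one matrix per round, the number of rounds is the matrix count divided by the worker count, rounded up. The function name below is hypothetical.

```shell
#!/bin/bash
# Rough model: rounds = ceiling(nmatrices / nworkers), using integer arithmetic.
rounds_needed() {
    nmatrices=$1
    nworkers=$2
    echo $(( (nmatrices + nworkers - 1) / nworkers ))
}

for n in 8 10 16 17; do
    echo "$n matrices on 8 workers: $(rounds_needed "$n" 8) round(s)"
done
```

Choosing a matrix count that is a multiple of the worker count (or a worker count that divides the matrix count) avoids a final round in which most workers sit idle.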
=== Job-based parallel processing ===
Slurm provides a feature called job arrays that can sometimes be conveniently used to submit a large number of similar jobs. This can be used effectively when doing a parameter sweep, in which one key value (or several) in the code is changed from run to run. For example, in the eigenvalue calculation code considered in the previous section, one may want to study the effect of changing the matrix size. Alternatively, there may be cases in which the only change from one job to another is the name of an input file that contains the data to be processed.
The key to using the job array feature is to set up the code to depend on a single integer value, the job array index, $SLURM_ARRAY_TASK_ID. When a job is run, the array index variable is replaced by Slurm with a specific value, taken from a corresponding --array argument on the sbatch command line. For example, if the job script shown below is called eig_parallel_array_test.slurm and you run the script as
<pre>
sbatch --array=1000,2000,3000 eig_parallel_array_test.slurm
</pre>
then three jobs will be run, with $SLURM_ARRAY_TASK_ID taking on the values 1000, 2000 and 3000 for the three cases, respectively.
If you had a case in which input files to be processed were data_1.in, data_2.in, ... data_100.in, you could submit 100 jobs to each process one data file with
<pre>
sbatch --array=1-100 script_to_process_one_file.slurm
</pre>
where the job script refers to the files as data_${SLURM_ARRAY_TASK_ID}.in. (Putting the curly brackets around the variable name keeps the bash shell from confusing the variable name with other text in the file name.)
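The effect of the curly brackets is easiest to see when the variable is followed by characters that are legal in a variable name. In this sketch the array index is set by hand for illustration (Slurm sets it in a real array job), and the result_..._out.txt file name is a made-up example.

```shell
#!/bin/bash
# Set by hand here; in a real array job Slurm provides this value.
SLURM_ARRAY_TASK_ID=7
# Make sure the accidentally referenced variable below really is unset.
unset SLURM_ARRAY_TASK_ID_out

# With braces, the variable name ends where the closing bracket says it does.
input_file="data_${SLURM_ARRAY_TASK_ID}.in"
good_output="result_${SLURM_ARRAY_TASK_ID}_out.txt"

# Without braces, bash reads the name as SLURM_ARRAY_TASK_ID_out,
# which is unset, so the index silently disappears.
bad_output="result_$SLURM_ARRAY_TASK_ID_out.txt"

echo "input:       $input_file"
echo "with braces: $good_output"
echo "no braces:   $bad_output"
```

Because the failure is silent, every array task would write to the same file name in the last case.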
Testing shows that starting large numbers of jobs in a short period of time, such as can occur when job arrays are used and there are many free compute nodes, can lead to failures. This kind of error appears to be related to delayed file system access when MATLAB tries to write job information to a hidden subdirectory under your home directory. One way to avoid this problem is to introduce a delay between job submissions. This can be done by using a loop containing a sleep command:
<syntaxhighlight lang=bash>
for job in $(seq 1 100); do
    sbatch --array=$job script_to_process_one_file.slurm
    sleep 5
done
</syntaxhighlight>
The "sleep 5" causes a 5-second delay between each job submission.
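Before submitting a long series of jobs this way, it may be worth previewing the commands. The following sketch wraps the same loop in a function with a dry-run switch; the function name <code>submit_with_delay</code> and the DRY_RUN variable are hypothetical conveniences, not Slurm features.

```shell
#!/bin/bash
# Hypothetical wrapper: submit one array index at a time with a pause between
# submissions. With DRY_RUN=1 the sbatch commands are printed, not executed,
# and no delay is applied.
submit_with_delay() {
    script=$1
    first=$2
    last=$3
    delay=$4
    job=$first
    while [ "$job" -le "$last" ]; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sbatch --array=$job $script"
        else
            sbatch --array="$job" "$script"
            sleep "$delay"
        fi
        job=$((job + 1))
    done
}

# Preview the first three submissions without contacting Slurm.
DRY_RUN=1 submit_with_delay script_to_process_one_file.slurm 1 3 5
```

Running the same call without DRY_RUN=1 performs the real submissions with the requested delay.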
One other caveat regarding the use of job arrays is that the resources requested with the SBATCH directives are for a single instance of the calculation to be completed. So, if the resource requirements are expected to differ significantly from one job to another, you may have to write separate job scripts for each case or group of cases, rather than using a single job array script.
Here is an example of a job script to calculate the maximum eigenvalue of a number of random matrices of a given size.
<syntaxhighlight lang=bash>
#!/bin/bash
#SBATCH --time=01:00:00 # Adjust this to match the walltime of your job
#SBATCH --nodes=1 # Always specify just one node.
#SBATCH --ntasks=1 # Always specify just one task.
#SBATCH --cpus-per-task=1 # For serial code, always specify just one CPU per task.
#SBATCH --mem=4000m # Adjust to match total memory required, in MB.
# Sample batch job script for running a MATLAB function to illustrate job-based parallel processing with job arrays
# 2020-05-27
# The order of the matrix is specified using the job array index.
# Specify the name of the main MATLAB function to be run.
# This would normally be the same as the MATLAB source code file name (without a .m suffix).
MAIN="eig_parallel_test"
# Define key parameters for the example calculation.
NWORKERS=1
NTHREADS=1
MATRIX_SIZE=$SLURM_ARRAY_TASK_ID
NMATRICES=10
RESULTS_FILE="maximum_eigenvalues_${MATRIX_SIZE}_${SLURM_JOB_ID}.txt"
# Construct a complete function call to pass to MATLAB
# Note, string arguments should appear to MATLAB enclosed in single quotes
ARGS="($NWORKERS,$NTHREADS,$MATRIX_SIZE,$NMATRICES,'$RESULTS_FILE')"
MAIN_WITH_ARGS=${MAIN}${ARGS}
echo "Calling MATLAB function: ${MAIN_WITH_ARGS}"
echo "Starting run at $(date)"
echo "Running on compute node $(hostname)"
echo "Running from directory $(pwd)"
# Choose a version of MATLAB by loading a module:
module load matlab/r2020a
echo "Using MATLAB version: $(which matlab)"
matlab -batch "${MAIN_WITH_ARGS}" > ${MAIN}_${SLURM_JOB_ID}.out 2>&1
echo "Finished run at $(date)"
</syntaxhighlight>
= Standalone Applications =
When running MATLAB code as described in the preceding sections, a connection to the campus MATLAB license server, checking out licenses for MATLAB and any specialized toolboxes needed, is made for each job that is submitted. Currently, with the University of Calgary's Total Academic Headcount license, there are sufficient license tokens to support thousands of simultaneous MATLAB sessions (although ARC usage policy and cluster load will limit individual users to smaller numbers of jobs). However, there may be times at which the license server is slow to respond when large numbers of requests are being handled, or the server may be unavailable temporarily due to network problems. MathWorks offers an alternative way of running MATLAB code that can avoid license server issues by compiling it into a standalone application. A license is required only during the compilation process and not when the code is run. This allows calculations to be run on ARC without concerns regarding the license server. The compiled code can also be run on compatible (64-bit Linux) hardware, not necessarily at the University of Calgary, such as on [https://docs.computecanada.ca/wiki/Getting_started Compute Canada (external link)] clusters.
== Creating a standalone application ==
The MATLAB '''mcc''' command is used to compile source code (.m files) into a standalone executable. There are a couple of important considerations to keep in mind when creating an executable that can be run in a batch-oriented cluster environment. One is that there is no graphical display attached to your session and another is that the number of threads used by the standalone application has to be controlled. There is also an important difference in the way arguments of the main function are handled.
Let's illustrate the process of creating a standalone application for the sawtooth.m code used previously. Unfortunately, if that code is compiled as it is, the resulting compiled application will fail to run properly. The reason is that the compiled code sees all the input arguments as strings instead of interpreting them as numbers. To work around this problem, use a MATLAB function, '''isdeployed''', to determine whether or not the code is being run as a standalone application. Here is a modified version of the code, called sawtooth_standalone.m, that can be successfully compiled and run as a standalone application.
<syntaxhighlight lang=matlab>
function sawtooth_standalone(nterms,nppcycle,ncycle,pngfilebase)
% MATLAB file example to approximate a sawtooth
% with a truncated Fourier expansion.
% nterms = number of terms in expansion.
% nppcycle = number of points per cycle.
% ncycle = number of complete cycles to plot.
% pngfilebase = base of file name for graph of results.
% 2020-05-21
% Test to see if the code is running as a standalone application
% If it is, convert the arguments intended to be numeric from
% the input strings to numbers
if isdeployed
    nterms=str2num(nterms)
    nppcycle=str2num(nppcycle)
    ncycle=str2num(ncycle)
end
np=nppcycle*ncycle;
fourbypi=4.0/pi;
y(1:np)=pi/2.0;
x(1:np)=linspace(-pi*ncycle,pi*ncycle,np);
for k=1:nterms
    twokm=2*k-1;
    y=y-fourbypi*cos(twokm*x)/twokm^2;
end
% Prepare output
% Construct the output file name from the base file name and number of terms
% Also append the Slurm JOBID to keep file names unique from run to run.
job=getenv('SLURM_JOB_ID')
pngfile=strcat(pngfilebase,'_',num2str(nterms),'_',job)
disp(['Writing file: ',pngfile,'.png'])
fig=figure;
plot(x,y);
print(fig,pngfile,'-dpng');
quit
end
</syntaxhighlight>
Suppose that the sawtooth_standalone.m file is in a subdirectory src below your current working directory and that the compiled files are going to be written to a subdirectory called deploy. The following commands (at the Linux shell prompt) could be used to compile the code:
<pre>
$ mkdir deploy
$ cd src
$ module load matlab/r2019b
$ mcc -R -nodisplay \
 -R -singleCompThread \
 -m -v -w enable \
 -d ../deploy \
 sawtooth_standalone.m
</pre>
Note the option -singleCompThread has been included in order to limit the executable to just one computational thread.
In the deploy directory, an executable, sawtooth_standalone, will be created along with a script, run_sawtooth_standalone.sh. These two files should be copied to the target machine where the code is to be run.
== Running a standalone application ==
After the standalone executable sawtooth_standalone and corresponding script run_sawtooth_standalone.sh have been transferred to a directory on the target system on which they will be run (whether to a different directory on ARC or to a completely different cluster), a batch job script needs to be created in that same directory. Here is an example batch job script, sawtooth_standalone.slurm, appropriate for the ARC cluster.
<syntaxhighlight lang=bash>
#!/bin/bash
#SBATCH --time=03:00:00 # Adjust this to match the walltime of your job
#SBATCH --nodes=1 # For serial code, always specify just one node.
#SBATCH --ntasks=1 # For serial code, always specify just one task.
#SBATCH --cpus-per-task=1 # For serial code, always specify just one CPU per task.
#SBATCH --mem=4000m # Adjust to match total memory required, in MB.
# Sample batch job script for running a compiled MATLAB function
# 2020-05-21
# Specify the name of the compiled MATLAB standalone executable
MAIN="sawtooth_standalone"
# Define key parameters for the example calculation.
NTERMS=100
NPPCYCLE=20
NCYCLE=3
PNGFILEBASE=$MAIN
ARGS="$NTERMS $NPPCYCLE $NCYCLE $PNGFILEBASE"
# Choose the MCR directory according to the compiler version used
MCR=/global/software/matlab/mcr/v97
echo "Starting run at $(date)"
echo "Running on compute node $(hostname)"
echo "Running from directory $(pwd)"
./run_${MAIN}.sh $MCR $ARGS > ${MAIN}_${SLURM_JOB_ID}.out 2>&1
echo "Finished run at $(date)"
</syntaxhighlight>
The job is then submitted with sbatch:
<pre>
$ sbatch sawtooth_standalone.slurm
</pre>
An important part of the above script is the definition of the variable '''MCR''', which defines the location of the MATLAB Compiler Runtime (MCR) directory. This directory contains files necessary for the standalone application to run. The version of the MCR files specified (v97 in the example, which corresponds to MATLAB R2019b) must match the version of MATLAB used to compile the code.
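To reduce the chance of a mismatch, the MCR directory could be looked up from the release used for compilation. This sketch uses a hypothetical function name, <code>mcr_dir_for_release</code>, and covers only the releases tabulated in this section under /global/software/matlab/mcr; any other release is reported as unknown rather than guessed.

```shell
#!/bin/bash
# Hypothetical lookup: map a MATLAB release to its MCR directory on ARC,
# using only the releases listed in the table in this section.
mcr_dir_for_release() {
    case "$1" in
        R2017a) version=v92 ;;
        R2017b) version=v93 ;;
        R2018a) version=v94 ;;
        R2019b) version=v97 ;;
        R2020a) version=v98 ;;
        *)
            echo "Unknown release: $1" >&2
            return 1
            ;;
    esac
    echo "/global/software/matlab/mcr/$version"
}

echo "MCR for R2019b: $(mcr_dir_for_release R2019b)"
```

In a job script, <code>MCR=$(mcr_dir_for_release R2019b) || exit 1</code> would set the variable and stop the job early if the release is not recognized.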
A list of MATLAB distributions and the corresponding MCR versions is given on the [https://www.mathworks.com/products/compiler/matlab-runtime.html Mathworks web site (external link)]. Some versions installed on ARC are listed below, along with the corresponding installation directory to which the MCR variable should be set if running on ARC. (As of this writing on May 21, 2020, installation of release R2020a has not quite been finished, but should be ready before the end of the month.) If the MCR version you need does not appear in /global/software/matlab/mcr, write to support@hpc.ucalgary.ca to request that it be installed, or use a different version of MATLAB for your compilation.
{| class="WG_Indent3" style="height: 150px; width: 576px" border="2"
| style="text-align: center" | '''MATLAB Release'''
| style="text-align: center" | '''MCR Version'''
| style="text-align: center" | '''MCR directory'''
|-
| style="text-align: center" | R2017a
| style="text-align: center" | 9.2
| class="WG_Inline_Code" style="text-align: center" | /global/software/matlab/mcr/v92
|-
| style="text-align: center" | R2017b
| style="text-align: center" | 9.3
| class="WG_Inline_Code" style="text-align: center" | /global/software/matlab/mcr/v93
|-
| style="text-align: center" | R2018a
| style="text-align: center" | 9.4
| class="WG_Inline_Code" style="text-align: center" | /global/software/matlab/mcr/v94
|-
| style="text-align: center" | R2019b
| style="text-align: center" | 9.7
| class="WG_Inline_Code" style="text-align: center" | /global/software/matlab/mcr/v97
|-
| style="text-align: center" | R2020a
| style="text-align: center" | 9.8
| class="WG_Inline_Code" style="text-align: center" | /global/software/matlab/mcr/v98
|}
=== Possible issues when running a standalone application ===
When MATLAB code is executed as a standalone application, a collection of shared objects (*.so files) that it depends on is unpacked into a cache directory. These are used to supplement the compiled code at run time. If you are running only a single MATLAB standalone process at a time, this won't cause any problems. However, when multiple jobs run at the same time in a cluster environment, it has been observed that use of a single MCR cache can become an issue and lead to job failure with warning messages of the form "Could not access the MATLAB Runtime component cache. Details: fl:filesystem:SystemError; component cache root:; componentname: ..."
By default, the MCR cache is created in ~/.mcrCacheX.X, but you can use the MCR_CACHE_ROOT environment variable to create the cache for a given process in a different location. If cache locking is the problem causing these failures, this can be resolved by writing the cache to a unique location for each job. One approach is to use the pre-existing /scratch directories for each job. This can be done by adding a couple of lines to the beginning and end of your script:
<syntaxhighlight lang=bash>
#!/bin/bash
...original request...
export MCR_CACHE_ROOT=/scratch/$SLURM_JOB_ID/.mcrCache
mkdir -p /scratch/$SLURM_JOB_ID/.mcrCache
...original script...
rm -R /scratch/$SLURM_JOB_ID/.mcrCache
</syntaxhighlight>
This is likely to resolve your problem. However, if you have issues with file system performance for reading and writing files, you may need to use /tmp instead of /scratch. It is important to clean up after yourself in both directories as there are quotas on both, but it is much more important in /tmp as the quota there is comparatively quite small.
= Support =
Please send any questions regarding using MATLAB on ARC to support@hpc.ucalgary.ca.
= Links =
[[ARC Software pages]]
[[Category:MATLAB]]
[[Category:Software]]
[[Category:ARC]]
{{Navbox ARC}}
Latest revision as of 19:36, 18 October 2023
Introduction
MATLAB is a general-purpose high-level programming package for numerical work such as linear algebra, signal processing and other calculations involving matrices or vectors of data. Visualization tools are also included for presentation of results. The basic MATLAB package is extended through add-on components including SIMULINK, and the Image Processing, Optimization, Neural Network, Signal Processing, Statistics and Wavelet Toolboxes, among others.
The main purpose of this page is to show how to use MATLAB on the University of Calgary ARC (Advanced Research Computing) cluster. It is presumed that you already have an account on ARC and have read the material on reserving resources and running jobs with the Slurm job management system.
Ways to run MATLAB - License considerations
At the University of Calgary, Information Technologies has purchased a MATLAB Total Academic Headcount license that allows installation and use of MATLAB on central clusters, such as ARC, as well as on personal workstations throughout the University. Potentially thousands of instances of MATLAB can be run simultaneously, each checking out a license from a central license server. An alternative is to compile MATLAB code into a standalone application. When such an application is run, it does not need to contact the server for a license. This allows researchers to run their calculations on compatible hardware, not necessarily at the University of Calgary, such as on Compute Canada clusters (external link).
For information about installing MATLAB on your own computer, see the Information Technologies Knowledge Base article on MATLAB.
Running MATLAB from command line
Before you begin using MATLAB on a cluster, you have to know how to run your MATLAB programs (codes) from the command line.
The MATLAB code you want to run, say my_code.m, must define a function with the same name and should end with the quit; end commands
function my_code(arg1, arg2, arg3)
.....
.....
quit
end
to make sure that MATLAB exits at the end of the computation.
You can run such code from the command line with a command:
$ matlab -batch "my_code(1, 2, 3)"
Note that the function name is given without the .m extension.
If the function does not take any arguments, then this command will work too:
$ matlab -batch my_code
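The call string passed to -batch can also be assembled in a shell script. The following is a minimal sketch, assuming a hypothetical helper named matlab_call (not part of MATLAB or ARC) that drops the .m extension from the file name and joins the remaining arguments with commas:

```shell
#!/bin/bash
# Hypothetical helper (not part of MATLAB): build the string for "matlab -batch"
# from a source file name and an optional list of arguments.
matlab_call() {
    local file="$1"; shift
    local name
    name="$(basename "$file" .m)"   # my_code.m -> my_code
    if [ "$#" -eq 0 ]; then
        # No arguments: the bare function name is enough.
        printf '%s\n' "$name"
        return
    fi
    local args="$1"; shift
    local a
    for a in "$@"; do
        args="$args, $a"
    done
    printf '%s(%s)\n' "$name" "$args"
}

matlab_call my_code.m 1 2 3   # prints: my_code(1, 2, 3)
```

The result could then be used as matlab -batch "$(matlab_call my_code.m 1 2 3)". String arguments would still need their own single quotes, since the helper does not add them.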
Running MATLAB on the ARC cluster
Although it is possible to run MATLAB interactively, the expectation is that most calculations with MATLAB will be completed by submitting a batch job script to the Slurm job scheduler with the sbatch command.
For many researchers, the main reason for using ARC for MATLAB-based calculations is to be able to run many instances at the same time. It is recommended in such cases that any parallel processing features be removed from the code and each instance of MATLAB be run on a single CPU core. It is also possible to run MATLAB on multiple cores in an attempt to speed up individual instances, but this generally results in a less efficient use of the cluster hardware. In the sections that follow, serial and then parallel processing examples are shown.
Serial MATLAB example
For the purposes of illustration, suppose the following serial MATLAB code, in a file sawtooth.m, is to be run. If your code does not already have them, add a function statement at the beginning and matching end statement at the end as shown in the example. Other features of this example include calling a function with both numerical and string arguments, incorporating a Slurm environment variable into the MATLAB code and producing graphical output in a non-interactive environment.
function sawtooth(nterms,nppcycle,ncycle,pngfilebase)
% MATLAB file example to approximate a sawtooth
% with a truncated Fourier expansion.
% nterms = number of terms in expansion.
% nppcycle = number of points per cycle.
% ncycle = number of complete cycles to plot.
% pngfilebase = base of file name for graph of results.
% 2020-05-14
np=nppcycle*ncycle;
fourbypi=4.0/pi;
y(1:np)=pi/2.0;
x(1:np)=linspace(-pi*ncycle,pi*ncycle,np);
for k=1:nterms
twokm=2*k-1;
y=y-fourbypi*cos(twokm*x)/twokm^2;
end
% Prepare output
% Construct the output file name from the base file name and number of terms
% Also append the Slurm JOBID to keep file names unique from run to run.
job=getenv('SLURM_JOB_ID')
pngfile=strcat(pngfilebase,'_',num2str(nterms),'_',job)
disp(['Writing file: ',pngfile,'.png'])
fig=figure;
plot(x,y);
print(fig,pngfile,'-dpng');
quit
end
In preparation to run the sawtooth.m code, create a batch job script, sawtooth.slurm, of the form:
#!/bin/bash
# ==============================================================================
#SBATCH --time=03:00:00 # Adjust this to match the walltime of your job
#SBATCH --nodes=1 # For serial code, always specify just one node.
#SBATCH --ntasks=1 # For serial code, always specify just one task.
#SBATCH --cpus-per-task=1 # For serial code, always specify just one CPU per task.
#SBATCH --mem=4000m # Adjust to match total memory required, in MB.
# ==============================================================================
# Sample batch job script for running a MATLAB function with both numerical and string arguments
module load matlab/r2019b
# Use -singleCompThread below for serial MATLAB code:
# Function call: sawtooth(NTerms, NPPCycles, NCycle, PNGFileBase)
matlab -singleCompThread -batch "sawtooth(100,20,3,'sawtooth')"
# ==============================================================================
Note that the above script uses the -batch option on the matlab command line. The MathWorks web page on running MATLAB on Linux (external link) recommends, starting with Release 2019a of MATLAB, using the -batch option for non-interactive use instead of the similar -r option, which is recommended for interactive sessions.
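For scripts that may have to work with more than one MATLAB release, the choice between the two options can be made from the release name. This is only a sketch: the helper name matlab_run_option is hypothetical, and it assumes release names written in the style of the module names used on this page (such as r2019b).

```shell
#!/bin/bash
# Hypothetical helper: choose the command-line option for non-interactive runs
# from a release name such as r2018b or r2019a.
matlab_run_option() {
    local release="${1#[rR]}"      # r2019a -> 2019a
    local year="${release%[ab]}"   # 2019a  -> 2019
    if [ "$year" -ge 2019 ]; then
        echo "-batch"              # recommended from Release 2019a onward
    else
        echo "-r"                  # older releases
    fi
}

matlab_run_option r2019b   # prints: -batch
```

With -r, MATLAB does not exit on its own when the command finishes, which is one reason the example functions on this page end with quit.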
To submit the job to be executed, run:
sbatch sawtooth.slurm
The job should produce three output files: Slurm script output, MATLAB command output and a PNG file, all tagged with the Slurm Job ID.
Parallel MATLAB examples
MATLAB provides several ways of speeding up calculations through parallel processing. These include relying on internal parallelization, in which multiple threads are used, and using explicit language features, such as parfor, to start up multiple workers on a compute node. Examples of both approaches are shown below. Using multiple compute nodes for a single MATLAB calculation, which depends on the MATLAB Parallel Server product, is not considered here as there has not been sufficient demand to configure that software on ARC.
For many researchers, submitting many independent serial jobs is a better approach to efficient parallelization than using parallel programming in MATLAB itself. An example of job-based parallelism is also shown below.
Thread-based parallel processing
First consider an example using multiple cores with MATLAB's built-in thread-based parallelization.
Suppose the following code to calculate eigenvalues of a number of random matrices is in a file eig_thread_test.m. Note the use of the maxNumCompThreads function to control the number of threads (one thread per CPU core). For some years now, MathWorks has marked that function as deprecated, but it still provides a useful limit to ensure that MATLAB doesn't use more cores than assigned by Slurm.
function eig_thread_test(nthreads,matrix_size,nmatrices,results_file)
% Calculate the absolute value of the maximum eigenvalue for each of a number of matrices
% possibly using multiple threads.
% nthreads = number of computational threads to use.
% matrix_size = order of two-dimensional random matrix.
% nmatrices = number of matrices to process.
% results_file = name of file in which to save the maximum eigenvalues
% 2020-05-25
matlab_ncores=feature('numcores')
slurm_ncores_per_task=str2num(getenv('SLURM_CPUS_PER_TASK'))
if(isempty(slurm_ncores_per_task))
slurm_ncores_per_task=1;
disp('SLURM_CPUS_PER_TASK not set')
end
% Set number of computational threads to the minimum of nthreads and slurm_ncores_per_task
% Note MathWorks warns that the maxNumCompThreads function will
% be removed in future versions of MATLAB.
% Use only thread-based parallel processing
initial_matlab_max_ncores = maxNumCompThreads(min([nthreads,slurm_ncores_per_task]));
disp(['Using a maximum of ',num2str(maxNumCompThreads()),' computational threads.'])
tic
for i = 1:nmatrices
e=eig(rand(matrix_size));
eigenvalues(i) = max(abs(e));
end
toc
save(results_file,'eigenvalues','-ascii')
quit
end
Here is a job script, eig_thread_test.slurm that can be used to run the eig_thread_test.m code. The number of threads used for the calculation is controlled by specifying the --cpus-per-task parameter that Slurm uses to control the number of CPU cores assigned to the job.
#!/bin/bash
#SBATCH --time=01:00:00 # Adjust this to match the walltime of your job
#SBATCH --nodes=1 # Always specify just one node.
#SBATCH --ntasks=1 # Specify just one task.
#SBATCH --cpus-per-task=8 # The number of threads to use
#SBATCH --mem=4000m # Adjust to match total memory required, in MB.
# Sample batch job script for running a MATLAB function to test thread-based parallel processing features.
# 2020-05-25
# Specify the name of the main MATLAB function to be run.
# This would normally be the same as the MATLAB source code file name without a .m suffix.
MAIN="eig_thread_test"
# Define key parameters for the example calculation.
NTHREADS=${SLURM_CPUS_PER_TASK}
MATRIX_SIZE=10000
NMATRICES=10
RESULTS_FILE="maximum_eigenvalues_${SLURM_JOB_ID}.txt"
# Construct a complete function call to pass to MATLAB
# Note, string arguments should appear to MATLAB enclosed in single quotes
ARGS="($NTHREADS,$MATRIX_SIZE,$NMATRICES,'$RESULTS_FILE')"
MAIN_WITH_ARGS=${MAIN}${ARGS}
echo "Calling MATLAB function: ${MAIN_WITH_ARGS}"
echo "Starting run at $(date)"
echo "Running on compute node $(hostname)"
echo "Running from directory $(pwd)"
# Choose a version of MATLAB by loading a module:
module load matlab/r2020a
echo "Using MATLAB version: $(which matlab)"
matlab -batch "${MAIN_WITH_ARGS}" > ${MAIN}_${SLURM_JOB_ID}.out 2>&1
echo "Finished run at $(date)"
The above job can be submitted with
sbatch eig_thread_test.slurm
If assigned to one of the modern partitions (as opposed to the older single, lattice or parallel partitions), the job took about 16 minutes, about 4 times faster than a comparable serial job. Using 8 cores to obtain just a factor of four speed-up is not an efficient use of ARC but might be justified in some cases. Sometimes, using multiple workers (as discussed in the next section) may be faster than using the same number of cores with thread-based parallelization.
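The efficiency remark can be made concrete with a little arithmetic. The sketch below uses the approximate figures quoted in the text (8 cores, a speed-up of about 4, a run time of about 16 minutes); the serial run time of 64 minutes is derived from those figures rather than measured, and the function names are only for illustration.

```shell
#!/bin/bash
# Parallel efficiency in percent: 100 * speed-up / number of cores.
parallel_efficiency() {
    local speedup="$1" cores="$2"
    echo $(( 100 * speedup / cores ))
}

# Core-minutes consumed: wall time in minutes * number of cores.
core_minutes() {
    local minutes="$1" cores="$2"
    echo $(( minutes * cores ))
}

echo "Efficiency: $(parallel_efficiency 4 8)%"           # prints: Efficiency: 50%
echo "Threaded job: $(core_minutes 16 8) core-minutes"   # prints: Threaded job: 128 core-minutes
echo "Serial job: $(core_minutes 64 1) core-minutes"     # prints: Serial job: 64 core-minutes
```

Under these assumptions, the threaded run finishes sooner but consumes roughly twice the core time of the serial run.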
Explicit parallel processing using a pool of workers
Now consider the more complicated case of creating a pool of workers and using a parfor loop to explicitly parallelize a section of code. Each worker may use one or more cores through MATLAB's internal thread-based parallelization, as in the preceding example. Suppose the following code is in a file eig_parallel_test.m.
function eig_parallel_test(nworkers,nthreads,matrix_size,nmatrices,results_file)
% Calculate the absolute value of the maximum eigenvalue for each of a number of matrices
% possibly using multiple threads and multiple MATLAB workers.
% nworkers = number of MATLAB workers to use.
% nthreads = number of threads per worker.
% matrix_size = order of two-dimensional random matrix.
% nmatrices = number of matrices to process.
% results_file = name of file in which to save the maximum eigenvalues
% 2020-05-25
matlab_ncores=feature('numcores')
slurm_ncores_per_task=str2num(getenv('SLURM_CPUS_PER_TASK'))
if(isempty(slurm_ncores_per_task))
slurm_ncores_per_task=1;
disp('SLURM_CPUS_PER_TASK not set')
end
% Set number of computational threads to the minimum of nthreads and slurm_ncores_per_task
% Note MathWorks warns that the maxNumCompThreads function will
% be removed in future versions of MATLAB.
% Testing based on remarks at
% https://www.mathworks.com/matlabcentral/answers/158192-maxnumcompthreads-hyperthreading-and-parpool
% shows that maxNumCompThreads has to be called inside the parfor loop.
tic
if ( nworkers > 1 )
% Process with multiple workers
% Check on properties of the local MATLAB cluster.
% One can set properties such as c.NumThreads and c.NumWorkers
parallel.defaultClusterProfile('local')
c = parcluster()
c.NumThreads=nthreads
c.NumWorkers=nworkers
% Create a pool of workers with the current cluster settings.
% Note, testing without the nworkers argument showed a limit of 12 workers even if c.NumWorkers is defined.
parpool(c,nworkers)
ticBytes(gcp);
parfor i = 1:nmatrices
e=eig(rand(matrix_size));
eigenvalues(i) = max(abs(e));
end
tocBytes(gcp)
% Close down the pool.
delete(gcp('nocreate'));
else
% Use only thread-based parallel processing
initial_matlab_max_ncores = maxNumCompThreads(min([nthreads,slurm_ncores_per_task]))
for i = 1:nmatrices
e=eig(rand(matrix_size));
eigenvalues(i) = max(abs(e));
end
end % nworkers test
toc
save(results_file,'eigenvalues','-ascii')
quit
end
Of particular note in the preceding example is the section of lines (copied below) that creates a cluster object, c, and modifies the number of threads associated with this object (c.NumThreads=nthreads). In a similar way, one can modify the number of workers (c.NumWorkers=nworkers). Testing showed that if one then used the MATLAB gcp or parpool commands without arguments to create a pool of workers, at most 12 workers were created. However, it was found that by using parpool(c,nworkers), the requested number of workers would be started, even if nworkers > 12.
parallel.defaultClusterProfile('local')
c = parcluster()
c.NumThreads=nthreads
c.NumWorkers=nworkers
parpool(c,nworkers)
An example Slurm batch job script, eig_parallel_test.slurm, used to test the above code was:
#!/bin/bash
#SBATCH --time=03:00:00 # Adjust this to match the walltime of your job
#SBATCH --nodes=1 # Always specify just one node.
#SBATCH --ntasks=1 # Always specify just one task.
#SBATCH --cpus-per-task=8 # Choose --cpus-per-task to match the number of workers * threads per worker
#SBATCH --mem=10000m # Adjust to match total memory required, in MB.
# Sample batch job script for running a MATLAB function to test parallel processing features
# 2020-05-25
# Specify the name of the main MATLAB function to be run.
# This would normally be the same as the MATLAB source code file name without a .m suffix.
MAIN="eig_parallel_test"
# Define key parameters for the example calculation.
NWORKERS=${SLURM_NTASKS}
NTHREADS=${SLURM_CPUS_PER_TASK}
MATRIX_SIZE=10000
NMATRICES=10
RESULTS_FILE="maximum_eigenvalues_${SLURM_JOB_ID}.txt"
# Construct a complete function call to pass to MATLAB
# Note, string arguments should appear to MATLAB enclosed in single quotes
ARGS="($NWORKERS,$NTHREADS,$MATRIX_SIZE,$NMATRICES,'$RESULTS_FILE')"
MAIN_WITH_ARGS=${MAIN}${ARGS}
echo "Calling MATLAB function: ${MAIN_WITH_ARGS}"
echo "Starting run at $(date)"
echo "Running on compute node $(hostname)"
echo "Running from directory $(pwd)"
# Choose a version of MATLAB by loading a module:
module load matlab/r2020a
echo "Using MATLAB version: $(which matlab)"
matlab -batch "${MAIN_WITH_ARGS}" > ${MAIN}_${SLURM_JOB_ID}.out 2>&1
echo "Finished run at $(date)"
Note that the Slurm SBATCH parameter --cpus-per-task, the total number of cores to use, should be the product of the number of workers and the threads per worker. (It might be argued that one more core should be requested beyond the product of workers and threads, to use for the main MATLAB process, but for a fully parallelized code, the workers do the great bulk of the calculation and the main MATLAB process uses relatively little CPU time.)
Variations on the above code were tested with many combinations of workers and threads per worker. It was found that if many jobs were started in close succession, some of the jobs failed to start properly. The problem can be avoided by introducing a short delay between job submissions, as illustrated in the section on job arrays below.
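A job script can check this relationship before starting MATLAB. The following sketch assumes a hypothetical helper, check_core_budget, that compares the product of workers and threads per worker with the number of cores allocated:

```shell
#!/bin/bash
# Hypothetical helper: verify that workers * threads per worker fits
# within the number of cores allocated to the job.
check_core_budget() {
    local nworkers="$1" nthreads="$2" cpus="$3"
    local needed=$(( nworkers * nthreads ))
    if [ "$needed" -gt "$cpus" ]; then
        echo "Requested $needed cores but only $cpus are allocated" >&2
        return 1
    fi
    echo "$needed"
}

check_core_budget 4 2 8   # prints: 8
```

In a job script, the call might look like check_core_budget "$NWORKERS" "$NTHREADS" "${SLURM_CPUS_PER_TASK:-1}" || exit 1, so that a mismatched request stops the job before any MATLAB workers are started.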
Also note that in the above example, the number of cores requested is not well matched to the number of workers. If there are 8 workers processing 10 matrices and each worker gets assigned one matrix, there are two left over. The total time for 10 matrices (or for any number of matrices from 9 to 16) would be about double that for 8 matrices. For example, in testing the above code, the time for processing 10 matrices with 8 workers with just one computational thread was 1050 seconds, whereas processing just 8 matrices with 8 workers took only 560 seconds.
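The doubling follows from counting how many rounds of work the slowest worker has to complete, which is the number of matrices divided by the number of workers, rounded up. A small sketch of that arithmetic (the function name rounds_needed is illustrative only, and the estimate assumes every matrix takes about the same time):

```shell
#!/bin/bash
# Number of rounds needed when nmatrices equal-sized tasks are shared
# among nworkers workers: ceil(nmatrices / nworkers).
rounds_needed() {
    local nmatrices="$1" nworkers="$2"
    echo $(( (nmatrices + nworkers - 1) / nworkers ))
}

for n in 8 10 16 17; do
    echo "$n matrices on 8 workers: $(rounds_needed "$n" 8) round(s)"
done
# prints:
# 8 matrices on 8 workers: 1 round(s)
# 10 matrices on 8 workers: 2 round(s)
# 16 matrices on 8 workers: 2 round(s)
# 17 matrices on 8 workers: 3 round(s)
```

Choosing the number of matrices as a multiple of the number of workers keeps all of the requested cores busy for the whole run.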
Job-based parallel processing
Slurm provides a feature called job arrays that can sometimes be conveniently used to submit a large number of similar jobs. This can be used effectively when doing a parameter sweep, in which one key value (or several) in the code are changed from run to run. For example, in the eigenvalue calculation code considered in the previous section, one may want to study the effect of changing the matrix size. Alternatively, there may be cases in which the only change from one job to another is the name of an input file that contains the data to be processed.
The key to using the job array feature is to set up the code to depend on a single integer value, the job array index, $SLURM_ARRAY_TASK_ID. When a job is run, the array index variable is replaced by Slurm with a specific value, taken from a corresponding --array argument on the sbatch command line. For example, if the job script shown below is called eig_parallel_array_test.slurm and you run the script as
sbatch --array=1000,2000,3000 eig_parallel_array_test.slurm
then three jobs will be run, with $SLURM_ARRAY_TASK_ID taking on the values 1000, 2000 and 3000 for the three cases, respectively.
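To preview which index values a given --array argument would produce before submitting anything, a small helper can expand the specification. This is an illustrative sketch only: expand_array_spec is a hypothetical function, and it handles just comma-separated values and simple ranges, not the other forms that Slurm accepts.

```shell
#!/bin/bash
# Hypothetical helper: list the index values described by a simple
# --array specification made of comma-separated values and a-b ranges.
expand_array_spec() {
    local spec="$1" part
    local IFS=','
    for part in $spec; do
        case "$part" in
            *-*) seq "${part%-*}" "${part#*-}" ;;   # range such as 1-100
            *)   echo "$part" ;;                    # single value such as 2000
        esac
    done
}

expand_array_spec 1000,2000,3000
# prints:
# 1000
# 2000
# 3000
```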
If you had a case in which input files to be processed were data_1.in, data_2.in, ... data_100.in, you could submit 100 jobs to each process one data file with
sbatch --array=1-100 script_to_process_one_file.slurm
where the job script refers to the files as data_${SLURM_ARRAY_TASK_ID}.in. (Putting curly brackets around the variable name keeps the bash shell from confusing it with other text in the file name.)
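Inside the job script, the file name can be built once and checked before MATLAB is started. The sketch below assumes a hypothetical helper, input_file_for_task, and uses a fallback index of 7 purely so that the lines can be tried outside of Slurm.

```shell
#!/bin/bash
# Hypothetical helper: map a job array index to its input file name.
input_file_for_task() {
    local task_id="$1"
    # The braces mark exactly where the variable name ends. They matter most
    # when the next character could be part of a name, as in run_${task_id}_raw.in,
    # where run_$task_id_raw.in would look up a variable called task_id_raw.
    echo "data_${task_id}.in"
}

# The fallback value is only for testing outside Slurm; inside a job array,
# Slurm sets SLURM_ARRAY_TASK_ID itself.
TASK_ID="${SLURM_ARRAY_TASK_ID:-7}"
INPUT_FILE="$(input_file_for_task "$TASK_ID")"
echo "Task $TASK_ID would process $INPUT_FILE"
```

A job script could follow this with a test such as [ -r "$INPUT_FILE" ] || exit 1 so that a missing data file fails quickly instead of inside MATLAB.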
Testing shows that starting large numbers of jobs in a short period of time, such as can occur when job arrays are used and there are many free compute nodes, can lead to failures. This kind of error appears to be related to delayed file system access when MATLAB tries to write job information to a hidden subdirectory under your home directory. One way to avoid this problem is to introduce a delay between job submissions. This can be done by using a loop containing a sleep command:
for job in $(seq 1 100); do
sbatch --array=$job script_to_process_one_file.slurm
sleep 5
done
The "sleep 5" causes a 5-second delay between each job submission.
One other caveat regarding the use of job arrays is that the resources requested with the SBATCH directives are for a single instance of the calculation to be completed. So, if the resource requirements are expected to differ significantly from one job to another, you may have to write separate job scripts for each case or group of cases, rather than using a single job array script.
Here is an example of a job script to calculate the maximum eigenvalue of a number of random matrices of a given size.
#!/bin/bash
#SBATCH --time=01:00:00 # Adjust this to match the walltime of your job
#SBATCH --nodes=1 # Always specify just one node.
#SBATCH --ntasks=1 # Always specify just one task.
#SBATCH --cpus-per-task=1 # For serial code, always specify just one CPU per task.
#SBATCH --mem=4000m # Adjust to match total memory required, in MB.
# Sample batch job script for running a MATLAB function to illustrate job-based parallel processing with job arrays
# 2020-05-27
# The order of the matrix is specified using the job array index.
# Specify the name of the main MATLAB function to be run.
# This would normally be the same as the MATLAB source code file name without a .m suffix.
MAIN="eig_parallel_test"
# Define key parameters for the example calculation.
NWORKERS=1
NTHREADS=1
MATRIX_SIZE=$SLURM_ARRAY_TASK_ID
NMATRICES=10
RESULTS_FILE="maximum_eigenvalues_${MATRIX_SIZE}_${SLURM_JOB_ID}.txt"
# Construct a complete function call to pass to MATLAB
# Note, string arguments should appear to MATLAB enclosed in single quotes
ARGS="($NWORKERS,$NTHREADS,$MATRIX_SIZE,$NMATRICES,'$RESULTS_FILE')"
MAIN_WITH_ARGS=${MAIN}${ARGS}
echo "Calling MATLAB function: ${MAIN_WITH_ARGS}"
echo "Starting run at $(date)"
echo "Running on compute node $(hostname)"
echo "Running from directory $(pwd)"
# Choose a version of MATLAB by loading a module:
module load matlab/r2020a
echo "Using MATLAB version: $(which matlab)"
matlab -batch "${MAIN_WITH_ARGS}" > ${MAIN}_${SLURM_JOB_ID}.out 2>&1
echo "Finished run at $(date)"
Standalone Applications
When running MATLAB code as described in the preceding sections, a connection to the campus MATLAB license server, checking out licenses for MATLAB and any specialized toolboxes needed, is made for each job that is submitted. Currently, with the University of Calgary's Total Academic Headcount license, there are sufficient license tokens to support thousands of simultaneous MATLAB sessions (although ARC usage policy and cluster load will limit individual users to smaller numbers of jobs). However, there may be times at which the license server is slow to respond when large numbers of requests are being handled, or the server may be unavailable temporarily due to network problems. MathWorks offers an alternative way of running MATLAB code that can avoid license server issues by compiling it into a standalone application. A license is required only during the compilation process and not when the code is run. This allows calculations to be run on ARC without concerns regarding the license server. The compiled code can also be run on compatible (64-bit Linux) hardware, not necessarily at the University of Calgary, such as on Compute Canada (external link) clusters.
Creating a standalone application
The MATLAB mcc command is used to compile source code (.m files) into a standalone executable. There are a couple of important considerations to keep in mind when creating an executable that can be run in a batch-oriented cluster environment. One is that there is no graphical display attached to your session and another is that the number of threads used by the standalone application has to be controlled. There is also an important difference in the way arguments of the main function are handled.
Let's illustrate the process of creating a standalone application for the sawtooth.m code used previously. Unfortunately, if that code is compiled as it is, the resulting compiled application will fail to run properly. The reason is that the compiled code sees all the input arguments as strings instead of interpreting them as numbers. To work around this problem, use a MATLAB function, isdeployed, to determine whether or not the code is being run as a standalone application. Here is a modified version of the code, called sawtooth_standalone.m, that can be successfully compiled and run as a standalone application.
function sawtooth_standalone(nterms,nppcycle,ncycle,pngfilebase)
% MATLAB file example to approximate a sawtooth
% with a truncated Fourier expansion.
% nterms = number of terms in expansion.
% nppcycle = number of points per cycle.
% ncycle = number of complete cycles to plot.
% pngfilebase = base of file name for graph of results.
% 2020-05-21
% Test to see if the code is running as a standalone application
% If it is, convert the arguments intended to be numeric from
% the input strings to numbers
if isdeployed
nterms=str2num(nterms)
nppcycle=str2num(nppcycle)
ncycle=str2num(ncycle)
end
np=nppcycle*ncycle;
fourbypi=4.0/pi;
y(1:np)=pi/2.0;
x(1:np)=linspace(-pi*ncycle,pi*ncycle,np);
for k=1:nterms
twokm=2*k-1;
y=y-fourbypi*cos(twokm*x)/twokm^2;
end
% Prepare output
% Construct the output file name from the base file name and number of terms
% Also append the Slurm JOBID to keep file names unique from run to run.
job=getenv('SLURM_JOB_ID')
pngfile=strcat(pngfilebase,'_',num2str(nterms),'_',job)
disp(['Writing file: ',pngfile,'.png'])
fig=figure;
plot(x,y);
print(fig,pngfile,'-dpng');
quit
end
Suppose that the sawtooth_standalone.m file is in a subdirectory src below your current working directory and that the compiled files are going to be written to a subdirectory called deploy. The following commands (at the Linux shell prompt) could be used to compile the code:
$ mkdir deploy
$ cd src
$ module load matlab/r2019b
$ mcc -R -nodisplay \
-R -singleCompThread \
-m -v -w enable \
-d ../deploy \
sawtooth_standalone.m
Note the option -singleCompThread has been included in order to limit the executable to just one computational thread.
In the deploy directory, an executable, sawtooth_standalone, will be created along with a script, run_sawtooth_standalone.sh. These two files should be copied to the target machine where the code is to be run.
Running a standalone application
After the standalone executable sawtooth_standalone and corresponding script run_sawtooth_standalone.sh have been transferred to a directory on the target system on which they will be run (whether to a different directory on ARC or to a completely different cluster), a batch job script needs to be created in that same directory. Here is an example batch job script, sawtooth_standalone.slurm, appropriate for the ARC cluster.
#!/bin/bash
#SBATCH --time=03:00:00 # Adjust this to match the walltime of your job
#SBATCH --nodes=1 # For serial code, always specify just one node.
#SBATCH --ntasks=1 # For serial code, always specify just one task.
#SBATCH --cpus-per-task=1 # For serial code, always specify just one CPU per task.
#SBATCH --mem=4000m # Adjust to match total memory required, in MB.
# Sample batch job script for running a compiled MATLAB function
# 2020-05-21
# Specify the name of the compiled MATLAB standalone executable
MAIN="sawtooth_standalone"
# Define key parameters for the example calculation.
NTERMS=100
NPPCYCLE=20
NCYCLE=3
PNGFILEBASE=$MAIN
ARGS="$NTERMS $NPPCYCLE $NCYCLE $PNGFILEBASE"
# Choose the MCR directory according to the compiler version used
MCR=/global/software/matlab/mcr/v97
echo "Starting run at $(date)"
echo "Running on compute node $(hostname)"
echo "Running from directory $(pwd)"
./run_${MAIN}.sh $MCR $ARGS > ${MAIN}_${SLURM_JOB_ID}.out 2>&1
echo "Finished run at $(date)"
The job is then submitted with sbatch:
$ sbatch sawtooth_standalone.slurm
An important part of the above script is the definition of the variable MCR, which gives the location of the MATLAB Compiler Runtime (MCR) directory. This directory contains files necessary for the standalone application to run. The version of the MCR files specified (v97 in the example, which corresponds to MATLAB R2019b) must match the version of MATLAB used to compile the code.
A list of MATLAB distributions and the corresponding MCR versions is given on the MathWorks web site (external link). Some versions installed on ARC are listed below, along with the corresponding installation directory to which the MCR variable should be set if running on ARC. (As of this writing on May 21, 2020, installation of release R2020a has not quite been finished but should be ready before the end of the month.) If the MCR version you need does not appear in /global/software/matlab/mcr, write to support@hpc.ucalgary.ca to request that it be installed, or use a different version of MATLAB for your compilation.
MATLAB Release | MCR Version | MCR directory |
R2017a | 9.2 | /global/software/matlab/mcr/v92 |
R2017b | 9.3 | /global/software/matlab/mcr/v93 |
R2018a | 9.4 | /global/software/matlab/mcr/v94 |
R2019b | 9.7 | /global/software/matlab/mcr/v97 |
R2020a | 9.8 | /global/software/matlab/mcr/v98 |
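As a minimal sketch, the table above can be encoded in a small shell function so that a job script selects the MCR directory from the release name rather than hard-coding it. The function name mcr_dir_for_release is hypothetical and not part of the ARC installation; the paths are copied from the table. The directory check only reports a problem and does not stop the script.

```shell
#!/bin/bash
# Hypothetical helper (not provided on ARC): map a MATLAB release name
# to the matching MCR directory, using the table above.
mcr_dir_for_release() {
    local base=/global/software/matlab/mcr
    case "$1" in
        R2017a) echo "$base/v92" ;;
        R2017b) echo "$base/v93" ;;
        R2018a) echo "$base/v94" ;;
        R2019b) echo "$base/v97" ;;
        R2020a) echo "$base/v98" ;;
        *) echo "Unknown MATLAB release: $1" >&2; return 1 ;;
    esac
}

# Select the directory for the release used to compile the code.
MCR=$(mcr_dir_for_release R2019b)

# Report whether the directory is present on this system.
if [ -d "$MCR" ]; then
    echo "Using MCR directory $MCR"
else
    echo "MCR directory $MCR not found on this system" >&2
fi
```

In a job script, the resulting MCR variable would be passed to the run_*.sh wrapper in the same way as in the sample batch script.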
Possible issues when running a standalone application
When MATLAB code is executed as a standalone application, a collection of shared objects (*.so files) that it depends on is unpacked into a cache directory. These are used to supplement the compiled code at run time. If you are running only a single MATLAB standalone process at a time, this won't cause any problems. However, when multiple jobs run at the same time in a cluster environment, it has been observed that sharing a single MCR cache can become an issue and lead to job failure with warning messages of the form "Could not access the MATLAB Runtime component cache. Details: fl:filesystem:SystemError; component cache root:; componentname: ..."
By default, the MCR cache is created in ~/.mcrCacheX.X, but you can use the MCR_CACHE_ROOT environment variable to create the cache for a given process in a different location. If cache locking is what causes these failures, they can be resolved by writing the cache to a unique location for each job. One approach is to use the pre-existing /scratch directory for each job. This can be done by adding a couple of lines to the beginning and end of your script:
#!/bin/bash
...original request...
# Give this job its own MCR cache directory
export MCR_CACHE_ROOT=/scratch/$SLURM_JOB_ID/.mcrCache
mkdir -p "$MCR_CACHE_ROOT"
...original script...
# Remove the cache once the job has finished
rm -rf "$MCR_CACHE_ROOT"
This is likely to resolve the problem. However, if you have issues with file system performance for reading and writing files, you may need to use /tmp instead of /scratch. It is important to clean up after yourself in both directories as there are quotas on both, but it is much more important in /tmp as the quota there is much smaller.
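The following is a minimal sketch of the /tmp variant, assuming that mktemp is available on the compute nodes and that a shell trap is acceptable for cleanup. The directory name pattern and the function name cleanup_mcr_cache are illustrative choices, not ARC conventions. Registering the cleanup with trap means the cache is removed even if the script exits early with an error.

```shell
#!/bin/bash
# Create a unique, per-job MCR cache directory under /tmp (or $TMPDIR if set).
# The name pattern is illustrative; SLURM_JOB_ID is set by Slurm inside a job.
MCR_CACHE_ROOT=$(mktemp -d "${TMPDIR:-/tmp}/mcrCache_${SLURM_JOB_ID:-nojob}_XXXXXX")
export MCR_CACHE_ROOT

# Hypothetical cleanup function: remove the cache directory.
cleanup_mcr_cache() {
    rm -rf "$MCR_CACHE_ROOT"
}

# Run the cleanup whenever the script exits, normally or on error.
trap cleanup_mcr_cache EXIT

echo "MCR cache for this job: $MCR_CACHE_ROOT"
# ...original script...
```

The same pattern applies to /scratch by replacing the mktemp call with the export and mkdir lines shown earlier.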
Support
Please send any questions regarding using MATLAB on ARC to support@hpc.ucalgary.ca.