Managing software on ARC: Difference between revisions

From RCSWiki

Revision as of 16:46, 11 January 2022

Overview

In addition to the basic software distributed with most Linux systems, other application packages and libraries have been installed for use on ARC under /global/software. Also see the module avail command below for a list of some of the installed software packages. Write to support@hpc.ucalgary.ca if you need additional software installed.

Environment modules

To facilitate the use of some of the software on the ARC cluster, you can load a corresponding environment module file, which may add an installation directory to the PATH variable used to locate executable files, or help the software find the libraries on which it depends. An overview of modules on WestGrid is largely applicable to ARC.
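
Conceptually, loading a module has an effect similar to the sketch below. The installation directory used here is hypothetical and serves only as an illustration; the actual paths and variables are defined by each module file.

```shell
# Hypothetical installation directory (an assumption for illustration only).
pkg_root="/global/software/example/1.0"

# Put the package's executables first in the search path.
export PATH="$pkg_root/bin:$PATH"

# Help the dynamic linker find the package's libraries; the ${VAR:+...}
# form avoids a stray ':' when LD_LIBRARY_PATH was previously empty.
export LD_LIBRARY_PATH="$pkg_root/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Show the first entry of the search path.
echo "${PATH%%:*}"
```

Unloading a module reverses changes of this kind, which is why modules are more convenient than editing these variables by hand.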

To list the software for which an environment module file has been created, use the command:

$ module avail

Then, to set up your environment to use a particular package, use the module load command. For example, to load a module for Python use:

$ module load python/anaconda-3.6-5.1.0

If you need to undo the changes made by loading the module, you can use the module unload command:

$ module unload python/anaconda-3.6-5.1.0

To see currently loaded modules, type:

$ module list

Unlike on some clusters, no modules are loaded by default on ARC. So, for example, to use the Intel compilers or Open MPI for parallel programming, you must load an appropriate module.

Installing Software in User's Home Directory

Background

If you are a user on the ARC cluster, you can install software yourself into your own home directory. You should be able to follow the installation instructions that you find on the software distribution site or inside the source / distribution archive. Most manuals and guides assume that you have admin privileges on the system you are installing software on. This is not the case on ARC, and this can be the main source of difficulties with installations. You have to adjust the instructions you follow to reflect the fact that you are installing into your home directory and not into the standard system locations, which require admin privileges to write to.

ARC is a multi-user system, and on such a system users cannot do anything that affects other users. Using the sudo command, for example, implies that you want to do something that affects other users, which is why you cannot use it on ARC.


Using a package manager (apt, yum, etc.) requires changing common system directories, which would affect all users on the system. In addition, a package manager would install software onto the login node only. The login node is not supposed to run your computations; the compute nodes are. A package manager run on the login node cannot install software onto the compute nodes, which are different computers.


Thus, software that you want to use on ARC must be installed onto a shared file system that is accessible from all the nodes in the cluster. When we (the analysts) install software centrally, we install it into the shared /global/software directory. If a user wants to install a software package and manage it on their own, then it has to be installed into the user's home directory, that is, $HOME (for example, /home/username). In such cases package managers cannot be used; the software often has to be compiled on ARC, and the desired installation location has to be specified during the compilation process. If there is a dependency (a library or another software package) that has to be present on the system, then that dependency has to be installed the same way prior to the compilation.
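
As a rough sketch of what "specifying the installation location during the compilation process" can look like, the helper below assumes a package that ships a GNU Autotools-style configure script accepting a --prefix option. Many packages do, but not all, so check the package's own instructions. The function name install_from_source and the RUN variable are hypothetical names introduced for this illustration.

```shell
# Hypothetical helper: build an Autotools-style package into $HOME/software.
# It is meant to be run from inside the unpacked source directory.
install_from_source() {
    name="$1"
    version="$2"
    prefix="$HOME/software/$name-$version"

    # ${RUN:-} expands to nothing for a real build;
    # set RUN=echo to print the commands instead of executing them.
    ${RUN:-} ./configure --prefix="$prefix" &&
    ${RUN:-} make &&
    ${RUN:-} make install
}

# Dry run: print the commands for version 3.6.2 of R without running them.
RUN=echo install_from_source r 3.6.2
```

Because the prefix points inside your home directory, the final make install step writes only to locations you own and needs no admin privileges.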

Planning

Before installing a software package, you should think about how to structure your future installs. There are many ways to do that, but here is a simple, proven way to organize software in your home directory:


  • Your software will be stored in the software subdirectory of your home directory; you can refer to it as $HOME/software on the command line.


  • Each version of a software package can be installed into its own sub-directory named name-version.
For example, if you want to install the version 3.6.2 of GNU R, it can go into the $HOME/software/r-3.6.2 sub-directory.


  • The distribution files and archives can be downloaded to the $HOME/software/src/software_name sub-directory.
For the R example above, it would be $HOME/software/src/r/ directory.


This is how you can set up this directory structure:

$ cd

$ mkdir $HOME/software

$ mkdir $HOME/software/src

You can check if you have it:

$ ls -l
....
drwxr-xr-x 4 username username  4096 Jun  8  2021 software
....
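
The per-package directories from the R example above could be created in the same way. The sketch below uses a temporary directory as a stand-in for $HOME so that it can be tried without touching real files; the variable names are assumptions made for this illustration.

```shell
# Stand-in for $HOME so the sketch is safe to try;
# for actual use, set base="$HOME" instead.
base="$(mktemp -d)"

name="r"
version="3.6.2"

# -p creates missing parent directories and does not fail
# if a directory already exists.
mkdir -p "$base/software/src/$name"        # downloaded archives go here
mkdir -p "$base/software/$name-$version"   # the installed package goes here

ls -l "$base/software"
```

Using mkdir -p makes the commands safe to repeat when you add further packages later.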

Getting Software

To install the software package you want, you have to obtain a distribution of it. Depending on the software and your choice, you may get one of the following:


  • An archive containing pre-compiled binary files of the program (usually .zip or .tar.gz files).
You can unpack the archive and place the files into a directory of your choice.


  • An archive containing the source code of the program (usually .tar.gz or .tar.bz2 files).
These need to be compiled before you can use them. Typically, the binary files have to be installed after the compilation.


  • A binary installer, a program that you run to install a copy of the pre-compiled software (it can be a .sh file or a file with no extension).
You have to run it to initiate the installation process.
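
For the first case, a pre-compiled archive, the whole procedure can be sketched as follows. The package hello-1.0 is a made-up example that the sketch builds itself so that it can be run anywhere; in practice you would download the real archive into the src sub-directory instead. A temporary directory again stands in for $HOME.

```shell
# Stand-in for $HOME; for actual use, set base="$HOME" instead.
base="$(mktemp -d)"
mkdir -p "$base/software/src/hello"

# Build a tiny demo archive (made up for this sketch);
# normally this file would come from the software's download site.
stage="$(mktemp -d)"
mkdir -p "$stage/hello-1.0/bin"
printf '#!/bin/sh\necho "hello from demo"\n' > "$stage/hello-1.0/bin/hello"
chmod +x "$stage/hello-1.0/bin/hello"
tar -czf "$base/software/src/hello/hello-1.0.tar.gz" -C "$stage" hello-1.0

# Unpack the archive; -C selects the destination directory.
tar -xzf "$base/software/src/hello/hello-1.0.tar.gz" -C "$base/software"

# Make the program findable by name, then run it.
export PATH="$base/software/hello-1.0/bin:$PATH"
hello
```

The export line affects only the current shell session, so it has to be repeated in each new session or job that uses the program.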