Conda on ARC
Background
Conda is a tool for managing and deploying applications, environments and packages.
Some points on using Conda
- Conda is a package manager and installer.
- It has to be installed once, and then it can be used for managing the software the user wants.
- Python is not a part of Conda.
- Conda uses environments to separate software installations to prevent possible conflicts and incompatibilities.
- Different software should be installed into different environments.
- Before one can use a package installed into an environment, the environment has to be activated.
- Only the software installed into that environment will be available after the activation.
- If a different environment needs to be activated, deactivate the current environment first.
- If multiple environments need a specific module or library, this module or library has to be installed multiple times.
- Each environment is independent and separate from other environments, thus a module installed in one environment will not be available in a different environment.
- Environments can be organized based on activities, rather than software.
- If you are sure that several software packages you want do not interfere with each other, you can have them installed in the same environment, if this fits your usage pattern.
- For example, you may want to have both Python and R installed in the same environment, if you typically use them both in the same activity.
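The activity-based layout described in the last points can be sketched as a small helper. This is only an illustration: the function name make_activity_env and the environment name analysis are hypothetical, and r-base is assumed to be the package that provides R in the channels you have configured.

```shell
#!/bin/bash
# Hypothetical helper: create one environment per activity and
# install every tool that activity needs into it.
make_activity_env() {
    local env_name="$1"   # name of the activity, e.g. "analysis"
    shift                 # the remaining arguments are package names
    conda create --yes --name "$env_name" "$@"
}

# Example call (commented out so that sourcing this file has no side effects):
# make_activity_env analysis python r-base
# conda activate analysis
```

Keeping Python and R in one environment this way means a single activation gives you both tools, at the cost of a longer dependency resolution when the environment is created.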
Brief help message
$ conda --help
usage: conda [-h] [-V] command ...

conda is a tool for managing and deploying applications, environments and packages.

Options:

positional arguments:
  command
    clean        Remove unused packages and caches.
    config       Modify configuration values in .condarc. This is modeled
                 after the git config command. Writes to the user .condarc
                 file (/home/drozmano/.condarc) by default.
    create       Create a new conda environment from a list of specified
                 packages.
    help         Displays a list of available conda commands and their help
                 strings.
    info         Display information about current conda install.
    init         Initialize conda for shell interaction. [Experimental]
    install      Installs a list of packages into a specified conda
                 environment.
    list         List linked packages in a conda environment.
    package      Low-level conda package utility. (EXPERIMENTAL)
    remove       Remove a list of packages from a specified conda environment.
    uninstall    Alias for conda remove.
    run          Run an executable in a conda environment. [Experimental]
    search       Search for packages and display associated information. The
                 input is a MatchSpec, a query language for conda packages.
                 See examples below.
    update       Updates conda packages to the latest compatible version.
    upgrade      Alias for conda update.

optional arguments:
  -h, --help     Show this help message and exit.
  -V, --version  Show the conda version number and exit.

conda commands available from other packages:
  build
  convert
  debug
  develop
  env
  index
  inspect
  metapackage
  render
  server
  skeleton
  verify
Installing Conda
You can install a local copy of miniconda in your home directory on our clusters. It will give you flexibility to install packages needed for the workflow.
Before installing Conda, please review the article about installing software in your personal home directory. It may help you to plan your installations better.
Here are the steps to follow:
Once connected to the login node, in your SSH session, make sure you are in your home directory:
$ cd
Create a "software" subdirectory for all custom software you are going to have:
$ mkdir software
$ cd software
Create a directory for installation sources (if you do not have it yet) and download the latest Miniconda distribution file:
$ mkdir src
$ cd src
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Run the installer (the .sh file) to install Miniconda:
$ bash Miniconda3-latest-Linux-x86_64.sh
Follow the instructions: choose ~/software/miniconda3 as the directory to create, agree to the license, and decline the offer to initialize.
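If you prefer not to answer the prompts by hand, the same installation can be sketched non-interactively. The function name install_miniconda is hypothetical. The -b (batch mode) and -p (install prefix) options belong to the Miniconda installer; batch mode assumes you agree to the license and is not expected to initialize your shell, which corresponds to the choices listed above.

```shell
#!/bin/bash
# Sketch: run the Miniconda installer without prompts.
#   -b  batch mode: no questions asked, license terms are assumed to be accepted
#   -p  installation prefix (the directory to create; it should not exist yet)
install_miniconda() {
    local installer="$1"
    local prefix="${2:-$HOME/software/miniconda3}"
    if [ ! -f "$installer" ]; then
        echo "installer not found: $installer" >&2
        return 1
    fi
    bash "$installer" -b -p "$prefix"
}

# Example call (commented out so that sourcing this file has no side effects):
# install_miniconda ~/software/src/Miniconda3-latest-Linux-x86_64.sh
```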
Every time you launch a new terminal session and want to use this Conda installation, you have to initialize it.
To make this easier, you can create a short script, init-conda, in the ~/software directory:
#! /bin/bash
ROOT=$HOME/software
THISROOT=$ROOT/miniconda3
eval "$(${THISROOT}/bin/conda shell.bash hook)"
You can use your favourite text editor, joe or nano for example, to create it:
$ joe ~/software/init-conda
or
$ nano ~/software/init-conda
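As an alternative to an editor, the script can be written with a here-document. The sketch below assumes Miniconda was installed into ~/software/miniconda3, as in the installation steps; the function name write_init_conda is hypothetical.

```shell
#!/bin/bash
# Sketch: write the init-conda script without opening an editor.
# The quoted 'EOF' keeps $HOME and $(...) unexpanded, so they are
# evaluated later, when init-conda itself is sourced.
write_init_conda() {
    local target="${1:-$HOME/software/init-conda}"
    cat > "$target" <<'EOF'
#!/bin/bash
ROOT=$HOME/software
THISROOT=$ROOT/miniconda3
eval "$(${THISROOT}/bin/conda shell.bash hook)"
EOF
}

# Example call (commented out so that sourcing this file has no side effects):
# write_init_conda
```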
Once you have the init script ready, you can activate your conda with the
$ source ~/software/init-conda
command.
You can check that it works by confirming that the python in use is the one installed inside your home directory:
$ which python
~/software/miniconda3/bin/python
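The same check can be scripted, for example at the top of a job script. This is a minimal sketch; the function name python_is_in_home is hypothetical, and it only tests that the first python found on PATH lies somewhere under your home directory.

```shell
#!/bin/bash
# Sketch: succeed only if the first "python" on PATH lives under $HOME.
python_is_in_home() {
    local py
    py="$(command -v python)" || { echo "python not found on PATH" >&2; return 1; }
    case "$py" in
        "$HOME"/*) echo "OK: $py" ;;
        *)         echo "python is outside your home directory: $py" >&2
                   return 1 ;;
    esac
}

# Example call (commented out so that sourcing this file has no side effects):
# source ~/software/init-conda && python_is_in_home
```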
To deactivate conda, use the
$ conda deactivate
command.
Using Conda
Creating Conda environments
Create a virtual environment for your project
$ conda create -n <yourenvname>
Install additional Python packages to the virtual environment
$ conda install -n <yourenvname> [package]
Activate the virtual environment
$ conda activate <yourenvname>
At this point you should be able to use your own python with the modules you added to it.
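Note that an environment created without any packages is empty, so Python has to be requested explicitly if you want your own interpreter in it. The steps above can be sketched as one helper. The function name setup_project_env and the package names in the example call are hypothetical.

```shell
#!/bin/bash
# Sketch: create an environment with Python, add extra packages,
# then list what ended up installed.
setup_project_env() {
    local env_name="$1"
    shift   # the remaining arguments are extra package names
    # Request python explicitly; otherwise the new environment is empty.
    conda create --yes --name "$env_name" python || return 1
    if [ "$#" -gt 0 ]; then
        conda install --yes --name "$env_name" "$@" || return 1
    fi
    conda list --name "$env_name"
}

# Example call (commented out so that sourcing this file has no side effects):
# setup_project_env myproject numpy scipy
# conda activate myproject
```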
Example
After you log in to ARC:
# Activate Conda using your own activation script
$ source ~/software/init-conda
(base) $ conda info
active environment : base
active env location : /home/username/software/miniconda3
.....
# Create a new environment based on python and tensorflow module.
(base) $ conda create -n tensorflow python tensorflow-gpu
....
# Once installed, activate the new environment for testing and work.
(base) $ conda activate tensorflow
# Test the installed software.
(tensorflow) $ python tensorflow-test.py
....
# Deactivate the environment.
(tensorflow) $ conda deactivate
# Deactivate Conda
(base) $ conda deactivate
$
Managing environments
List Conda environments:
$ conda env list
# conda environments:
#
base                  *  /home/username/my_software/miniconda3
pytorch                  /home/username/my_software/miniconda3/envs/pytorch
tensorflow               /home/username/my_software/miniconda3/envs/tensorflow
                         /home/username/opt/my_env
                         /home/username/opt/my_env2
In the example, 5 environments are listed.
The first 3 are named environments: base, pytorch and tensorflow.
The last two can only be referenced by the paths they are installed in: ~/opt/my_env and ~/opt/my_env2.
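Environments without a name are activated by passing their path to conda activate instead of a name. To find them, the listing can be filtered as sketched below. The function name list_path_only_envs is hypothetical, and the filter assumes that conda env list prints one environment per line in the layout shown in the example.

```shell
#!/bin/bash
# Sketch: print the paths of environments that have no name.
# Assumption: named environments start the line with their name,
# path-only environments start the line with their path.
list_path_only_envs() {
    conda env list | awk '
        /^#/ || NF == 0      { next }            # skip comments and blank lines
        $1 ~ /^\//           { print $1; next }  # line starts with a path: unnamed
        $1 == "*" && NF == 2 { print $2 }        # unnamed and currently active
    '
}

# Example use (commented out so that sourcing this file has no side effects):
# list_path_only_envs
# conda activate /home/username/opt/my_env
```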