Conda on ARC

From RCSWiki

Background

Conda is a tool for managing and deploying applications, environments and packages.


Some points on using Conda

  • Conda is a package manager and installer.
It has to be installed once, and then it can be used for managing the software the user wants.


  • Python is not a part of Conda.


  • Conda uses environments to separate software installations to prevent possible conflicts and incompatibilities.


  • Different software should generally be installed into different environments.


  • Before one can use a package installed into an environment, the environment has to be activated.
Only the software installed into that environment will be available after the activation.


  • If a different environment needs to be activated, please make sure that the current environment is deactivated.


  • If multiple environments need a specific module or library, this module or library has to be installed multiple times.
Each environment is independent and separate from other environments, thus a module installed in one environment will not be available in a different environment.


  • Environments can be organized based on activities, rather than software.
If you are sure that several software packages you want do not interfere with each other, you can have them installed in the same environment, if this fits your usage pattern.
For example, you may want to have both Python and R installed in the same environment, if you typically use them both in the same activity.

Brief help message

$ conda --help
usage: conda [-h] [-V] command ...

conda is a tool for managing and deploying applications, environments and packages.

Options:

positional arguments:
  command
    clean        Remove unused packages and caches.
    config       Modify configuration values in .condarc. This is modeled
                 after the git config command. Writes to the user .condarc
                 file (/home/drozmano/.condarc) by default.
    create       Create a new conda environment from a list of specified
                 packages.
    help         Displays a list of available conda commands and their help
                 strings.
    info         Display information about current conda install.
    init         Initialize conda for shell interaction. [Experimental]
    install      Installs a list of packages into a specified conda
                 environment.
    list         List linked packages in a conda environment.
    package      Low-level conda package utility. (EXPERIMENTAL)
    remove       Remove a list of packages from a specified conda environment.
    uninstall    Alias for conda remove.
    run          Run an executable in a conda environment. [Experimental]
    search       Search for packages and display associated information. The
                 input is a MatchSpec, a query language for conda packages.
                 See examples below.
    update       Updates conda packages to the latest compatible version.
    upgrade      Alias for conda update.

optional arguments:
  -h, --help     Show this help message and exit.
  -V, --version  Show the conda version number and exit.

conda commands available from other packages:
  build
  convert
  debug
  develop
  env
  index
  inspect
  metapackage
  render
  server
  skeleton
  verify

Installing Conda

You can install a local copy of Miniconda in your home directory on our clusters. This gives you the flexibility to install the packages your workflow needs.

Before installing Conda, please review the article about installing software in your personal home directory. It may help you to plan your installations better.


Here are the steps to follow:

Once connected to the login node, in your SSH session, make sure you are in your home directory:

$ cd 

Create a "software" subdirectory for all custom software you are going to have:

$ mkdir -p software
$ cd software 

Create a directory for installation sources (if you do not have it yet) and download the latest Miniconda distribution file:

$ mkdir -p src
$ cd src
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

Run the installer (the .sh file) to install Miniconda:

$ bash Miniconda3-latest-Linux-x86_64.sh

Follow the prompts: agree to the license, choose ~/software/miniconda3 as the installation directory, and decline the offer to initialize Conda.
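If you prefer to script this step, the installer can also be run without prompts. The sketch below is one possible way to do it, not part of the documented procedure: install_miniconda is a hypothetical helper name, -b selects the installer's batch mode (which takes the license terms as accepted) and -p sets the installation prefix. The two guard checks are an added assumption about what is convenient; in particular, the helper refuses to touch a prefix that already exists.

```shell
# Hypothetical helper: run the Miniconda installer without prompts.
#   -b  batch mode (no questions asked; license terms are taken as accepted)
#   -p  installation prefix
install_miniconda() {
    installer="$1"
    prefix="$2"
    if [ ! -f "$installer" ]; then
        echo "installer not found: $installer" >&2
        return 1
    fi
    if [ -e "$prefix" ]; then
        echo "refusing to install: $prefix already exists" >&2
        return 2
    fi
    bash "$installer" -b -p "$prefix"
}

# Usage, with the paths used in this article:
#   install_miniconda ~/software/src/Miniconda3-latest-Linux-x86_64.sh ~/software/miniconda3
```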

Every time you launch a new terminal session and want to use this Conda install, you have to initialize it. To make this easier, you can create a short script, init-conda, in the ~/software directory:

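#!/bin/bash
# Directory that holds your custom software (created in the steps above).
ROOT=$HOME/software
THISROOT=$ROOT/miniconda3

# Define the conda shell function for the current bash session.
eval "$(${THISROOT}/bin/conda shell.bash hook)"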

You can use your favourite text editor, joe or nano for example, to create it:

$ joe ~/software/init-conda 

or

$ nano ~/software/init-conda

Once you have the init script ready, you can activate your conda with the

$ source ~/software/init-conda

command.
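A slightly more defensive variant of the script wraps the hook in a function that first checks that the Conda executable is where it is expected. This is only a sketch: the function name init_conda and its optional path argument are illustrative, and the default assumes the ~/software/miniconda3 location chosen during installation.

```shell
# Sketch of a defensive init function; the file defining it is meant to be
# sourced, not executed. The name init_conda is illustrative.
init_conda() {
    # Optional first argument: the Miniconda root (assumed default below).
    conda_root="${1:-$HOME/software/miniconda3}"
    if [ ! -x "$conda_root/bin/conda" ]; then
        echo "init-conda: no conda executable under $conda_root" >&2
        return 1
    fi
    # Same hook as in the init-conda script; defines the 'conda' shell function.
    eval "$("$conda_root/bin/conda" shell.bash hook)"
}

# Usage, in each new session after sourcing the file that defines it:
#   init_conda
#   init_conda /some/other/miniconda3
```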

You can check that the initialization worked by confirming that the python found on your PATH is the one installed inside your home directory:

$ which python 
~/software/miniconda3/bin/python
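The same check can be scripted, for example near the top of a job script, so that a run stops early if the wrong python would be used. The helper name python_is_under is hypothetical; it only compares the location reported by command -v against a prefix you pass in.

```shell
# Succeeds only if the first 'python' found on PATH lives under the given prefix.
python_is_under() {
    prefix="$1"
    # Fail if no python is found on PATH at all.
    py_path="$(command -v python)" || return 1
    case "$py_path" in
        "$prefix"/*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Usage:
#   python_is_under "$HOME/software/miniconda3" || echo "unexpected python" >&2
```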

To deactivate conda, use the

$ conda deactivate

command.

Using Conda

Creating Conda environments

Create a virtual environment for your project:

$ conda create -n <yourenvname>

Install additional Python packages into the virtual environment:

$ conda install -n <yourenvname> [package]

Activate the virtual environment:

$ conda activate <yourenvname>

At this point you should be able to use your own python with the modules you added to it.
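When these steps are repeated in a setup script, it can help to skip creation if the environment already exists. The following is a minimal sketch under that assumption: ensure_env is a made-up helper name, it relies on the environment name being the first column of the conda env list output, and it passes -y to conda create so that no confirmation is requested.

```shell
# Create a named environment only if no environment with that name exists yet.
# Arguments: the environment name, then the packages to install into it.
ensure_env() {
    env_name="$1"
    shift
    # Skip comment lines; the first column of each remaining line is the name
    # (or the path, for environments that have no name).
    if conda env list | awk '!/^#/ { print $1 }' | grep -qxF -- "$env_name"; then
        echo "environment '$env_name' already exists"
    else
        conda create -y -n "$env_name" "$@"
    fi
}

# Usage (requires an initialized Conda):
#   ensure_env myproject python numpy
#   conda activate myproject
```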

Example

After you log in to ARC:

# Activate Conda using your own activation script
$ source ~/software/init-conda

(base) $ conda info
     active environment : base
    active env location : /home/username/software/miniconda3
.....

# Create a new environment containing python and the tensorflow-gpu package.
(base) $ conda create -n tensorflow python tensorflow-gpu
....

# Once installed, activate the new environment for testing and work.
(base) $ conda activate tensorflow

# Test the installed software.
(tensorflow) $ python tensorflow-test.py
.... 

# Deactivate the environment.
(tensorflow) $ conda deactivate 

# Deactivate Conda
(base) $ conda deactivate 

$

Managing environments

List Conda environments:

$ conda env list

# conda environments:
#
base                  *  /home/username/software/miniconda3
pytorch                  /home/username/software/miniconda3/envs/pytorch
tensorflow               /home/username/software/miniconda3/envs/tensorflow
                         /home/username/opt/my_env
                         /home/username/opt/my_env2

In the example, five environments are listed.

The first three are named environments: base, pytorch and tensorflow.

The last two can only be referenced by the path they are installed in: ~/opt/my_env and ~/opt/my_env2.
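The distinction between named and path-only environments can also be read off the listing mechanically. The sketch below assumes the output layout shown in the example (comment lines starting with #, the name in the first column, and no name column for path-only environments); named_envs and path_only_envs are hypothetical helper names.

```shell
# Read 'conda env list' output on stdin and print the names of named environments.
# Path-only environments have no name column, so their first field is the path
# itself and begins with '/'.
named_envs() {
    awk '!/^#/ && NF > 0 && $1 !~ /^\// { print $1 }'
}

# Read 'conda env list' output on stdin and print the paths of unnamed environments.
path_only_envs() {
    awk '!/^#/ && NF > 0 && $1 ~ /^\// { print $1 }'
}

# Usage (requires an initialized Conda):
#   conda env list | named_envs
#   conda env list | path_only_envs
```

A name printed by named_envs can be passed to -n, while a path printed by path_only_envs has to be passed to -p.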


Remove an environment:

$ conda env remove -n pytorch

or

$ conda env remove -p ~/opt/my_env