Conda on ARC
Revision as of 17:45, 16 October 2024
Background
Conda is a tool for managing and deploying applications, environments and packages.
Miniforge is a completely free alternative Conda installer which is maintained by the conda-forge community.
The Miniforge version of Conda sets the conda-forge channel as the default source of packages.
While the Conda package manager itself is open source and free,
downloading and installing software from the defaults and anaconda channels is not free for all users and may require a paid license.
Please use the Miniforge installer rather than the Miniconda one.
Some points on using Conda
- Conda is a package manager and installer.
- It has to be installed once, and then it can be used for managing the software the user wants.
- Python is not a part of Conda.
- Conda uses environments to separate software installations to prevent possible conflicts and incompatibilities.
- Different software packages are to be installed into different environments.
- Before one can use a package installed into an environment, the environment has to be activated.
- Only the software installed into that environment will be available after the activation.
- If a different environment needs to be activated, please make sure that the current environment is deactivated.
- If multiple environments need a specific module or library, this module or library has to be installed multiple times.
- Each environment is independent and separate from other environments, thus a module installed in one environment will not be available in a different environment.
- Environments can be organized based on activities, rather than software.
- If you are sure that several software packages you want do not interfere with each other, you can have them installed in the same environment, if this fits your usage pattern.
- For example, you may want to have both Python and R installed in the same environment, if you typically use them both in the same activity.
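The rule about deactivating the current environment before activating another one can be wrapped in a small shell function. This is only a sketch: switch_env is a hypothetical helper name, not a Conda command, and it assumes that Conda has already been initialized in the current shell. It relies on the CONDA_DEFAULT_ENV variable, which Conda sets to the name of the active environment.

```shell
#!/bin/bash
# switch_env is a hypothetical helper, not part of Conda.
# Assumption: Conda is already initialized in this shell.

# Leave the active derived environment (if any), then activate the target.
switch_env() {
    target="$1"
    # CONDA_DEFAULT_ENV holds the name of the active environment.
    # The base environment is left alone, since derived environments
    # are normally activated on top of it.
    if [ -n "${CONDA_DEFAULT_ENV:-}" ] && [ "$CONDA_DEFAULT_ENV" != "base" ]; then
        conda deactivate
    fi
    conda activate "$target"
}

# Example (the environment name is illustrative):
#   switch_env tensorflow
```

An environment organized around an activity rather than a single program could be created, for example, with "conda create -n analysis python r-base", where analysis is an arbitrary name chosen for illustration.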
Brief help message
$ conda --help
usage: conda [-h] [-V] command ...

conda is a tool for managing and deploying applications, environments and packages.

Options:

positional arguments:
  command
    clean        Remove unused packages and caches.
    config       Modify configuration values in .condarc. This is modeled
                 after the git config command. Writes to the user .condarc
                 file (/home/drozmano/.condarc) by default.
    create       Create a new conda environment from a list of specified
                 packages.
    help         Displays a list of available conda commands and their help
                 strings.
    info         Display information about current conda install.
    init         Initialize conda for shell interaction. [Experimental]
    install      Installs a list of packages into a specified conda
                 environment.
    list         List linked packages in a conda environment.
    package      Low-level conda package utility. (EXPERIMENTAL)
    remove       Remove a list of packages from a specified conda environment.
    uninstall    Alias for conda remove.
    run          Run an executable in a conda environment. [Experimental]
    search       Search for packages and display associated information. The
                 input is a MatchSpec, a query language for conda packages.
                 See examples below.
    update       Updates conda packages to the latest compatible version.
    upgrade      Alias for conda update.

optional arguments:
  -h, --help     Show this help message and exit.
  -V, --version  Show the conda version number and exit.

conda commands available from other packages:
  build
  convert
  debug
  develop
  env
  index
  inspect
  metapackage
  render
  server
  skeleton
  verify
Conda on ARC
Installing Conda
You can install a local copy of Miniforge (Miniconda) in your home directory on our clusters.
This gives you the flexibility to install the packages your workflow needs.
Before installing Conda, please review the article about installing software in your personal home directory. It may help you to plan your installations better.
IMPORTANT! When you follow these steps, it is VERY IMPORTANT to DECLINE the offer by the installer
to modify your account, so that conda is automatically activated.
Automatic activation leads to multiple potential problems later in your work.
DECLINE the offer by the conda installer.
Here are the steps to follow:
Once connected to the login node, in your SSH session, make sure you are in your home directory:
$ cd
Create a "software" subdirectory for all custom software you are going to have:
$ mkdir software
$ cd software
Create a directory for installation sources (if you do not have it yet) and
download the latest Miniforge (Miniconda) distribution file:
$ mkdir src
$ cd src
$ wget https://github.com/conda-forge/miniforge/releases/download/24.3.0-0/Miniforge3-24.3.0-0-Linux-x86_64.sh
Execute the installer, the .sh
file, and install miniforge (miniconda):
$ bash Miniforge3-24.3.0-0-Linux-x86_64.sh
Follow the instructions (choosing ~/software/miniforge3
as the directory to create),
agree to the license,
decline the offer to initialize.
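The same installation can also be done without answering prompts. The sketch below assumes the release and the ~/software layout used in the steps above; miniforge_url and install_miniforge are hypothetical helper names. It uses the installer's batch mode (-b), which accepts the license and leaves the shell startup files untouched, so it matches the advice to decline initialization, and -p to set the installation directory.

```shell
#!/bin/bash
# Hypothetical helpers; the version and prefix are the ones used in the steps above.
MINIFORGE_VERSION="24.3.0-0"
PREFIX="$HOME/software/miniforge3"

# Build the download URL for a given Miniforge release (Linux x86_64 only).
miniforge_url() {
    version="$1"
    echo "https://github.com/conda-forge/miniforge/releases/download/${version}/Miniforge3-${version}-Linux-x86_64.sh"
}

# Download the installer into ~/software/src and run it without prompts.
#   -b  batch mode: accepts the license, does not modify shell startup files
#   -p  installation prefix
install_miniforge() {
    url="$(miniforge_url "$MINIFORGE_VERSION")"
    installer="$HOME/software/src/$(basename "$url")"
    mkdir -p "$HOME/software/src" || return 1
    wget -O "$installer" "$url" || return 1
    bash "$installer" -b -p "$PREFIX"
}

# Nothing is downloaded until the function is called explicitly:
#   install_miniforge
miniforge_url "$MINIFORGE_VERSION"
```

The interactive procedure above remains the reference; batch mode is mainly convenient when the installation has to be repeated.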
Every time you launch a new terminal session and want to use this Conda install
you have to initialize it.
To make this easier, you can create a short script, init-conda, in the ~/software directory:
#! /bin/bash
eval "$(~/software/miniforge3/bin/conda shell.bash hook)"
You can use your favourite text editor, nano
for example, to create it:
$ nano ~/software/init-conda
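If you prefer not to use an editor, the same file can be written from the command line with a here-document. This is a minimal sketch that assumes Conda was installed into ~/software/miniforge3 as described above; adjust the path inside the script if you chose a different directory.

```shell
#!/bin/bash
# Assumption: Conda lives in ~/software/miniforge3, as in the steps above.
target="$HOME/software/init-conda"

mkdir -p "$HOME/software"

# The quoted 'EOF' keeps $( ... ) literal, so the hook command runs only
# when init-conda is sourced, not while this file is being written.
cat > "$target" <<'EOF'
#! /bin/bash
eval "$(~/software/miniforge3/bin/conda shell.bash hook)"
EOF

# Show the result for a quick visual check.
cat "$target"
```

The script is meant to be sourced, not executed, so it does not need the executable bit.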
Once you have the init script ready, you can activate your conda with the
$ source ~/software/init-conda
(base) $
command.
You can check that it works by ensuring that it is using python installed inside your home directory
(base) $ which python
~/software/miniforge3/bin/python
(base) $ python --version
Python 3.9.18
The version of python depends on when you downloaded the conda installation file.
To deactivate conda, use the
$ conda deactivate
command.
Using Conda Environments in SLURM jobs
Any new software you are going to install using Conda has to be installed in a special Conda environment.
You should never install anything in the (base) environment.
This environment is the system environment that provides Conda functionality,
and it is required for Conda to function properly.
New software packages for the programs you want to use have to be installed in derived Conda environments.
The derived environments can hold just one software package,
or a set of programs that can be installed together (without conflicts) and which are used for the same purpose / activity.
If something goes wrong, a derived environment can be removed and recreated relatively easily.
If something goes wrong with the base environment, the entire Conda setup, likely including other Conda environments, will have to be reinstalled.
This is an example job script which does Conda initialization and a custom environment activation.
The init-conda script is used to activate Conda itself,
then Conda is used to activate the environment, custom_env.
conda-job.slurm
:
#! /bin/bash
# ====================================
#SBATCH --job-name=conda-test
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4GB
#SBATCH --time=0-01:00:00
# ====================================
# Activate Conda and then the environment.
source ~/software/init-conda
conda activate custom_env
# Use the software here.
....
....
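In a batch job nobody is watching the terminal, so it can help to stop early when activation fails instead of running the workload with the wrong Python. The following is a sketch only: activate_env is a hypothetical helper, and INIT_SCRIPT and ENV_NAME are illustrative variable names whose defaults match the example job script above.

```shell
#!/bin/bash
# Hypothetical defaults matching the example job script; adjust as needed.
INIT_SCRIPT="${INIT_SCRIPT:-$HOME/software/init-conda}"
ENV_NAME="${ENV_NAME:-custom_env}"

# Initialize Conda and activate ENV_NAME; return non-zero on the first failure.
activate_env() {
    if [ ! -r "$INIT_SCRIPT" ]; then
        echo "Conda init script not found: $INIT_SCRIPT" >&2
        return 1
    fi
    source "$INIT_SCRIPT" || return 1
    conda activate "$ENV_NAME" || return 1
    # CONDA_DEFAULT_ENV is set by Conda to the name of the active environment.
    echo "Active environment: ${CONDA_DEFAULT_ENV:-unknown}"
}

# In a job script, call it right after the #SBATCH lines:
#   activate_env || exit 1
```

The printed environment name ends up in the job's output file, which makes it easier to confirm later which environment a job actually used.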
Using Conda
Activating Base Conda
This is a more detailed walk-through of the example shown in the installation procedure.
Once you have installed your own Miniforge3 in the directory of your choice,
it has to be activated before you can use it.
It has to be activated in every session in which you want to use it on ARC's login node,
and in every job script you submit to ARC, if the job relies on your Conda environments.
Let us assume that Conda is installed into the ~/software/miniforge3 sub-directory
(~ indicates your home directory).
Then to activate it you can use the following command on the command line:
[username@arc ~]$ eval "$(~/software/miniforge3/bin/conda shell.bash hook)"
(base) [username@arc ~]$
If you do not need it any more, you can deactivate your Conda with the following command:
(base) [username@arc ~]$ conda deactivate
[username@arc ~]$
To avoid typing this cryptic command every time you need your Conda,
you can save it into a shell script file, init-conda,
and place it in some handy location in your home directory, ~/software, for example.
~/software/init-conda
:
#! /bin/bash
eval "$(~/software/miniforge3/bin/conda shell.bash hook)"
Once you have the init script, you can use it, instead, to get your Conda install active:
[username@arc ~]$ source ~/software/init-conda
(base) [username@arc ~]$
Updating Conda
You can update your Conda by using the conda update command.
The example session below updates conda from version 23.3.1 to version 23.9.0.
You have to activate your base conda first.
(base) [username@arc ~]$ conda --version
conda 23.3.1
(base) [username@arc ~]$ conda update conda
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/username/software/miniforge3

  added / updated specs:
    - conda

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    brotli-python-1.0.9        |   py39h6a678d5_7         330 KB
    ca-certificates-2023.08.22 |       h06a4308_0         123 KB
    ....
    ....
    wheel-0.41.2               |   py39h06a4308_0         108 KB
    ------------------------------------------------------------
                                           Total:        38.5 MB

The following NEW packages will be INSTALLED:

  brotli-python      pkgs/main/linux-64::brotli-python-1.0.9-py39h6a678d5_7

The following packages will be REMOVED:

  brotlipy-0.7.0-py39h27cfd23_1003
  ....
  yaml-0.2.5-h7b6447c_0

The following packages will be UPDATED:

  ca-certificates    2023.01.10-h06a4308_0 --> 2023.08.22-h06a4308_0
  ....
  ....
  wheel              0.38.4-py39h06a4308_0 --> 0.41.2-py39h06a4308_0

Proceed ([y]/n)? y

Downloading and Extracting Packages

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
(base) [username@arc ~]$ conda --version
conda 23.9.0
Creating Conda environments
Create a virtual environment for your project
$ conda create -n <yourenvname>
Install additional Python packages to the virtual environment
$ conda install -n <yourenvname> [package]
Activate the virtual environment
$ conda activate <yourenvname>
At this point you should be able to use your own python with the modules you added to it.
Example
After you login to ARC:
# Activate Conda using your own activation script
$ source ~/software/init-conda
(base) $ conda info
active environment : base
active env location : /home/username/software/miniforge3
.....
# Create a new environment based on python and tensorflow module.
(base) $ conda create -n tensorflow python tensorflow-gpu
....
# Once installed, activate the new environment for testing and work.
(base) $ conda activate tensorflow
# Test the installed software.
(tensorflow) $ python tensorflow-test.py
....
# Deactivate the environment.
(tensorflow) $ conda deactivate
# Deactivate Conda
(base) $ conda deactivate
$
Managing environments
Get help
$ conda env --help
$ conda env list --help
$ conda env remove --help
List Conda environments
$ conda env list
# conda environments:
#
base                  *  /home/username/my_software/miniforge3
pytorch                  /home/username/my_software/miniforge3/envs/pytorch
tensorflow               /home/username/my_software/miniforge3/envs/tensorflow
                         /home/username/opt/my_env
                         /home/username/opt/my_env2
In the example, 5 environments are listed.
The first 3 are named environments: base, pytorch and tensorflow.
The last two can only be referenced by the paths they are installed in:
~/opt/my_env and ~/opt/my_env2.
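The distinction between named and path-only environments can be read off the listing mechanically. The sketch below is a hypothetical filter, classify_envs, that reads text in the format of the listing above from standard input; it assumes that format stays as shown and that paths contain no spaces.

```shell
#!/bin/bash
# classify_envs is a hypothetical helper, not part of Conda.
# It reads "conda env list" output on stdin and labels every environment.
classify_envs() {
    awk '
        /^#/ || NF == 0 { next }      # skip the header comments and blank lines
        {
            path = $NF                # the path is always the last field
            if ($1 == "*" || $1 ~ /^\//)
                print "path-only\t" path
            else
                print "named\t" $1 "\t" path
        }
    '
}

# Typical use, once Conda is active:
#   conda env list | classify_envs
```

A named environment is activated by its name, for example "conda activate pytorch", while a path-only environment has to be activated by its path, for example "conda activate ~/opt/my_env".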
Remove an environment
Remove a named environment:
$ conda env remove -n pytorch
or, using an environment path:
$ conda env remove -p ~/opt/my_env
Getting info about environments
List packages installed in the current environment:
$ conda list
# packages in environment at /home/username/my_software/miniforge3/envs/tensorflow:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main
_openmp_mutex             4.5                       1_gnu
_tflow_select             2.1.0                       gpu
....
....
List packages matching a pattern:
$ conda list tensorflow
# packages in environment at /home/username/my_software/miniforge3/envs/tensorflow:
#
# Name                    Version                   Build  Channel
tensorflow                2.4.1           gpu_py39h8236f22_0
tensorflow-base           2.4.1           gpu_py39h29c2da4_0
tensorflow-estimator      2.6.0             pyh7b7c402_0
tensorflow-gpu            2.4.1               h30adc30_0
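Because the pattern matches every package whose name contains it, a script that needs the version of one specific package has to pick the exact row. The sketch below shows one way to do that; pkg_version is a hypothetical helper that reads listing text in the format shown above from standard input.

```shell
#!/bin/bash
# pkg_version is a hypothetical helper, not part of Conda.
# It prints the Version column of the row whose Name column equals $1.
pkg_version() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Typical use inside an activated environment:
#   conda list tensorflow | pkg_version tensorflow
```

Recording such versions at the start of a job is a cheap way to document which software a result was produced with.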