Conda environments
Aim: Provide information about how to install software without root privileges and/or how to install a custom kernel for a Jupyter Notebook.
Target audience: Users who need specific software environments.
Introduction
Conda is an open source package and environment manager. It helps you find and install packages without needing root privileges (i.e., with a simple conda install...). Conda environments can be extremely useful when building software. Conda is part of the Anaconda distribution, which is available on all CPU and GPU interactive nodes. Anaconda is a software distribution tool for Python packages. [1]
Conda is also useful for adding customized kernels to Jupyter Notebooks and to JupyterLab sessions (the next generation of the Jupyter Notebook interface), both of which can be used from the Nikhef JupyterHub service.
Usage
Note on software toolchains
A software toolchain is the set of software required to build other software; many experiments, LHC and non-LHC alike, maintain their own. A toolchain generally comprises the C/C++ libraries, a C/C++ compiler, tools to handle, load and execute binary files, and a set of kernel headers.
For example, the default toolchain available on machines running CentOS 7 consists of:
- gcc 4.8.5
- glibc 2.17
- binutils 2.27
- kernel-headers 3.10.0
To keep up with the demands that the dependencies of Python packages place on the toolchain, Anaconda ships its own (modern) toolchain.
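To see which compiler and C library a given Python interpreter reports, you can query the standard library's platform module. The following is a minimal sketch; the function name toolchain_info is only illustrative, and the values it returns depend on the machine and on the environment the interpreter comes from.

```python
import platform
import sys


def toolchain_info():
    """Collect compiler and C library details for the running interpreter."""
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": platform.python_version(),
        # Compiler that was used to build this Python interpreter.
        "compiler": platform.python_compiler(),
        # C library as reported by platform.libc_ver(); may be empty on some systems.
        "libc": f"{libc_name} {libc_version}".strip(),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in toolchain_info().items():
        print(f"{key:>10}: {value}")
```

Running this once with the system Python and once with the Python from a conda environment shows whether the two interpreters were built with different compilers.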
Using Conda on your laptop
To install Conda on your laptop, we recommend the Miniforge installer, which sets conda-forge as the default channel. Download an installer from the Miniforge release page and install it on your laptop; for example, for Linux on x86_64:
> wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
> bash Miniforge3-Linux-x86_64.sh -b -p /somewhere/on/your/laptop
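When run interactively, the final step of the conda installation offers to modify your shell startup scripts so that the conda command is available automatically. In batch mode (the -b option used above) this step is typically skipped; in that case, run /somewhere/on/your/laptop/bin/conda init once. Either way, you will have to log out and back in to see these changes.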
Creating a virtual environment with Conda
Because the software environment can be regenerated, we recommend creating a virtual environment (venv) in your group directory on /data instead of /project or your home directory.
The Python version you need can also be specified when creating the venv:
> conda create --prefix /data/your_project/your_username/my_venv python=3.9
Once the virtual environment has been created, you can activate it using:
> conda activate /data/your_project/your_username/my_venv
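After activation, it can be worth checking that the python you are running really comes from the environment you just activated. The sketch below relies on the CONDA_PREFIX environment variable, which conda activate sets; the function names are illustrative only.

```python
import os
import sys


def active_environment():
    """Describe the running interpreter and the conda environment, if any."""
    return {
        "interpreter_prefix": sys.prefix,
        # Set by 'conda activate'; None if no conda environment is active.
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


def interpreter_matches_conda_env(info):
    """True when the running Python lives in the activated conda environment."""
    conda_prefix = info["conda_prefix"]
    if conda_prefix is None:
        return False
    return os.path.realpath(info["interpreter_prefix"]) == os.path.realpath(conda_prefix)


if __name__ == "__main__":
    info = active_environment()
    print(info)
    print("interpreter matches active env:", interpreter_matches_conda_env(info))
```

If the last line prints False while an environment is active, the python on your PATH is probably not the one installed in that environment.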
Installing Python packages inside the conda environment
Once the virtual environment has been created and activated, it is ready to be used. The first step is usually to install additional software into the environment. There are two ways of doing this: with conda and with pip. The rule of thumb is: try conda first and, if a package is not available, try pip.
The main reason for this ordering is that for some packages pip uses the C/C++/Fortran toolchain to build libraries that are then loaded into Python. However, pip doesn't know about the (newer) toolchain installed by conda and will use the one on the (CentOS 7) host instead. For quite a few packages the host toolchain is too old and the build will fail. Packages installed with conda have already been built with the conda toolchain and will therefore work regardless.
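The "conda first, then pip" rule of thumb can be sketched as a small helper. This is an illustration only, not an official tool: the function name install and its run parameter are made up for this example, and treating any conda failure as "package not available" is a simplification.

```python
import subprocess
import sys


def install(package, run=subprocess.run):
    """Try 'conda install' first; fall back to pip if conda does not succeed.

    Returns the name of the tool that succeeded, or None if both failed.
    """
    attempts = [
        ("conda", ["conda", "install", "--yes", package]),
        # Use the environment's own interpreter so pip installs into it.
        ("pip", [sys.executable, "-m", "pip", "install", package]),
    ]
    for tool, command in attempts:
        try:
            result = run(command)
        except FileNotFoundError:
            continue  # the tool is not on PATH
        if result.returncode == 0:
            return tool
    return None
```

Calling install("keras") from a Python started inside the activated environment would first attempt conda and only then pip. The run parameter exists so that the logic can be exercised without actually installing anything.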
Installing packages via conda
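To install additional software into the virtual environment using conda itself, activate the environment and install the package. The conda search command lists the versions of a package that are available in the configured channels: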
> conda activate /data/your_project/your_username/my_venv
> conda install root
> conda search root
Installing packages via pip
To install additional software into the virtual environment with pip, activate the environment, install pip itself and then use it to install other software; e.g.:
> conda activate /data/your_project/your_username/my_venv
> conda install pip
> pip install keras
> pip install tensorflow
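When conda and pip are mixed in one environment, it helps to see which tool provided which package. The sketch below groups the output of conda list --json by channel. It assumes that each record carries name, version and channel fields and that pip-installed packages are reported under the channel "pypi", as in the conda versions we are aware of. The function names and the sample records are hypothetical.

```python
import json
import subprocess
from collections import defaultdict


def group_by_channel(records):
    """Group package records (dicts with 'name', 'version', 'channel') by channel."""
    grouped = defaultdict(list)
    for record in records:
        channel = record.get("channel", "unknown")
        grouped[channel].append(f"{record['name']} {record['version']}")
    return dict(grouped)


def list_environment(prefix):
    """Run 'conda list --json' for the environment at the given prefix."""
    output = subprocess.run(
        ["conda", "list", "--prefix", prefix, "--json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(output)


if __name__ == "__main__":
    # Hypothetical records, shaped like the assumed 'conda list --json' output.
    sample = [
        {"name": "root", "version": "6.24.6", "channel": "conda-forge"},
        {"name": "keras", "version": "2.8.0", "channel": "pypi"},
    ]
    for channel, packages in group_by_channel(sample).items():
        print(channel, packages)
```

For a real environment, replace the sample with list_environment("/data/your_project/your_username/my_venv").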
Using the software in the new venv
Once things are installed, they can be used directly:
> python
Python 3.9.7 (default, Sep 16 2021, 13:09:58)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from ROOT import gROOT
>>> gROOT.GetVersion()
'6.24/06'
>>>
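Instead of importing each package by hand, you can check in one go whether the modules you need are visible to the interpreter. This is a minimal sketch using only the standard library (importlib.metadata requires Python 3.8 or newer); the function names are illustrative, and the module names passed in are whatever your own project needs.

```python
import importlib.util
from importlib import metadata


def check_modules(module_names):
    """Map each top-level module name to True if the import system can find it."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is not found."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name, found in check_modules(["ROOT", "keras", "tensorflow"]).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Note that check_modules only reports whether a module can be located; it does not import it, so a package that is found may still fail at import time if one of its own dependencies is broken.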
Contact
- Email pdp@nikhef.nl for help setting up virtual or conda environments.
[1] The distribution happens through so-called "channels". The default channel is administered by the developers of Anaconda, Anaconda Inc. There is also a community-managed channel called conda-forge, and anybody can create and use their own channel.