Installing Python packages

This section provides information on how to install your Python packages with pip and conda in your user environment.

 

Keep in mind that users do not have sudo permissions on the clusters.

If you need assistance with installing software, please reach out to the support team for your cluster.

Python version

Various versions of Python are available on both clusters.


ALICE

The default version on ALICE is Python 3.6.8 (CentOS 7.X):

[me@nodelogin02 ]$ python3 --version
Python 3.6.8

If you need another version of Python (older or newer), you can load it as a module. You can find all available Python versions this way:

module -r avail '^Python/'

On ALICE, several larger Python packages have been built as modules (e.g., SciPy, TensorFlow, PyTorch). You can use these directly or as a starting point before installing additional packages.


SHARK

The default version on SHARK is Python 3.6.8 (CentOS 8 Stream):

[em@res-hpc-lo02 ]$ python3 --version
Python 3.6.8

If you need another version of Python (older or newer), you can load it as a module. You can find all available Python versions with the module avail command, in the same way as on ALICE.


Because Python 2 reached end-of-life in January 2020, this section covers Python 3 only.

pip

The Python package manager pip for Python 3 can be accessed using the command pip3 if you are using the default installation. If you are using a Python module or a virtual environment, you can also use pip.

Some useful commands are:

Command                            Description
pip3 install <packageName>         Install package with the name <packageName>
pip3 install <packageName>==2.3    Install version 2.3 of the package with the name <packageName>
pip3 uninstall <packageName>       Uninstall package with the name <packageName>
pip3 search <packageName>          Search for a package with the name <packageName> (no longer supported by PyPI)
pip3 list                          List all installed Python packages
pip3 help                          Get information about using pip
pip3 install --help                Get information about the pip command install

If you run pip3 list and you get the warning

you can remove this warning by creating or editing your pip config in $HOME/.config/pip/pip.conf and adding the following lines:
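The warning text and the configuration lines are missing from this page. The following is a minimal sketch, assuming the warning is the list-format deprecation notice that older pip releases (such as pip 9) print for pip3 list, and assuming the columns format is the one you want:

```shell
# Assumption: the warning is pip's "default format will switch to columns"
# deprecation notice; the [list] section below selects the format explicitly.
PIP_CONF="$HOME/.config/pip/pip.conf"
mkdir -p "$(dirname "$PIP_CONF")"

# Append the section only if the file does not already have one.
if ! grep -q '^\[list\]' "$PIP_CONF" 2>/dev/null; then
    printf '[list]\nformat = columns\n' >> "$PIP_CONF"
fi
```

If your pip.conf already has a [list] section, edit the format line there instead.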

Installing packages with pip for a single project

If you only need Python packages for a single project, you can install them in your user environment with:

This will install a Python package in the default user environment:
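The command and the installation path are missing here. The following is a hedged sketch, assuming the standard --user option of pip is what was meant. The helper name install_for_user is hypothetical and only serves the illustration:

```shell
# Hypothetical helper: install one package into your user environment.
install_for_user() {
    pip3 install --user "$1"
}

# Ask Python where --user installs end up (typically below $HOME/.local).
USER_SITE="$(python3 -m site --user-site)"
echo "User packages are installed in: $USER_SITE"
```

Call the helper as install_for_user <packageName>. No sudo permissions are needed because everything is written below your home directory.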

Python Virtual Environments - For working on multiple projects

If you are working on different projects and you need different Python packages for each project, it is better to work in a special virtual environment.

A virtual environment is an isolated Python environment with its own set of installed packages. Once you have activated it, you can use the pip command (without the --user option) and other commands inside it.

You create a new virtual environment like this:

where /path/to/new/virtual/environmentname should be replaced by the path and name of your virtual environment.

You can activate a new virtual environment with the command:

Note how your prompt has changed and shows you the name of the virtual environment in brackets.

You can deactivate a virtual environment with the command (it will not be destroyed):
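The three commands described above can be sketched as follows, assuming the environment is created with Python's built-in venv module (the original page may have used a different tool). The path in VENV_DIR is a hypothetical stand-in for /path/to/new/virtual/environmentname:

```shell
# Hypothetical location; replace it with the path and name of your environment.
VENV_DIR="$HOME/venvs/example-env"

# Create the environment with Python's built-in venv module.
python3 -m venv "$VENV_DIR"

# Activate it; an interactive prompt then starts with (example-env).
. "$VENV_DIR/bin/activate"
echo "Active environment: $VIRTUAL_ENV"

# Deactivate it again; the directory stays on disk.
deactivate
```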

Example for setting up a Python Virtual Environment for pip

In this example, we will create a virtual environment in /exports/example/projects/Project-A. This path is just an example and has to be replaced by a valid path on SHARK or ALICE.

Next, we activate the environment:

Let’s list all the packages that are currently installed:

Deactivating the environment goes like this:

To remove your Python virtual environment, just delete the virtual environment directory:
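The steps of this example can be sketched in one go. This is an illustration under assumptions: it uses python3 -m venv, and it puts the environment in a venv subdirectory of a stand-in project directory so that deleting the environment cannot touch other project files. PROJECT_DIR and VENV_DIR are hypothetical names:

```shell
# Stand-in for the example path /exports/example/projects/Project-A;
# set PROJECT_DIR to a valid path on SHARK or ALICE.
PROJECT_DIR="${PROJECT_DIR:-$HOME/projects/Project-A}"
VENV_DIR="$PROJECT_DIR/venv"

python3 -m venv "$VENV_DIR"     # create the environment
. "$VENV_DIR/bin/activate"      # activate it
pip list                        # list the currently installed packages
deactivate                      # leave the environment

# Remove the environment by deleting its directory.
rm -rf "$VENV_DIR"
```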

Making your Python environment re-usable

Pip allows you to create a requirements file that contains all the packages in your current environment.

This makes it easy to re-install the environment should something go wrong, or to make sure that you have the same environment on your cluster and your workstation.

If you want to create the requirements file for a virtual environment, first activate the environment. Then, you can create the requirements file from your Python environment like this (using the example above):

You can re-install all packages in a new environment after creating the environment and activating it:
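A minimal sketch of both steps, assuming pip freeze is used to write the requirements file and pip install -r to read it. All paths are hypothetical, and the first environment is created here only so that the sketch is self-contained:

```shell
# Hypothetical paths; replace them with your own project locations.
OLD_ENV="$HOME/venvs/project-a"
NEW_ENV="$HOME/venvs/project-a-copy"
REQ_FILE="$HOME/venvs/project-a-requirements.txt"

# Record the packages of the existing environment while it is active.
python3 -m venv "$OLD_ENV"
. "$OLD_ENV/bin/activate"
pip freeze > "$REQ_FILE"
deactivate

# Re-install the same packages in a fresh environment.
python3 -m venv "$NEW_ENV"
. "$NEW_ENV/bin/activate"
pip install -r "$REQ_FILE"
deactivate
```

The requirements file is plain text, so you can keep it under version control next to your project.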

Conda, Anaconda, Miniconda and Bioconda

If you have to install, set up, and work with a complex program or project, you should make use of Conda. Conda itself is a package management system (see the Conda documentation). Anaconda and Miniconda are distributions that bundle Conda with Python, and Bioconda is a Conda channel with bioinformatics packages. Together they give you virtual Python environments and a large collection of optimized packages, especially for researchers and scientists, which you can easily install within an environment.

Useful commands

Command                                    Description
conda info                                 Verify Conda is installed, check version number
conda create --name ENVNAME                Create a new environment named ENVNAME
conda create --prefix /path/to/ENVNAME     Create a conda environment in /path/to/ENVNAME instead of the default location
conda activate ENVNAME                     Activate a named Conda environment
conda deactivate                           Deactivate current environment
conda list                                 List all packages and versions in the active environment
conda remove --name ENVNAME --all          Delete an entire environment
conda search PKGNAME                       Search for a package in currently configured channels
conda install PKGNAME                      Install a package
conda search PKGNAME --info                Detailed information about package versions
conda uninstall PKGNAME --name ENVNAME     Remove a package from an environment
conda config --add channels CHANNELNAME    Add a channel to your Conda configuration

Conda on ALICE and SHARK

On ALICE and SHARK, you can make use of Conda by loading one of the available Miniconda modules.


ALICE


SHARK


Each version of Miniconda comes with a specific Python version, though you can also install a different Python version in a conda environment.

Default conda location

By default conda downloads and installs all packages in
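The default path is missing from this page. Rather than guess it, the following sketch asks conda itself which directories it uses. It assumes a Miniconda module has already been loaded; the variable CONDA_CHECKED is only there for illustration:

```shell
# Print the directories conda uses for environments and downloaded packages.
if command -v conda >/dev/null 2>&1; then
    conda config --show envs_dirs pkgs_dirs
    CONDA_CHECKED="yes"
else
    echo "conda not found: load a Miniconda module first" >&2
    CONDA_CHECKED="no"
fi
```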

Changing default conda settings

You can change the default settings for conda in your $HOME/.condarc. If it does not exist yet, just create the file and add settings to it.

Installation directory

In order to change the default installation directory, add the following lines to your $HOME/.condarc:

Make sure the path exists before trying to install packages, e.g., by running mkdir -p /path/to/conda/envs
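The lines to add are missing here. The following is a sketch, assuming the envs_dirs setting of conda is the one that was meant. The directory in CONDA_ENVS_DIR is a hypothetical stand-in for /path/to/conda/envs:

```shell
# Hypothetical location; replace it with your own /path/to/conda/envs.
CONDA_ENVS_DIR="${CONDA_ENVS_DIR:-$HOME/conda/envs}"
mkdir -p "$CONDA_ENVS_DIR"

# Add the envs_dirs setting unless .condarc already defines it.
if ! grep -q '^envs_dirs:' "$HOME/.condarc" 2>/dev/null; then
    printf 'envs_dirs:\n  - %s\n' "$CONDA_ENVS_DIR" >> "$HOME/.condarc"
fi
```

If conda is already available in your shell, conda config --add envs_dirs followed by the path makes the same change without editing the file by hand.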

Download directory for packages

In order to change the default download directory for packages, add the following lines to your $HOME/.condarc:

Make sure the path exists before trying to install packages, e.g., by running mkdir -p /path/to/conda/pkgs

If you do not want to do this permanently, but just for the current session, you can also set the location using CONDA_PKGS_DIRS, e.g.,
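Both variants can be sketched as follows, assuming the pkgs_dirs setting is the one meant for the permanent change. The directory in PKGS_DIR is a hypothetical stand-in for /path/to/conda/pkgs:

```shell
# Hypothetical location; replace it with your own /path/to/conda/pkgs.
PKGS_DIR="${PKGS_DIR:-$HOME/conda/pkgs}"
mkdir -p "$PKGS_DIR"

# Permanent: add the pkgs_dirs setting unless .condarc already defines it.
if ! grep -q '^pkgs_dirs:' "$HOME/.condarc" 2>/dev/null; then
    printf 'pkgs_dirs:\n  - %s\n' "$PKGS_DIR" >> "$HOME/.condarc"
fi

# Current session only: set the environment variable instead.
export CONDA_PKGS_DIRS="$PKGS_DIR"
```

You normally need only one of the two variants. The environment variable is gone once you log out.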

Disable automatically activating conda upon login

After running conda init, the conda base environment will be activated automatically when you log in. Because this can lead to conflicts when using other modules, we recommend disabling this function and activating an environment only when needed.

You can disable this function in your $HOME/.condarc
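The setting itself is missing here. The following is a sketch, assuming the auto_activate_base option of conda is the one that was meant:

```shell
# Add the setting unless .condarc already defines it.
if ! grep -q '^auto_activate_base:' "$HOME/.condarc" 2>/dev/null; then
    printf 'auto_activate_base: false\n' >> "$HOME/.condarc"
fi
```

With conda available in your shell, conda config --set auto_activate_base false writes the same line for you.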

You will still have access to the command conda in order to activate a specific environment, create new ones, etc.

Making your conda environment re-usable

A conda environment can be installed from an environment.yml file. This allows you to quickly re-deploy your environment, share it with others, or make sure that you use the same environment on the cluster and your workstation.

First, activate the environment.

Then, create the file environment.yml (you can also give the .yml another name)

The first line in the .yml file contains the name of the environment.

In order to create a new environment based on your environment.yml, run the following command:
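The export and re-create steps can be sketched as follows, assuming conda env export and conda env create are the commands that were meant. The environment names are hypothetical, and the sketch expects a loaded Miniconda module:

```shell
# Hypothetical names; replace them with your own.
ENV_NAME="${ENV_NAME:-project-a}"
NEW_ENV_NAME="${NEW_ENV_NAME:-project-a-copy}"
ENV_FILE="environment.yml"

if command -v conda >/dev/null 2>&1; then
    # Write the packages of the existing environment to the file.
    conda env export --name "$ENV_NAME" > "$ENV_FILE"
    # Build a new environment from the file; --name overrides the name
    # stored in the first line of the file.
    conda env create --name "$NEW_ENV_NAME" --file "$ENV_FILE"
else
    echo "conda not found: load a Miniconda module first" >&2
fi
```

On another machine you can omit --name, in which case conda takes the environment name from the file.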

Example for setting up a Conda environment

Here we provide a basic example for creating and managing your conda environment. Please see the Conda documentation for further information or alternative methods.

First, load a Miniconda module of your choice.

Next, you can create your first conda environment

If you use conda for the first time, you need to initialize it for your shell:

Note that the paths shown above will look different depending on the cluster and Miniconda module.

The init script has added certain commands to your .bashrc. After you have re-opened your shell, for example by logging out and back in, your prompt will have changed to

The conda base environment is now active by default.

Next, we activate the new environment

Let’s search for a package

and install it:

Let’s list all the packages that have been installed:

Uninstalling a package can be done like this:

Deactivating a conda environment goes like this:

Removing a conda environment can be done in this manner:
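The commands for the individual steps are missing from this example. The following sketch strings them together using the commands from the table of useful commands. ENV_NAME and PKG_NAME are hypothetical, the Miniconda module is assumed to be loaded already, and the shell hook is used here in place of the one-time conda init so that the sketch also works in a non-interactive shell:

```shell
# Hypothetical names; replace them with your own.
ENV_NAME="${ENV_NAME:-project-a}"
PKG_NAME="${PKG_NAME:-numpy}"

if command -v conda >/dev/null 2>&1; then
    conda create --yes --name "$ENV_NAME"        # create the environment
    eval "$(conda shell.bash hook)"              # make 'conda activate' usable here
    conda activate "$ENV_NAME"                   # activate it
    conda search "$PKG_NAME"                     # search for a package
    conda install --yes "$PKG_NAME"              # install it
    conda list                                   # list the installed packages
    conda uninstall --yes "$PKG_NAME"            # uninstall the package
    conda deactivate                             # leave the environment
    conda remove --yes --name "$ENV_NAME" --all  # delete the environment
else
    echo "conda not found: load a Miniconda module first" >&2
fi
```

In an interactive session you would run conda init bash once instead of the eval line and then open a new shell.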

Activating a conda environment in a Slurm batch job

When you have set up your conda environment, you can activate it in a Slurm batch job, for example like this:

In addition, you need to make sure that your Slurm job invokes a login shell. Otherwise, conda will not source your environment and will report an error that you have not yet run conda init. This can be achieved by using

instead of #!/bin/bash at the beginning of your batch script.
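The batch script and the shebang line are missing from this page. The following is a sketch that writes a small job script to the hypothetical file conda_job.sh, assuming #!/bin/bash -l is the login-shell shebang that was meant. The job name, the time limit and the environment name ENVNAME are placeholders:

```shell
# Write a minimal job script; adjust the #SBATCH options and the
# environment name for your own job.
cat > conda_job.sh <<'EOF'
#!/bin/bash -l
#SBATCH --job-name=conda-example
#SBATCH --time=00:05:00

# The login shell (-l) sources your shell startup files, where
# conda init placed its setup, so conda activate is available.
conda activate ENVNAME
python --version
conda deactivate
EOF

# Check the script for syntax errors before submitting it.
bash -n conda_job.sh
```

You would then submit the job with sbatch conda_job.sh.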