This section provides information on how to install your Python packages with pip and conda in your user environment.
Note |
---|
Keep in mind that users do not have sudo permissions on the clusters. |
If you need assistance with installing software, please reach out to the support team for your cluster.
Python version
Various versions of Python are available on both clusters.
ALICE
The default version on ALICE is Python 3.6.8 (CentOS 7.X):
Code Block |
---|
[me@nodelogin02 ]$ python3 --version
Python 3.6.8 |
If you need another version of Python (older or newer), you can load it as a module. You can list all available Python versions this way:
Code Block |
---|
module -r avail '^Python/' |
Info |
---|
On ALICE, several larger Python packages have been built as modules (e.g., SciPy, TensorFlow, PyTorch). You can use these directly or as a starting point before installing additional packages. |
SHARK
The default version on SHARK is Python 3.6.8 (CentOS 8 Stream):
Code Block |
---|
[me@res-hpc-lo02 ]$ python3 --version
Python 3.6.8 |
If you need another version of Python (older or newer), you can load it as a module. You can list all available Python versions this way:
Code Block |
---|
module avail /python/ |
Note |
---|
Python 2 reached end-of-life in January 2020, so we focus on Python 3 alone here. |
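After loading a Python module, it can be useful to confirm which interpreter your shell now picks up. The following is a minimal sketch using only standard shell and Python commands; the variable name PYTHON_BIN is just illustrative.

```shell
# Show which python3 executable is found first on your PATH
PYTHON_BIN="$(command -v python3)"
echo "python3 resolves to: ${PYTHON_BIN}"

# Print the version of that interpreter
python3 --version
```

If the path still points to the system installation after loading a module, the module was probably not loaded in the current shell.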
pip
The Python package manager pip ( https://pypi.org/project/pip/ ) for Python 3 can be accessed using the command pip3 if you are using the default installation. If you are using a Python module or a virtual environment, you can also use pip.
Some useful commands are:
Command | Description |
---|---|
pip3 install <name> | Install the package with the name <name> |
pip3 install <name>==2.3 | Install version 2.3 of the package with the name <name> |
pip3 uninstall <name> | Uninstall the package with the name <name> |
pip3 search <name> | Search for a package with the name <name> |
pip3 list | List all installed Python packages |
pip3 help | Get information about using pip |
pip3 help <command> | Get information about the pip command <command> |
If you run pip3 list and get the following warning
Code Block |
---|
[me@res-hpc-lo02 ]$ pip3 list
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning. |
you can remove this warning by creating or editing your pip config in $HOME/.config/pip/pip.conf and adding the following lines:
Code Block |
---|
[list]
format=columns |
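If the configuration file does not exist yet, you can also create it from the command line. This is only a sketch under the assumption that you have no [list] section in your pip.conf so far; the variable name PIP_CONF is illustrative.

```shell
# Location of the per-user pip configuration (as given above)
PIP_CONF="$HOME/.config/pip/pip.conf"

# Create the directory if it does not exist yet
mkdir -p "$(dirname "$PIP_CONF")"

# Append the [list] section only if the file does not contain one already
if ! grep -q '^\[list\]' "$PIP_CONF" 2>/dev/null; then
    printf '[list]\nformat=columns\n' >> "$PIP_CONF"
fi

# Show the resulting configuration
cat "$PIP_CONF"
```

If your pip.conf already has a [list] section, edit the file by hand instead, so that the section is not duplicated.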
Installing packages with pip for a single project
If you only need Python packages for a single project, you can install them in your user environment with:
Code Block |
---|
pip3 install --user <packageName> |
This will install a Python package in the default user environment:
Code Block |
---|
$HOME/.local/lib/python3.6/site-packages |
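The exact directory depends on the Python version you are using, for example after loading a different Python module. As a small sketch, you can ask the interpreter itself where user packages go; the variable name USER_SITE is illustrative.

```shell
# Ask the active python3 where "pip3 install --user" places packages.
# The command returns a non-zero exit status if the user site is disabled,
# for example inside a virtual environment.
USER_SITE="$(python3 -m site --user-site)" || echo "note: user site-packages is currently disabled"
echo "User site-packages directory: ${USER_SITE}"
```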
Info |
---|
If you work with multiple projects or Python packages which have conflicting dependencies, we recommend using a virtual environment. |
Python Virtual Environments - For working on multiple projects
If you are working on different projects and need different Python packages for each of them, it is better to give each project its own virtual environment.
A virtual environment is an isolated Python environment with its own set of installed packages. While it is active, you can use the pip command (without the --user option), and packages are installed into the environment instead of your user environment.
You create a new virtual environment like this:
Code Block |
---|
python3 -m venv /path/to/new/virtual/environmentname |
where /path/to/new/virtual/environmentname should be replaced by the path and name of your virtual environment.
You can activate a new virtual environment with the command:
Code Block |
---|
source /path/to/new/virtual/environmentname/bin/activate |
Note how your prompt has changed: it now shows the name of the virtual environment in parentheses.
You can deactivate a virtual environment with the command (it will not be destroyed):
Code Block |
---|
deactivate |
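Besides looking at the prompt, you can also check from a script or job whether a virtual environment is currently active. The following sketch relies on the VIRTUAL_ENV variable that the activate script sets, and on the fact that sys.prefix differs from sys.base_prefix inside a venv; the variable name IN_VENV is illustrative.

```shell
# VIRTUAL_ENV is set by the activate script and unset again by deactivate
echo "Active virtual environment: ${VIRTUAL_ENV:-none}"

# Cross-check from Python: inside a venv, sys.prefix differs from sys.base_prefix
IN_VENV="$(python3 -c 'import sys; print(sys.prefix != sys.base_prefix)')"
echo "Running inside a venv: ${IN_VENV}"
```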
Example for setting up a Python Virtual Environment for pip
In this example, we will create a virtual environment in /exports/example/projects/Project-A. This path is just an example and has to be replaced by a valid path on SHARK or ALICE.
Code Block |
---|
[me@<login_node> ]$ python3 -m venv /exports/example/projects/Project-A
Using base prefix '/usr'
New python executable in /exports/example/projects/Project-A/bin/python3.6
Also creating executable in /exports/example/projects/Project-A/bin/python
Installing setuptools, pip, wheel...done. |
Next, we activate the environment:
Code Block |
---|
[me@<login_node> ]$ source /exports/example/projects/Project-A/bin/activate
(Project-A) [me@<login_node> ]$ |
Info |
---|
The packages in a virtual environment are only available while it is active. Therefore, you also need to source it in your Slurm batch file. |
Let’s list all the packages that are currently installed:
Code Block |
---|
(Project-A) [me@<login_node> ]$ pip3 list
Package    Version
---------- -------
pip        20.1.1
setuptools 49.1.0
wheel      0.34.2 |
Deactivating the environment goes like this:
Code Block |
---|
(Project-A) [me@<login_node> ]$ deactivate
[me@<login_node> ]$ |
To remove your Python virtual environment, just delete the virtual environment directory:
Code Block |
---|
rm -Rf /path/to/virtual/environmentname |
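As mentioned above, the virtual environment also has to be activated inside your Slurm batch file. The following sketch writes such a batch file for the Project-A example and checks it for syntax errors. The job name, the resource limits, the file name venv_job.sh and the program my_script.py are hypothetical placeholders, and your cluster may require additional options such as a partition.

```shell
# Write a small example batch script (all #SBATCH values are placeholders)
cat > venv_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=venv-example
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --mem=1G

# Activate the virtual environment created in the example above
source /exports/example/projects/Project-A/bin/activate

# Run your program with the packages from the environment
python3 my_script.py

# Leave the environment again at the end of the job
deactivate
EOF

# Check the script for syntax errors without running it
bash -n venv_job.sh && echo "venv_job.sh looks syntactically fine"
```

You would then submit the script with sbatch venv_job.sh.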
Making your Python environment re-usable
Pip allows you to create a requirements file that contains all the packages in your current environment.
This makes it easy to re-install the environment should something go wrong, or to make sure that you have the same environment on the cluster and on your workstation.
If you want to create the requirements file for a virtual environment, first activate the environment. Then, you can create the requirements file from your Python environment like this (using the example above):
Code Block |
---|
(Project-A) [me@<login_node> ]$ python3 -m pip freeze > requirements.txt |
You can re-install all packages in a new environment after creating the environment and activating it:
Code Block |
---|
python3 -m pip install -r requirements.txt |
Conda, Anaconda, Miniconda and Bioconda
If you have to install, set up and work with a complex program or project, you should make use of Conda. Conda itself is a package management system ( https://docs.conda.io/en/latest/# ), while Anaconda, Miniconda and Bioconda provide you with a virtual Python environment and a large number of optimized Python packages, especially for researchers and scientists. You can easily install these packages within this environment.
Anaconda - collection with the most packages (> 7,500 data science and machine learning packages): https://www.anaconda.com/
Miniconda - lightweight version of Anaconda (you should start with this version): https://www.anaconda.com/
Bioconda - specializing in bioinformatics software: https://bioconda.github.io/#
Useful commands
Command | Description |
---|---|
conda info | Verify Conda is installed, check version number |
conda create --name ENVNAME | Create a new environment named ENVNAME |
conda create --prefix /path/to/ENVNAME | Create a conda environment at the given path |
conda activate ENVNAME | Activate a named Conda environment |
conda deactivate | Deactivate current environment |
conda list | List all packages and versions in the active environment |
conda remove --name ENVNAME --all | Delete an entire environment |
conda search PKGNAME | Search for a package in currently configured channels |
conda install PKGNAME | Install a package |
conda search PKGNAME --info | Detailed information about package versions |
conda uninstall PKGNAME | Remove a package from an environment |
conda config --add channels CHANNELNAME | Add a channel to your Conda configuration |
Conda on ALICE and SHARK
On ALICE and SHARK, you can make use of Conda by loading one of the available Miniconda modules.
ALICE
Code Block |
---|
module -r avail ^Miniconda |
SHARK
Code Block |
---|
module avail /miniconda/ |
Each version of Miniconda comes with a specific Python version, though you can also switch to a different Python version within a conda environment.
Default conda location
By default conda downloads and installs all packages in
Code Block |
---|
$HOME/.conda |
Note |
---|
Because Conda installs all necessary dependencies for packages, including libraries, compilers, etc., Conda environments can become quite large (several GB). They might therefore not fit in your home directory, or at least take up a significant amount of space in it. In that case, it is best to install them in your scratch directory. |
Note |
---|
Conda also stores all downloaded packages, which can take up significant disk space as well. If you do not want this or do not have sufficient space, you can tell conda to download the packages to another location. |
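To see how much space conda currently takes up in your home directory, you can check the size of the default location. A minimal sketch; the variable name CONDA_DIR is illustrative.

```shell
# Default conda location in your home directory (see above)
CONDA_DIR="$HOME/.conda"

if [ -d "$CONDA_DIR" ]; then
    # Total size of all environments and downloaded packages
    du -sh "$CONDA_DIR"
else
    echo "$CONDA_DIR does not exist yet"
fi
```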
Changing default conda settings
You can change the default settings for conda in your $HOME/.condarc. If it does not exist yet, just create the file and add settings to it.
Installation directory
In order to change the default installation directory, add the following lines to your $HOME/.condarc:
Code Block |
---|
envs_dirs:
  - /path/to/conda/envs |
Make sure the path exists before trying to install packages, e.g., by running mkdir -p /path/to/conda/envs
Download directory for packages
In order to change the default download directory for packages, add the following lines to your $HOME/.condarc:
Code Block |
---|
pkgs_dirs:
  - /path/to/conda/pkgs |
Make sure the path exists before trying to install packages, e.g., by running mkdir -p /path/to/conda/pkgs
If you do not want to do this permanently, but just for the current session, you can also set the location using the environment variable CONDA_PKGS_DIRS, e.g.,
Code Block |
---|
export CONDA_PKGS_DIRS=/path/to/pkgs |
Disable automatically activating conda upon login
After running the init command, conda will be activated automatically when you log in. Because this can lead to conflicts when using other modules, we recommend disabling this behaviour and activating an environment only when needed.
You can disable this behaviour in your $HOME/.condarc:
Code Block |
---|
auto_activate_base: false |
You will still have access to the conda command in order to activate a specific environment, create new ones, etc.
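Put together, the three settings described above could look like the following file. This sketch writes the example to a separate file named condarc.example (a hypothetical name) so that an existing $HOME/.condarc is not overwritten by accident; the paths are the placeholders used above and have to be replaced.

```shell
# Write the example configuration to a separate file
cat > condarc.example <<'EOF'
# where new environments are created
envs_dirs:
  - /path/to/conda/envs
# where downloaded packages are stored
pkgs_dirs:
  - /path/to/conda/pkgs
# do not activate the base environment on login
auto_activate_base: false
EOF

# Review the file before copying its content into $HOME/.condarc
cat condarc.example
```

Note that the list entries are indented and start on their own line, which is what the YAML format of the .condarc expects.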
Making your conda environment re-usable
A conda environment can be installed from an environment.yml file. This allows you to quickly re-deploy your environment, share it with others, or make sure that you use the same environment on the cluster and your workstation.
First, activate the environment.
Then, create the file environment.yml (you can also give the .yml file another name):
Code Block |
---|
conda env export > environment.yml |
The first line in the .yml file contains the name of the environment.
In order to create a new environment based on your environment.yml, run the following command:
Code Block |
---|
conda env create -f environment.yml |
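If you want to use the same environment on the cluster and on a workstation with a different operating system, a full export can be too strict, because it pins every dependency. As a hedged sketch, recent conda versions offer the --from-history option of conda env export, which only lists the packages you requested explicitly. This option is not part of the description above, and the file name environment-minimal.yml and the variable EXPORT_STATUS are illustrative.

```shell
if command -v conda >/dev/null 2>&1; then
    # Export only the explicitly requested packages of the active environment
    conda env export --from-history > environment-minimal.yml
    EXPORT_STATUS="written"
else
    # conda is only available after loading a Miniconda module
    EXPORT_STATUS="skipped"
    echo "conda is not available in this shell; load a Miniconda module first"
fi
echo "Export status: ${EXPORT_STATUS}"
```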
Example for setting up a Conda environment
Here we provide a basic example for creating and managing your conda environment. Please see the Conda documentation for further information or alternative methods.
First, load a Miniconda module of your choice.
Next, you can create your first conda environment:
Code Block |
---|
[me@<login_node> ]$ conda create --prefix /home/username/.conda/envs/Project-B
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
  environment location: /home/username/.conda/envs/Project-B
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate /home/username/.conda/envs/Project-B
#
# To deactivate an active environment, use
#
#     $ conda deactivate |
Info |
---|
In this example, we did not change the default directory for our conda environment or specified a path with |
If you use conda for the first time, you need to initialize it for your shell:
Code Block |
---|
[me@<login_node> ]$ conda init bash
no change     /share/software/tools/miniconda/3.7/4.7.12/condabin/conda
no change     /share/software/tools/miniconda/3.7/4.7.12/bin/conda
no change     /share/software/tools/miniconda/3.7/4.7.12/bin/conda-env
no change     /share/software/tools/miniconda/3.7/4.7.12/bin/activate
no change     /share/software/tools/miniconda/3.7/4.7.12/bin/deactivate
no change     /share/software/tools/miniconda/3.7/4.7.12/etc/profile.d/conda.sh
no change     /share/software/tools/miniconda/3.7/4.7.12/etc/fish/conf.d/conda.fish
no change     /share/software/tools/miniconda/3.7/4.7.12/shell/condabin/Conda.psm1
no change     /share/software/tools/miniconda/3.7/4.7.12/shell/condabin/conda-hook.ps1
no change     /share/software/tools/miniconda/3.7/4.7.12/lib/python3.7/site-packages/xontrib/conda.xsh
no change     /share/software/tools/miniconda/3.7/4.7.12/etc/profile.d/conda.csh
modified      /home/username/.bashrc
==> For changes to take effect, close and re-open your current shell. <== |
Note that the paths shown above will look different depending on the cluster and Miniconda module.
The init script has added several commands to your .bashrc. After you have re-opened your shell, for example by logging out and in again, your prompt will have changed to
Code Block |
---|
(base) [me@<login_node> ]$ |
The conda base environment is now active by default.
Note |
---|
An active conda environment upon login is not recommended because it can lead to conflicts when using other modules or environments. It is best to disable loading the base environment automatically (see the section "Disable automatically activating conda upon login" above) and to activate an environment only in your batch file or as needed in your current session. |
Next, we activate the new environment:
Code Block |
---|
[me@<login_node> ]$ conda activate Project-B
(Project-B) [me@<login_node> ]$ |
Let’s search for a package:
Code Block |
---|
(Project-B) [me@<login_node> ]$ conda search beautifulsoup4
Loading channels: done
# Name                       Version           Build  Channel
beautifulsoup4                 4.6.0          py27_1  pkgs/main
...
beautifulsoup4                 4.9.1          py38_0  pkgs/main |
and install it:
Code Block |
---|
(Project-B) [me@<login_node> ]$ conda install beautifulsoup4
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
  environment location: /home/username/.conda/envs/Project-B
  added / updated specs:
    - beautifulsoup4
The following packages will be downloaded:
    package                    |            build
    ---------------------------|-----------------
    beautifulsoup4-4.9.1       |           py38_0         171 KB
    ca-certificates-2020.6.24  |                0         125 KB
    certifi-2020.6.20          |           py38_0         156 KB
    libedit-3.1.20191231       |       h7b6447c_0         167 KB
    libffi-3.3                 |       he6710b0_2          50 KB
    ncurses-6.2                |       he6710b0_1         817 KB
    openssl-1.1.1g             |       h7b6447c_0         2.5 MB
    pip-20.1.1                 |           py38_1         1.7 MB
    python-3.8.3               |       hcff3b4d_2        49.1 MB
    readline-8.0               |       h7b6447c_0         356 KB
    setuptools-47.3.1          |           py38_0         515 KB
    soupsieve-2.0.1            |             py_0          33 KB
    sqlite-3.32.3              |       h62c20be_0         1.1 MB
    tk-8.6.10                  |       hbc83047_0         3.0 MB
    xz-5.2.5                   |       h7b6447c_0         341 KB
    ------------------------------------------------------------
                                           Total:        60.1 MB
The following NEW packages will be INSTALLED:
  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
  beautifulsoup4     pkgs/main/linux-64::beautifulsoup4-4.9.1-py38_0
  ca-certificates    pkgs/main/linux-64::ca-certificates-2020.6.24-0
  certifi            pkgs/main/linux-64::certifi-2020.6.20-py38_0
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.33.1-h53a641e_7
  libedit            pkgs/main/linux-64::libedit-3.1.20191231-h7b6447c_0
  libffi             pkgs/main/linux-64::libffi-3.3-he6710b0_2
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-9.1.0-hdf63c60_0
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-9.1.0-hdf63c60_0
  ncurses            pkgs/main/linux-64::ncurses-6.2-he6710b0_1
  openssl            pkgs/main/linux-64::openssl-1.1.1g-h7b6447c_0
  pip                pkgs/main/linux-64::pip-20.1.1-py38_1
  python             pkgs/main/linux-64::python-3.8.3-hcff3b4d_2
  readline           pkgs/main/linux-64::readline-8.0-h7b6447c_0
  setuptools         pkgs/main/linux-64::setuptools-47.3.1-py38_0
  soupsieve          pkgs/main/noarch::soupsieve-2.0.1-py_0
  sqlite             pkgs/main/linux-64::sqlite-3.32.3-h62c20be_0
  tk                 pkgs/main/linux-64::tk-8.6.10-hbc83047_0
  wheel              pkgs/main/linux-64::wheel-0.34.2-py38_0
  xz                 pkgs/main/linux-64::xz-5.2.5-h7b6447c_0
  zlib               pkgs/main/linux-64::zlib-1.2.11-h7b6447c_3
Proceed ([y]/n)? y
Downloading and Extracting Packages
libedit-3.1.20191231 | 167 KB    | #################################### | 100%
sqlite-3.32.3        | 1.1 MB    | #################################### | 100%
readline-8.0         | 356 KB    | #################################### | 100%
pip-20.1.1           | 1.7 MB    | #################################### | 100%
python-3.8.3         | 49.1 MB   | #################################### | 100%
certifi-2020.6.20    | 156 KB    | #################################### | 100%
ncurses-6.2          | 817 KB    | #################################### | 100%
ca-certificates-2020 | 125 KB    | #################################### | 100%
setuptools-47.3.1    | 515 KB    | #################################### | 100%
xz-5.2.5             | 341 KB    | #################################### | 100%
openssl-1.1.1g       | 2.5 MB    | #################################### | 100%
libffi-3.3           | 50 KB     | #################################### | 100%
soupsieve-2.0.1      | 33 KB     | #################################### | 100%
beautifulsoup4-4.9.1 | 171 KB    | #################################### | 100%
tk-8.6.10            | 3.0 MB    | #################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done |
Let’s list all the packages that have been installed:
Code Block |
---|
(Project-B) [me@<login_node> ]$ conda list
# packages in environment at /home/username/.conda/envs/Project-B:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main
beautifulsoup4            4.9.1                    py38_0
ca-certificates           2020.6.24                     0
certifi                   2020.6.20                py38_0
ld_impl_linux-64          2.33.1               h53a641e_7
libedit                   3.1.20191231         h7b6447c_0
libffi                    3.3                  he6710b0_2
libgcc-ng                 9.1.0                hdf63c60_0
libstdcxx-ng              9.1.0                hdf63c60_0
ncurses                   6.2                  he6710b0_1
openssl                   1.1.1g               h7b6447c_0
pip                       20.1.1                   py38_1
python                    3.8.3                hcff3b4d_2
readline                  8.0                  h7b6447c_0
setuptools                47.3.1                   py38_0
soupsieve                 2.0.1                      py_0
sqlite                    3.32.3               h62c20be_0
tk                        8.6.10               hbc83047_0
wheel                     0.34.2                   py38_0
xz                        5.2.5                h7b6447c_0
zlib                      1.2.11               h7b6447c_3 |
Uninstalling a package can be done like this:
Code Block |
---|
(Project-B) [me@<login_node> ]$ conda uninstall -y beautifulsoup4
Collecting package metadata (repodata.json): done
Solving environment: done
## Package Plan ##
  environment location: /home/username/.conda/envs/Project-B
  removed specs:
    - beautifulsoup4
The following packages will be REMOVED:
  beautifulsoup4-4.9.1-py38_0
  soupsieve-2.0.1-py_0
Preparing transaction: done
Verifying transaction: done
Executing transaction: done |
Deactivating a conda environment goes like this:
Code Block |
---|
(Project-B) [me@<login_node> ]$ conda deactivate
[me@<login_node> ]$ |
Removing a conda environment can be done in this manner:
Code Block |
---|
[me@<login_node> ]$ conda remove --name Project-B --all -y
Remove all packages in environment /home/username/.conda/envs/Project-B:
## Package Plan ##
  environment location: /home/username/.conda/envs/Project-B
The following packages will be REMOVED:
  _libgcc_mutex-0.1-main
  ca-certificates-2020.6.24-0
  certifi-2020.6.20-py38_0
  ld_impl_linux-64-2.33.1-h53a641e_7
  libedit-3.1.20191231-h7b6447c_0
  libffi-3.3-he6710b0_2
  libgcc-ng-9.1.0-hdf63c60_0
  libstdcxx-ng-9.1.0-hdf63c60_0
  ncurses-6.2-he6710b0_1
  openssl-1.1.1g-h7b6447c_0
  pip-20.1.1-py38_1
  python-3.8.3-hcff3b4d_2
  readline-8.0-h7b6447c_0
  setuptools-47.3.1-py38_0
  sqlite-3.32.3-h62c20be_0
  tk-8.6.10-hbc83047_0
  wheel-0.34.2-py38_0
  xz-5.2.5-h7b6447c_0
  zlib-1.2.11-h7b6447c_3
Preparing transaction: done
Verifying transaction: done
Executing transaction: done |
Activating a conda environment in a slurm batch job
Once you have set up your conda environment, you can activate it in a Slurm batch job, for example like this:
Code Block |
---|
# using the name of the environment (replace <env_name>)
conda activate <env_name>
# using the path to the environment (replace the path)
conda activate /path/to/my/conda/environment |
In addition, you need to make sure that your Slurm job invokes a login shell. Otherwise, conda will not be able to activate your environment and will report an error saying that you have not yet run conda init. This can be achieved by using
Code Block |
---|
#!/bin/bash -l |
instead of #!/bin/bash at the beginning of your batch script.
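The two points above can be combined into one batch file. The following sketch writes such a file and checks it for syntax errors. The job name, the resource limits, the file name conda_job.sh and the program my_script.py are hypothetical placeholders, and your cluster may require additional options such as a partition.

```shell
# Write a small example batch script (all #SBATCH values are placeholders)
cat > conda_job.sh <<'EOF'
#!/bin/bash -l
#SBATCH --job-name=conda-example
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --mem=1G

# Activate the environment by its path (replace the path)
conda activate /path/to/my/conda/environment

# Run your program with the packages from the environment
python3 my_script.py

# Leave the environment again at the end of the job
conda deactivate
EOF

# Check the script for syntax errors without running it
bash -n conda_job.sh && echo "conda_job.sh looks syntactically fine"
```

You would then submit the script with sbatch conda_job.sh.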