Using available software - Environment Modules

This section explains how to use the system-wide software stack, which is organized in modules.

A module is a self-contained description of a software package - it contains the settings required to run a software package and, usually, encodes required dependencies on other software packages.

There are a number of different environment module implementations commonly used on HPC systems; the two most common are TCL modules and Lmod. Both use similar syntax and the concepts are the same, so learning to use one will allow you to use whichever is installed on the system you are using.

Both ALICE and SHARK use Lmod.

The module system makes it easy to switch between different software for different jobs and access multiple versions of the same software.

Basic usage

This table provides a short overview of the most commonly used sub-commands of the module command. Check the help for module to see all sub-commands with a description.

Command                                        Description
module -h or module --help                     Get help message
module list                                    Get a list of all currently loaded modules
module avail or module av                      Get list of all available modules (See below for details)
module avail <string> or module av <string>    Get list of all available modules which contain <string> in their name (See below for details)
module spider <string>                         Get list of all modules matching <string> and their description
module whatis <module>                         Get the description of the given <module>
module load <module>                           Load the specified <module> (See below for details)
module unload <module>                         Unload the module <module> (See below for details)
module reset                                   Reset module environment to default list of modules (See below for details)
module purge                                   Unload all loaded modules (See below for details)
module save                                    Save the current list of modules, e.g., as a default environment
module restore                                 Restore module environment to a default list of modules

The module structure on ALICE and SHARK

ALICE and SHARK have different module structures. This is partly related to how the system-wide software stack is being managed on both clusters. However, the basic usage of environment modules is the same on both clusters.

Once you are familiar with how modules are organized on the cluster that you are using, it will not be difficult for you to find the modules that you are looking for.


ALICE

On ALICE, modules are organized in different module trees. The software behind each module is built and maintained using EasyBuild (https://easybuild.io/).

In each tree, the modules are organized as <name>/<version>, where <version> can also include additional information about what was used to build the module <name>.
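As a rough illustration of the <name>/<version> convention, the following shell sketch splits a module name from the ALICE module listing into its two parts. The variable names are chosen only for this example.

```shell
# Module name as it appears in the output of "module avail" on ALICE.
module_name="MultiQC/1.7-foss-2018b-Python-3.6.6"

# Everything before the first slash is the software name ...
name="${module_name%%/*}"
# ... and everything after it is the version, including build information.
version="${module_name#*/}"

echo "name:    $name"
echo "version: $version"
```

In this case the version string also records that the build used the foss-2018b toolchain and Python 3.6.6.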

The default trees are:

Tree: /cm/shared/easybuild/modules/all or /cm/shared/easybuild/GenuineIntel/modules/all
Description: Default tree, containing all of the system-wide software built with EasyBuild for nodes with Intel CPUs. Available by default, after loading the ALICE/Intel module, or, on Intel-powered nodes, after loading the ALICE/default module.

Tree: /cm/shared/modulefiles
Description: Contains the modules for Slurm and LUMC/FSL software.

Additional trees that need to be loaded manually are:

Tree: /cm/shared/LUMC/modulefiles
Description: Contains all LUMC-provided modules, including FSL. Available after loading the module LUMC.

Tree: /cm/shared/easybuild/AuthenticAMD/modules/all
Description: Contains all modules built with EasyBuild for nodes with AMD CPUs. Available after loading the ALICE/AMD module or, on AMD-powered nodes, the ALICE/default module.


SHARK

On SHARK, all software is contained in the tree /share/modulefiles.

Modules are organized in a hierarchical structure, i.e., <moduleclass>/<name>/<version>, where <moduleclass> is the class to which the module belongs, <name> is the name of the module, and <version> is the installed version of the module <name>.

The different module classes currently are:

  • benchmark

  • bioimage

  • bioinformatics

  • container

  • cryogenicEM

  • genomics

  • graphics

  • gwas

  • library

  • mathematical

  • medicalImaging

  • neuroImaging

  • pharmaceutical

  • statistical

  • system

  • tools
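The <moduleclass>/<name>/<version> layout can be taken apart in the shell as sketched below. The module name used here is an assumption made for illustration: the class statistical is taken from the list above, but whether R is filed under it, and in which version, has to be checked with module avail on SHARK.

```shell
# Hypothetical SHARK module: class "statistical", software "R", version "4.1.0".
module_name="statistical/R/4.1.0"

# Split the name at the slashes into its three components.
IFS=/ read -r module_class name version <<EOF
$module_name
EOF

echo "class:   $module_class"
echo "name:    $name"
echo "version: $version"
```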


List currently loaded modules

On login, you might start out with a default set of modules.

The module list command shows which modules you currently have loaded in your environment.


ALICE

This is what you will most likely see after logging into ALICE.

[me@nodelogin02~]$ module list

Currently Loaded Modules:
  1) shared   2) DefaultModules   3) gcc/8.2.0   4) slurm/19.05.1

You can see that, by default, the modules for Slurm and the gcc compiler are loaded. The number after the slash is the version number.


SHARK

On SHARK, no modules are loaded by default. After login, you would see:

[me@res-hpc-lo02 ~]$ module list
No modules loaded

List available modules

To see the available modules, use module -d avail. With the -d option, only the default version of each module is listed. For many software packages, older or alternative versions are also available and can be used if necessary. To see all versions, omit the -d option.


ALICE

[me@nodelogin02~]$ module -d avail

------------------------------ /cm/shared/easybuild/GenuineIntel/modules/all ------------------------------
4ti2/1.6.9-GCC-8.2.0-2.31.1                   Miniconda2/4.7.10                        flatbuffers/1.12.0-GCCcore-10.2.0
ALICE/default (L)                             Miniconda3/4.9.2                         flex/2.6.4
AMUSE-GPU/12.0.0-foss-2018a-Python-2.7.14     MotionCor2/1.3.0-GCCcore-8.2.0           fontconfig/2.13.93-GCCcore-10.3.0
AMUSE-Miniconda2/4.7.10                       MrBayes/3.2.6-foss-2017a                 foss/2021a
AMUSE-VADER/12.0.0-foss-2018a-Python-2.7.14   MultiQC/1.7-foss-2018b-Python-3.6.6      fosscuda/2020b
AMUSE/2021.7-Miniconda3-4.9.2                 NASM/2.15.05-GCCcore-10.3.0              freeglut/3.2.1-GCCcore-8.3.0
ANTs/2.3.5-foss-2019b-Python-3.7.4            NCCL/2.10.3-GCCcore-10.3.0-CUDA-11.3.1   freetype/2.10.4-GCCcore-10.3.0
APR-util/1.6.1-GCCcore-8.2.0                  NGS/2.10.0-GCCcore-8.2.0-Java-11         gc/7.6.12-GCCcore-9.3.0
APR/1.7.0-GCCcore-8.2.0                       NLTK/3.2.4-foss-2019a-Python-3.7.2       gcccuda/2020b
ARPwARP/8.0                                   NLopt/2.6.2-GCCcore-10.2.0               gensim/3.7.3-foss-2019a-Python-3.7.2
# etc...

SHARK


Finding specific modules

If you are searching for a specific software package or tool you can search for the full module name like this:
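A minimal sketch of such a search, assuming Lmod is available in the current shell. The search string Python is only an example, and the helper search_modules is a name made up for this sketch; what the commands print depends on the cluster.

```shell
# Helper for this example: search the module trees for a name.
search_modules() {
    module avail "$1"     # available modules whose name contains the string
    module spider "$1"    # all matching modules together with their description
}

# Only run the search where the module command exists, e.g. on a login node.
if command -v module >/dev/null 2>&1; then
    search_modules Python
fi
```

On the command line you would normally type the two module commands directly; the function only keeps the sketch self-contained.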


ALICE

To only search for the module Python itself, use


SHARK


In the above output, you can see that there are modules with the flag "(D)". This indicates the default module for a software package for which modules for different versions exist.

If you want to get more information about a specific module, you can use the whatis sub-command, e.g.
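As a small sketch, the query could look like this. The module name ALICE/default is taken from the ALICE module listing and is only an example; it will not exist on SHARK, so replace it with the module you are interested in.

```shell
# Example module name taken from the ALICE listing; use any module you like.
mod="ALICE/default"

# Run the query only where the module command exists, e.g. on a login node.
if command -v module >/dev/null 2>&1; then
    module whatis "$mod"
else
    echo "module command not available in this shell"
fi
```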


ALICE


SHARK


Load modules

To load a module, use module load.

In the example below, we will use R to demonstrate how the module system works and what it does to your user environment.

Initially, R is not loaded and therefore the command Rscript is not available. We can test this with the command which, which looks for programs the same way that Bash does and tells us where a particular piece of software is stored.
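The check can be sketched as follows. This version uses the shell built-in command -v instead of which, because it behaves the same in scripts and interactive shells; the helper name is_available is made up for this example.

```shell
# Report whether a command can be found via $PATH (sketch).
is_available() {
    command -v "$1" >/dev/null 2>&1
}

if is_available Rscript; then
    echo "Rscript found: $(command -v Rscript)"
else
    echo "Rscript not found, the R module is probably not loaded"
fi
```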


ALICE


SHARK


The list of paths in the brackets after “no Rscript in” comes from the environment variable $PATH.

Let us first find a module for R which contains the Rscript command:


ALICE


SHARK


Now that we have found the module, we can load it, list all loaded modules just to make sure and check whether Rscript is now available:


ALICE

While you can also just use module load R, this is not recommended: it always loads the default version of the module, and the default may change when newer versions are added.
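One way to enforce this in job scripts is a small wrapper that refuses module names without a version. This is only a sketch: load_pinned is a hypothetical helper, not part of Lmod, and it assumes the <name>/<version> layout used on ALICE.

```shell
# Load a module only if an explicit version is given (name/version).
# load_pinned is a hypothetical helper, not part of Lmod.
load_pinned() {
    case "$1" in
        */?*) module load "$1" ;;
        *)    echo "refusing to load '$1': no version given" >&2
              return 1 ;;
    esac
}

if command -v module >/dev/null 2>&1; then
    load_pinned R    # rejected: this would load whatever the default is
fi
```

Calling load_pinned with a full name as shown by module avail, i.e. R followed by a slash and the version, passes the name on to module load unchanged.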

On ALICE, the long list of modules is expected: the module R depends on several other modules, which have to be loaded for R to work properly.


SHARK


So what just happened? To understand the output, first we need to understand the nature of the $PATH environment variable. $PATH is a special environment variable that controls where a Linux operating system (OS) looks for software. Specifically $PATH is a list of directories (separated by :) that the OS searches through for a command. As with all environment variables, we can print it using echo.
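Because the directories are separated by colons, the raw output of echo can be hard to read. As a small sketch, the following prints one directory per line in the order in which they are searched; the variable first_dir is introduced only for this example.

```shell
# Print every directory in $PATH on its own line, in search order.
printf '%s\n' "$PATH" | tr ':' '\n'

# The first entry is searched first.
first_dir="${PATH%%:*}"
echo "searched first: $first_dir"
```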


ALICE

This is what $PATH looks like after loading R on ALICE:


SHARK

This is what $PATH looks like after loading R on SHARK:


After loading the module for R, the $PATH variable has been changed to include the directories holding executables of the newly loaded modules.

Taking this to its conclusion, module load adds software to your $PATH. It “loads” software.
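The effect can be imitated without any module system. The sketch below is an illustration only: the temporary directory and the command name hello-tool are made up, and a real module may change more than $PATH.

```shell
# Stand-in for a software installation: a temporary directory with one executable.
# The directory and the command name "hello-tool" are made up for this example.
tool_dir="$(mktemp -d)"
printf '#!/bin/sh\necho "hello from hello-tool"\n' > "$tool_dir/hello-tool"
chmod +x "$tool_dir/hello-tool"

# Before: the shell cannot find the command.
command -v hello-tool || echo "hello-tool: not found"

# Roughly what "module load" does: put the directory at the front of $PATH.
old_path="$PATH"
PATH="$tool_dir:$PATH"
export PATH
output="$(hello-tool)"
echo "$output"

# Roughly what "module unload" does: restore the previous $PATH.
PATH="$old_path"
export PATH
rm -rf "$tool_dir"
```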

A note of warning: when you load several modules, their dependencies can conflict and cause problems later on. It is best to always check which other modules have been loaded automatically.

Unload modules

The command module unload “un-loads” a module. Using the above example of R


ALICE

As you can see, the module for R was “unloaded”.


SHARK


If you want to unload all currently loaded modules, you can use the command module purge.
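A minimal sketch of both commands, assuming Lmod is available and a module for R has been loaded before. The helper unload_and_list is a name made up for this example, and on SHARK the module name would also include the module class.

```shell
# Unload one module and show what is still loaded (sketch).
unload_and_list() {
    module unload "$1"
    module list
}

if command -v module >/dev/null 2>&1; then
    unload_and_list R    # R stands for whichever module you loaded before
    module purge         # afterwards, remove everything that is still loaded
    module list
fi
```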

Saving and restoring (default) modules

If you often work with the same set of modules, you can save the list of loaded modules and restore it at any time.

Let’s start from a clean module environment:


ALICE

Load a set of modules and save them as default

If you need to restore them:

To test the effect of restoring them, either load another module or purge the module environment. You can check with module list that the default modules have been loaded.

You can also define multiple different module environments by specifying a name when saving the module list:

And restoring works like this:


SHARK

Load a set of modules and save them as default

If you need to restore them:

To test the effect of restoring them, either load another module or purge the module environment. You can check with module list that the default modules have been loaded.

You can also define multiple different module environments by specifying a name when saving the module list:

And restoring works like this:
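The save and restore cycle is the same on both clusters and can be sketched as below. The collection name my-r-env and the helper roundtrip_collection are made up for this example; the sketch uses a named collection so that running it does not replace your default collection.

```shell
# Save the loaded modules under a name, clear the environment, and restore it.
# Without a name, "module save" and "module restore" act on the default collection.
roundtrip_collection() {
    module save "$1"
    module purge
    module restore "$1"
    module list
}

if command -v module >/dev/null 2>&1; then
    roundtrip_collection my-r-env    # hypothetical collection name
fi
```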


You can get a full list of saved module environments like this:

A saved collection of modules can be disabled with

where <name> should be replaced by the name of the collection. This does not delete the collection file itself; that has to be done manually.

The various collections of modules are stored in your home directory in: ~/.lmod.d
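Managing saved collections can be sketched as follows, using the Lmod sub-commands savelist and disable. The collection name my-r-env is hypothetical, and the directory is the one stated above.

```shell
# Directory in which the collections are stored, as stated above.
collection_dir="$HOME/.lmod.d"

if command -v module >/dev/null 2>&1; then
    module savelist               # names of all saved collections
    module disable my-r-env       # hypothetical collection name
fi

# Disabling leaves a file behind; list the directory to see what is there
# before deleting anything by hand.
if [ -d "$collection_dir" ]; then
    ls -la "$collection_dir"
else
    echo "no collections saved yet in $collection_dir"
fi
```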