Using available software - Environment Modules
This section explains how to use the system-wide software stack, which is organized in modules.
A module is a self-contained description of a software package - it contains the settings required to run a software package and, usually, encodes required dependencies on other software packages.
There are a number of different environment module implementations commonly used on HPC systems: the two most common are TCL modules and Lmod. Both of these use similar syntax and the concepts are the same so learning to use one will allow you to use whichever is installed on the system you are using.
Both ALICE and SHARK use Lmod.
The module system makes it easy to switch between different software for different jobs and access multiple versions of the same software.
Basic usage
This table provides a short overview of the most commonly used options for the command module. Check the help for module to see all commands with a description.
Command | Description |
---|---|
module help | Get help message |
module list | Get a list of all currently loaded modules |
module avail | Get list of all available modules (See below for details) |
module avail <string> | Get list of all available modules which contain <string> |
module spider <string> | Get list of all modules matching <string> |
module whatis <module> | Get the description of the given <module> |
module load <module> | Load the specified <module> |
module unload <module> | Unload the module <module> |
module reset | Reset module environment to default list of modules (See below for details) |
module purge | Unload all loaded modules (See below for details) |
module save | Save the current list of modules, e.g., as a default environment |
module restore | Restore module environment to a default list of modules |
The module structure on ALICE and SHARK
ALICE and SHARK have different module structures. This is partly related to how the system-wide software stack is being managed on both clusters. However, the basic usage of environment modules is the same on both clusters.
Once you are familiar with how modules are organized on the cluster that you are using, it will not be difficult for you to find the modules that you are looking for.
ALICE
On ALICE, modules are organized in different module trees. The software behind each module is built and maintained using EasyBuild (https://easybuild.io/).
In each tree, the modules are organized as <name>/<version>, where <version> can also include additional information about what was used to build the module <name>.
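The <name>/<version> convention can be taken apart with plain shell parameter expansion. The following is a minimal sketch that uses a module name from the ALICE listing shown further down in this document; the variable names are purely illustrative.

```shell
# Full module name as it appears in the ALICE module listing.
module_fullname="MultiQC/1.7-foss-2018b-Python-3.6.6"

# Everything before the first slash is the software name.
module_name="${module_fullname%%/*}"
# Everything after the first slash is the version, including build details.
module_version="${module_fullname#*/}"

echo "name:    ${module_name}"
echo "version: ${module_version}"
```

In this example, the version string reads as MultiQC 1.7 built with the foss-2018b toolchain against Python 3.6.6, following the usual EasyBuild naming pattern.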
The default trees are:
Tree | Description |
---|---|
| Default tree, containing all of the system-wide software built with EasyBuild for all nodes with Intel CPUs. Available by default or after loading |
| Containing the modules for Slurm and LUMC/FSL software |
Additionally available trees that need to be loaded manually are:
Tree | Description |
---|---|
| Containing all LUMC-provided modules including FSL. Available after loading the module |
and | Old modules for Intel and AMD nodes from before the maintenance of May 2024. Because they were built on the old operating system, they may or may not work. You can access them by loading the ALICE/legacy module. The module tree is chosen based on the CPU type. |
SHARK
On SHARK, all software is contained in the tree /share/modulefiles.
Modules are organized in a hierarchical structure, i.e., <moduleclass>/<name>/<version>, where <moduleclass> is the class to which the module belongs, <name> is the name of the module, and <version> is the installed version of the module <name>.
The different module classes currently are:
benchmark
bioimage
bioinformatics
container
cryogenicEM
genomics
graphics
gwas
library
mathematical
medicalImaging
neuroImaging
pharmaceutical
statistical
system
tools
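The three-level SHARK layout can be split into its parts by reading the name with a slash as field separator. This is only a sketch: the module name statistical/R/4.1.0 is a hypothetical example chosen to match the <moduleclass>/<name>/<version> pattern and may not exist on SHARK.

```shell
# Hypothetical SHARK module name (assumption, for illustration only).
shark_module="statistical/R/4.1.0"

# Split on "/" into class, name and version.
IFS=/ read -r module_class module_name module_version <<EOF
${shark_module}
EOF

echo "class:   ${module_class}"
echo "name:    ${module_name}"
echo "version: ${module_version}"
```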
List currently loaded modules
On login, you might start out with a default set of modules.
The module list command shows which modules you currently have loaded in your environment.
ALICE
This is what you will most likely see after logging into ALICE.
[me@nodelogin02~]$ module list
Currently Loaded Modules:
1) shared 2) DefaultModules 3) gcc/8.2.0 4) slurm/19.05.1
You can see that by default the modules for Slurm and the gcc compiler are loaded. The number after the slash is the version number.
SHARK
On SHARK, no modules are loaded by default. After login, you would see:
[me@res-hpc-lo02 ~]$ module list
No modules loaded
List available modules
To see the available modules, use module -d avail. With the -d option, you will only get the default versions of the modules. For various software packages, older or other versions are also available and can be used if necessary. You can see all versions by omitting the -d option.
ALICE
[me@nodelogin02~]$ module -d avail
---------------------------------------------------------------------- /cm/shared/easybuild/GenuineIntel/modules/all ----------------------------------------------------------------------
4ti2/1.6.9-GCC-8.2.0-2.31.1 Miniconda2/4.7.10 flatbuffers/1.12.0-GCCcore-10.2.0
ALICE/default (L) Miniconda3/4.9.2 flex/2.6.4
AMUSE-GPU/12.0.0-foss-2018a-Python-2.7.14 MotionCor2/1.3.0-GCCcore-8.2.0 fontconfig/2.13.93-GCCcore-10.3.0
AMUSE-Miniconda2/4.7.10 MrBayes/3.2.6-foss-2017a foss/2021a
AMUSE-VADER/12.0.0-foss-2018a-Python-2.7.14 MultiQC/1.7-foss-2018b-Python-3.6.6 fosscuda/2020b
AMUSE/2021.7-Miniconda3-4.9.2 NASM/2.15.05-GCCcore-10.3.0 freeglut/3.2.1-GCCcore-8.3.0
ANTs/2.3.5-foss-2019b-Python-3.7.4 NCCL/2.10.3-GCCcore-10.3.0-CUDA-11.3.1 freetype/2.10.4-GCCcore-10.3.0
APR-util/1.6.1-GCCcore-8.2.0 NGS/2.10.0-GCCcore-8.2.0-Java-11 gc/7.6.12-GCCcore-9.3.0
APR/1.7.0-GCCcore-8.2.0 NLTK/3.2.4-foss-2019a-Python-3.7.2 gcccuda/2020b
ARPwARP/8.0 NLopt/2.6.2-GCCcore-10.2.0 gensim/3.7.3-foss-2019a-Python-3.7.2
# etc...
SHARK
Finding specific modules
If you are searching for a specific software package or tool you can search for the full module name like this:
ALICE
To only search for the module Python itself, use
SHARK
In the above output, you can see that some modules are marked with the flag "(D)". This marks the default module of a software package for which modules for several versions exist.
If you want to get more information about a specific module, you can use the whatis sub-command, e.g.
ALICE
SHARK
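The search and whatis steps described in this section can be sketched as follows. This is an illustration under assumptions rather than the exact commands used on either cluster: it relies on Lmod's module avail <string> and module whatis <module>, uses Python as the search term because the text mentions it, and skips the commands when no module command is present.

```shell
# Search term taken from the text above; replace it with your own software name.
search_term="Python"

if type module >/dev/null 2>&1; then
    # List all available modules whose name contains the search term.
    module avail "${search_term}"
    # Print the short description of the default module with that name.
    module whatis "${search_term}"
    search_status="searched"
else
    echo "The module command is not available in this shell."
    search_status="skipped"
fi
```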
Load modules
To load a module, use module load.
In the example below, we will use R to demonstrate how the module system works and what it does to your user environment.
Initially, R is not loaded and therefore the command Rscript is not available for use. We can test this with the command which, which looks for programs the same way that Bash does. We can use it to tell us where a particular piece of software is stored.
ALICE
SHARK
The list of paths in the brackets after “no Rscript in” comes from the environment variable $PATH.
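The same availability check can be scripted, for example at the top of a job script. The sketch below uses the shell built-in command -v instead of which; both look a program up along $PATH, but command -v does not depend on an external tool being installed.

```shell
# Look up Rscript the same way the shell would when asked to run it.
if rscript_path=$(command -v Rscript); then
    echo "Rscript found at: ${rscript_path}"
else
    echo "Rscript is not on PATH yet; an R module has to be loaded first."
fi
```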
Let us first find a module for R, which contains the Rscript command:
ALICE
SHARK
Now that we have found the module, we can load it, list all loaded modules just to make sure, and check whether Rscript is now available:
ALICE
While you can also just use module load R, this is not recommended because this command always loads the default version of a module, and the default may change when newer versions are added.
On ALICE, the long list of modules is completely fine because the module R depends on several other modules which have to be loaded for R to work properly.
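Pinning the full <name>/<version> keeps a job script reproducible. Since no R version is given in this document, the sketch below uses Miniconda3/4.9.2 from the ALICE listing instead; treat the module name as a placeholder for whatever you actually need.

```shell
# Module taken from the ALICE listing in this document (placeholder for your own).
pinned_module="Miniconda3/4.9.2"

if type module >/dev/null 2>&1; then
    # Pinned: loads the same version every time, as long as it stays installed.
    module load "${pinned_module}"
    # Unpinned alternative (not recommended): loads whatever the default is today.
    # module load "${pinned_module%%/*}"
else
    echo "The module command is not available; would run: module load ${pinned_module}"
fi
```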
SHARK
So what just happened? To understand the output, we first need to understand the nature of the $PATH environment variable. $PATH is a special environment variable that controls where a Linux operating system (OS) looks for software. Specifically, $PATH is a list of directories (separated by :) that the OS searches through for a command. As with all environment variables, we can print it using echo.
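Because $PATH is a single colon-separated string, it is easier to read with one directory per line. A small sketch; the variable names are illustrative only.

```shell
# Turn the colon-separated list into one directory per line.
path_entries=$(printf '%s\n' "${PATH}" | tr ':' '\n')
printf '%s\n' "${path_entries}"

# Directories are searched from top to bottom; the first match wins.
path_count=$(( $(printf '%s\n' "${path_entries}" | wc -l) ))
echo "PATH has ${path_count} entries"
```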
ALICE
This is what $PATH looks like after loading R on ALICE:
SHARK
This is what $PATH looks like after loading R on SHARK:
After loading the module for R, the $PATH variable has been changed to include the directories holding the executables of the newly loaded modules.
Taking this to its conclusion, module load adds software to your $PATH. It “loads” software.
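The effect on $PATH can be reproduced by hand without any module system. The sketch below is an imitation, not how Lmod is implemented: it creates a throwaway directory with a made-up executable called demo-tool, prepends that directory to $PATH, and shows that the shell finds the tool only afterwards.

```shell
# Create a throwaway "software installation" with one executable in bin/.
demo_root=$(mktemp -d)
mkdir -p "${demo_root}/bin"
printf '#!/bin/sh\necho "hello from demo-tool"\n' > "${demo_root}/bin/demo-tool"
chmod +x "${demo_root}/bin/demo-tool"

# Before: the shell cannot find the tool.
command -v demo-tool >/dev/null 2>&1 || echo "before: demo-tool not found"

# In essence, this is the change that loading a module makes to PATH.
old_path="${PATH}"
PATH="${demo_root}/bin:${PATH}"

# After: the tool is found in the newly added directory and can be run.
found_at=$(command -v demo-tool)
echo "after: ${found_at}"
tool_output=$(demo-tool)
echo "${tool_output}"

# Undo the change, similar in spirit to unloading a module, and clean up.
PATH="${old_path}"
rm -rf "${demo_root}"
```

Real modules typically adjust further environment variables as well, for example library search paths, but the principle is the same.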
Also a note of warning: When you load several modules, their dependencies may conflict and cause problems later on. It is best to always check which other modules have been loaded automatically.
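One way to see which modules came along automatically is to compare the list of loaded modules before and after a load. This is a sketch under assumptions: it uses Lmod's terse output (module -t list), captures stderr because Lmod usually prints there, and uses Miniconda3/4.9.2 from the ALICE listing as a stand-in for the module you want to inspect.

```shell
# Module taken from the ALICE listing in this document (placeholder for your own).
module_to_load="Miniconda3/4.9.2"

if type module >/dev/null 2>&1; then
    before_file=$(mktemp)
    after_file=$(mktemp)
    # -t prints one module per line; 2>&1 captures output written to stderr.
    module -t list 2>&1 | sort > "${before_file}"
    module load "${module_to_load}"
    module -t list 2>&1 | sort > "${after_file}"
    # Lines that occur only in the second file are the newly loaded modules.
    echo "Newly loaded modules:"
    comm -13 "${before_file}" "${after_file}"
    rm -f "${before_file}" "${after_file}"
else
    echo "The module command is not available in this shell."
fi
```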
Unload modules
The command module unload “un-loads” a module. Using the above example of R:
ALICE
As you can see, the module for R was “unloaded”.
SHARK
If you want to unload all currently loaded modules, you can use the command module purge.
Saving and restoring (default) modules
If you often work with the same set of modules, you can save the list of loaded modules and restore it at any time.
Let’s start from a clean module environment:
ALICE
Load a set of modules and save them as default
If you need to restore them:
To test the effect of restoring them, either load another module or purge the module environment. You can check with module list that the default modules have been loaded.
You can also define multiple different module environments by specifying a name when saving the module list:
And restoring works like this:
SHARK
Load a set of modules and save them as default
If you need to restore them:
To test the effect of restoring them, either load another module or purge the module environment. You can check with module list that the default modules have been loaded.
You can also define multiple different module environments by specifying a name when saving the module list:
And restoring works like this:
You can get a full list of saved module environments like this:
Disabling a saved collection of modules can be done with
where <name> should be replaced by the name of the collection. This will not entirely remove the file though. This has to be done manually.
The various collections of modules are stored in your home directory in: ~/.lmod.d
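The collection workflow from this section can be sketched end to end as follows. It assumes Lmod's save, savelist, restore and disable sub-commands; my_project is a hypothetical collection name used only for illustration. The module commands are skipped when no module command is present, and the last part simply lists the files in ~/.lmod.d.

```shell
# Hypothetical collection name (assumption, for illustration only).
collection_name="my_project"

if type module >/dev/null 2>&1; then
    module save "${collection_name}"     # save the loaded modules under a name
    module savelist                      # list all saved collections
    module purge                         # start again from an empty environment
    module restore "${collection_name}"  # bring the saved modules back
    module disable "${collection_name}"  # disable it; the file is kept on disk
fi

# Collections are stored as files in ~/.lmod.d and can be removed by hand.
collection_dir="${HOME}/.lmod.d"
if [ -d "${collection_dir}" ]; then
    ls -a "${collection_dir}"
else
    echo "No saved collections found in ${collection_dir}"
fi
```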