Apptainer/Singularity containers

Introduction

This document explains some basics about the Apptainer/Singularity container system, why it may be useful to you, and some tricks to work with it on the cluster.

Apptainer/Singularity is a flexible container system, which allows you to create an environment that remains stable for the software you want to use. Dependencies remain in one place, allowing you to easily reuse it on different machines and clusters, while adding little overhead.

Apptainer was previously known as Singularity and was developed by many research facilities around the world. Later on, the original authors added commercial support, which helped continue the development of Singularity. In 2021, the commercial and open source projects were split up, and the open source version of Singularity was renamed to Apptainer. As a result, the two versions differ in some respects and remain alike in others:

  • Apptainer versions restarted at 1.0, while the old, open source Singularity version remained at 3.6; the commercial version continued from 3.6 onwards.

  • Apptainer and Singularity CE each have their own official documentation. The two are similar; refer to the one that matches your version when the documentation here is not exhaustive.

  • Apptainer and Singularity store built images in the same file format, known as SIF. Both can also build an image from a Singularity definition file as well as from other sources, like Docker. We discuss the images and the build process in this document.

  • Environment variables for Apptainer start with APPTAINER_; the versions starting with SINGULARITY_ are deprecated.

On the clusters, both options are available.

ALICE


The command line programs singularity and apptainer work the same. You no longer need to load a module, as Apptainer is installed globally.

For legacy purposes, the modules for Singularity still exist, but we recommend using Apptainer.

SHARK


Modules exist for both Apptainer and Singularity on SHARK.

You can find them by running module avail apptainer or module avail singularity.


Unless otherwise mentioned, we will refer to Apptainer for both Apptainer and Singularity.
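Because the two commands work the same, a script that has to run on both clusters can check which one is on the PATH before using it. The following is a minimal sketch under that assumption, not a site-provided script; the variable name CONTAINER_CMD is chosen here purely for illustration.

```shell
# Prefer apptainer, fall back to singularity, and report if neither is found.
if command -v apptainer >/dev/null 2>&1; then
    CONTAINER_CMD=apptainer
elif command -v singularity >/dev/null 2>&1; then
    CONTAINER_CMD=singularity
else
    CONTAINER_CMD=""
    echo "No container command found; on SHARK, check 'module avail apptainer'." >&2
fi
echo "container command: ${CONTAINER_CMD:-none}"
```

Later commands in the same script can then be written as "$CONTAINER_CMD" run image.sif.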

Preparations

It is relevant to consider what you plan to do with an Apptainer container before you start. Using a container can help you with running a software package that is not installed globally on the clusters, but it does require you to find or build an image. Often, some kind of image is already available, although it may not be in the format that you expect.

Temporary files

When building, pulling or running an image, Apptainer needs to create temporary files and cache files. It is important to know that

  • the temporary files are usually stored in /tmp, but this space can fill up quickly. This is mainly an issue on ALICE, especially on the login nodes. On SHARK, space is less of a concern, but changing the default location is still recommended, because /tmp is shared by all users.

  • the cache files are stored in a “hidden” subdirectory of your home directory by default. This can quickly use up the quota of your home directory on ALICE or SHARK.

Therefore, it is a good idea to adjust the paths where Apptainer stores these files. For a SLURM job, you can change these paths by adding these lines before you use Apptainer:


ALICE

    export APPTAINER_TMPDIR="$SCRATCH/.apptainer-tmp"
    export APPTAINER_CACHEDIR="$SCRATCH/.apptainer-cache"
    mkdir -p "$APPTAINER_TMPDIR"
    mkdir -p "$APPTAINER_CACHEDIR"

The last two lines ensure that the directories exist.

The $SCRATCH environment variable is set on ALICE by default to point to a directory for your SLURM job on the node’s /scratchdata. All the files and directories below it are deleted afterwards, which helps conserve space.

 


SHARK

    export APPTAINER_TMPDIR="/tmp/$USER/.apptainer-tmp"
    export APPTAINER_CACHEDIR="/tmp/$USER/.apptainer-cache"
    mkdir -p "$APPTAINER_TMPDIR"
    mkdir -p "$APPTAINER_CACHEDIR"
    chmod 700 "$APPTAINER_TMPDIR"
    chmod 700 "$APPTAINER_CACHEDIR"

The last four lines ensure that the directories exist and that they are only accessible by you.


As a reminder: if you use Singularity, the environment variables start with SINGULARITY_ instead of APPTAINER_.

You can also use Apptainer outside of a SLURM script, but we recommend limiting this to inspecting the container. Creating containers is best done in a SLURM job (batch or interactive).
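The two cluster-specific snippets above can be folded into one. This sketch assumes that $SCRATCH is only set inside an ALICE job and falls back to /tmp/$USER otherwise, following the SHARK example. The helper variable base is hypothetical, and setting the SINGULARITY_ names to the same values is only useful if you might call the singularity command as well.

```shell
# Use the job's scratch directory when it exists, otherwise a per-user /tmp directory.
base="${SCRATCH:-/tmp/${USER:-$(id -un)}}"

export APPTAINER_TMPDIR="$base/.apptainer-tmp"
export APPTAINER_CACHEDIR="$base/.apptainer-cache"

# Same locations for the singularity command (assumption: you may use either one).
export SINGULARITY_TMPDIR="$APPTAINER_TMPDIR"
export SINGULARITY_CACHEDIR="$APPTAINER_CACHEDIR"

# Create the directories and make them accessible to you only.
mkdir -p "$APPTAINER_TMPDIR" "$APPTAINER_CACHEDIR"
chmod 700 "$APPTAINER_TMPDIR" "$APPTAINER_CACHEDIR"
```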

Pulling images

Apptainer is able to retrieve and use images from several container registry websites, including Singularity Hub (archive) and Docker Hub. For Docker images, an automatic conversion process turns the Docker image into an Apptainer-compatible image. Sometimes, a SIF image is already available, either online or installed on the cluster.


ALICE

On ALICE, several SIF images are available in /cm/shared/singularity_images/, including:

  • fMRIPrep

  • funannotate (this may be an alternative to installing it yourself)

  • BUSCO (from a Docker image)

  • Trinity RNA Seq

Note that images could be outdated. If you have any questions regarding installed Singularity images or you need a newer version, please contact the ALICE Helpdesk.

You can also bring or create your own images.


SHARK

There are currently no SIF images available on SHARK. You will have to bring your own images or create one.

If your software package distributor provides a SIF file, you can download it with apptainer pull https://example.com/image.sif. Often, however, the distributor points to a Docker image instead. Such images can be converted to SIF files in the same way, for example with apptainer pull docker://sylabsio/lolcow, which stores the SIF file as lolcow_latest.sif. You can then run the entrypoint program of the container with apptainer run lolcow_latest.sif, which in this case shows a colorful cow telling you the current time.
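Since pulling and converting an image takes time and fills the cache, it can help to skip the pull when the SIF file already exists. The function below is a small sketch of that idea; the name pull_once is hypothetical and not part of Apptainer. It relies on apptainer pull accepting an output file name before the image URI.

```shell
# pull_once TARGET URI: pull URI into TARGET unless TARGET already exists.
pull_once() {
    pull_target="$1"
    pull_uri="$2"
    if [ -f "$pull_target" ]; then
        echo "using existing $pull_target"
        return 0
    fi
    apptainer pull "$pull_target" "$pull_uri"
}
```

In a job script you could then call pull_once lolcow_latest.sif docker://sylabsio/lolcow followed by apptainer run lolcow_latest.sif.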

Building images

When you have a code repository with a file named for example image.def, you can build the source code into a container format using this file. Run apptainer build image.sif image.def to start the build.

Some code requires being run as a root user within the container, for example to adjust process limits and CPU/GPU allocations, or to use specific ports or network functions within the container. In order to build such a container while still keeping the container system secure on a multi-user cluster, a “fake root” feature is available in Apptainer, which is enabled using an additional flag: apptainer build --fakeroot image.sif image.def. This argument can also be passed when running the container.
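To make the build step concrete, here is a minimal sketch that writes a very small definition file and builds it. The definition file content is an illustration only: it assumes that the sylabsio/lolcow Docker image provides a cowsay command, and the runscript is hypothetical. The variable workdir is also just a name picked for this example.

```shell
# Build in the job's scratch directory if available, else in a temporary directory.
workdir="${SCRATCH:-$(mktemp -d)}"

# A minimal definition file: start from a Docker image and set a runscript.
cat > "$workdir/image.def" <<'EOF'
Bootstrap: docker
From: sylabsio/lolcow

%runscript
    exec cowsay "$@"
EOF

# Only attempt the build when apptainer is actually available.
if command -v apptainer >/dev/null 2>&1; then
    apptainer build --fakeroot "$workdir/image.sif" "$workdir/image.def" \
        || echo "build failed; check the messages above" >&2
else
    echo "apptainer not found; wrote $workdir/image.def only" >&2
fi
```

If the build succeeds, apptainer run "$workdir/image.sif" Hello would pass Hello to the runscript.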

We do not go into detail about writing your own definition files or altering existing definitions in this section. If a code repository has a Dockerfile but not an image.def file, there are methods to convert the former to the latter, thus skipping a Docker build. The Python module spython allows recipe conversion. After installing spython (preferably in a virtual environment), you can run spython recipe Dockerfile > image.def. You should double-check the image.def file for correct functionality, as the translation process is not perfect. A particular problem may be the files that are copied into the container, as these are separated from the other steps. When multiple files or directories are copied, there may be duplicate entries, conflicting situations or incorrect destinations in the %files section of the Singularity recipe definition.
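When reviewing a converted definition file, it helps to look at the %files section on its own. The helper below is a sketch for that purpose; show_files_section is a hypothetical name, and the sample definition file exists only to demonstrate it. The conversion commands themselves are shown as comments, because they need network access and an existing Dockerfile.

```shell
# Conversion itself (run these where pip can reach the network):
#   python3 -m venv venv && . venv/bin/activate && pip install spython
#   spython recipe Dockerfile > image.def

# Print the %files section of a definition file, up to the next section header.
show_files_section() {
    awk '/^%files/ {show=1; print; next} /^%/ {show=0} show' "$1"
}

# Demonstration on a small sample with a duplicated entry.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
Bootstrap: docker
From: ubuntu:22.04
%files
    app.py /opt/app.py
    app.py /opt/app.py
%post
    apt-get update
EOF
show_files_section "$sample"
```

Duplicate or conflicting lines in the printed section are the entries to correct by hand in image.def.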

Again, building a container image is usually not necessary if a pre-built Docker image or SIF file is already available. It does give you control over the definition file, which allows you to make changes to the setup of the image.

Running containers

As mentioned in the Pulling images section, you can start an existing SIF file using apptainer run image.sif.

You can skip the step of pulling an image by using a remote image URL, which also works with Docker images, for example apptainer run docker://sylabsio/lolcow. Note that the SIF file is then not stored in a permanent path, so you would be performing the conversion step every time you want to use the image, unless the layers are still stored in your cache. Since the cache may be deleted at any point, this is not recommended.

By default, running a container invokes a “runscript” in the image, which determines what action should be taken. It is similar to the entrypoint that a Docker image defines. For the lolcow example, this means running a few commands that obtain the date, display it as a message from a cow and add a color pattern over it. Other runscripts may accept additional arguments, which you can pass by adding them at the end of the apptainer run image.sif command.

If you instead use apptainer exec image.sif, you can start a command within the container. You must provide a command at the end. Example: apptainer exec lolcow_latest.sif cowthink Test

By using apptainer shell image.sif, you can open a shell in the container. This allows finding commands provided by the image interactively.

If the runscript command or the command you provide to apptainer exec expects input through a pipeline or open file, you can pipe the input like this: who | apptainer exec lolcow_latest.sif lolcat. Similarly, output from the command can be redirected to a file or to another command on the host system. Usually, however, a command can also read files when you provide their filename or path as an argument. Note that such a file must be in a path that is bind-mounted to the container; more on this in “It can’t find my files/directories!” below.
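The different ways of starting a container can be put side by side in a short script. This is a sketch that assumes lolcow_latest.sif from the earlier pull is in the current directory and that the image provides cowsay; the output file name cow_output.txt is made up for the example. The commands are skipped when Apptainer or the image is missing.

```shell
image="lolcow_latest.sif"

if command -v apptainer >/dev/null 2>&1 && [ -f "$image" ]; then
    # Run the image's runscript.
    apptainer run "$image"
    # Run one specific command from the image.
    apptainer exec "$image" cowthink Test
    # Pipe host output into the container and redirect the result to a host file.
    date | apptainer exec "$image" cowsay > cow_output.txt
else
    echo "skipping: apptainer or $image is not available" >&2
fi
```

The pipe and the redirection are handled by the shell on the host, so cow_output.txt ends up in the directory where the script runs.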

NVIDIA GPU support

If you add an additional argument --nv between the apptainer subcommand and the image file, such as apptainer run --nv image.sif or apptainer exec --nv image.sif, the container can make use of the available GPUs on the system. Most images that support running on GPUs also understand the environment variable CUDA_VISIBLE_DEVICES, but if it is not having an effect, then see the next section.

My environment variables aren’t being passed to the program

Apptainer tries to provide a clean environment to the commands within the container. However, Docker-based images indicate which environment variables are passed through, and Apptainer/Singularity respects those instructions. An Apptainer/Singularity definition file can define additional environment variables in a %environment section, but this may be tricky.

To pass an environment variable that is not defined in this way, you can prefix the variable name with APPTAINERENV_ before passing it. For example, APPTAINERENV_CUDA_VISIBLE_DEVICES=0 apptainer run --nv gpu_image.sif will ensure that the CUDA_VISIBLE_DEVICES environment variable is set for the container.

Another option is to pass environment variables with the --env option.

You can enforce that all environment variables outside the container are ignored in the container by using the option --cleanenv.

More details on Apptainer’s environment are found in https://apptainer.org/docs/user/1.1/environment_and_metadata.html#environment-variable-precedence
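The options from this section can be compared in a short sketch. The variable names MY_SETTING and OTHER_SETTING and the file name image.sif are placeholders, and the example assumes that the image contains sh and env. The container commands are skipped when Apptainer or the image is not available.

```shell
# Prefix method: set APPTAINERENV_<NAME> on the host to get <NAME> in the container.
export APPTAINERENV_MY_SETTING="from-prefix"

image="image.sif"
if command -v apptainer >/dev/null 2>&1 && [ -f "$image" ]; then
    # Show the variable that was passed with the prefix.
    apptainer exec "$image" sh -c 'echo "MY_SETTING=$MY_SETTING"'
    # Option method: pass a variable directly on the command line.
    apptainer exec --env OTHER_SETTING=from-option "$image" \
        sh -c 'echo "OTHER_SETTING=$OTHER_SETTING"'
    # List the environment the container sees when host variables are ignored.
    apptainer exec --cleanenv "$image" env
else
    echo "skipping: apptainer or $image is not available" >&2
fi
```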

It can’t find my files/directories!

Because the container image contains its own virtual filesystem, not all directories from the system are made available within the container. In order to add specific paths to the virtual filesystem, Apptainer uses a mapping known as bind mounts. By default, the user’s home directory and the current working directory when the apptainer command is run are made available. Thus you can usually specify files in these paths to read from or write to without much problem.

If you want to access files in a different path, such as on /data1 or /scratchdata/, you need to specify the path as a mount with a --bind argument between the Apptainer subcommand and the image file. For example, apptainer exec --bind /data1/$USER image.sif ls /data1/$USER will show the same thing as ls /data1/$USER does.

The bind mount allows renaming the path to something else within the container. This is useful when it is a long path or if the image already has a virtual filesystem path with the same name. For example, apptainer exec --bind /data1/$USER:/mnt image.sif ls /mnt accomplishes the same as the command above. Additional paths to mount can be given by separating them with commas, or by adding the --bind argument multiple times.
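When a job needs several paths, the comma-separated list can be assembled from the directories that actually exist on the node. The helper below is a sketch; join_binds is a hypothetical name, and it does not handle paths that themselves contain commas or colons.

```shell
# join_binds PATH...: print the existing directories as a comma-separated list.
join_binds() {
    joined=""
    for bind_path in "$@"; do
        if [ -d "$bind_path" ]; then
            joined="${joined:+$joined,}$bind_path"
        else
            echo "skipping missing path: $bind_path" >&2
        fi
    done
    printf '%s\n' "$joined"
}

binds="$(join_binds "/data1/${USER:-unknown}" "/scratchdata")"
echo "bind list: ${binds:-<empty>}"

# Pass --bind only when the list is not empty.
if command -v apptainer >/dev/null 2>&1 && [ -f image.sif ]; then
    apptainer exec ${binds:+--bind "$binds"} image.sif ls /
fi
```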

Changes to the container don’t persist

Except for the bind mounts, the virtual filesystem within the container is read-only by default. This makes the container safer and simplifies the storage within the image file. If you want to change how the container works, then it may be worth looking into building an image on top of it using a definition file, but we will not go into detail on writing an image.def based on another image.

Another option is to add an overlay directory. Simply create a directory in a storage location on the cluster with mkdir image.overlay. Then pass this directory name to the Apptainer subcommand with an --overlay argument, for example apptainer exec --overlay image.overlay image.sif touch /var/log/output.log. Now, any changes to files outside of bind mounts are stored in the overlay directory in a structured format. We do not recommend changing the contents of this directory from outside the container.

Building and running in SLURM

You can use Apptainer from a SLURM script to perform commands not globally installed on the machine. For building, we recommend that you use the environment variables for the temporary/cache files and that you also write the SIF file to local scratch, so that there is less network filesystem overhead. You can then copy the SIF file to a permanent storage afterwards. You could also copy it back from there in a later job run by checking if it is available:


ALICE

    export APPTAINER_TMPDIR="$SCRATCH/.apptainer-tmp"
    export APPTAINER_CACHEDIR="$SCRATCH/.apptainer-cache"
    mkdir -p "$APPTAINER_TMPDIR"
    mkdir -p "$APPTAINER_CACHEDIR"
    if [ -f "/data1/$USER/image.sif" ]; then
        cp "/data1/$USER/image.sif" "$SCRATCH/image.sif"
    else
        apptainer build "$SCRATCH/image.sif" docker://sylabsio/lolcow
        cp "$SCRATCH/image.sif" "/data1/$USER/image.sif"
    fi
    apptainer run "$SCRATCH/image.sif"

The example here assumes that the container image is in your directory on /data1, but you should adjust it as needed.

 


SHARK

The example here assumes that the container image is in your $HOME directory or will be stored there. However, because of the limited space, it is recommended to store the image in an /export share that is available to you.

Note that the variable $IMAGEDIR is not a default Apptainer/Singularity variable. It is used here solely for simplifying in the script where to create the image.
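For reference, a SHARK job script along the lines described here could look like the following. This is a reconstruction under assumptions, not the original script: it mirrors the ALICE example, uses /tmp/$USER as local storage as in the Temporary files section, and the names shark_user and LOCAL_IMAGE are hypothetical. Load the Apptainer module before this part of the script; the exact module name is not given here.

```shell
shark_user="${USER:-$(id -un)}"

# Temporary and cache files in a private directory under /tmp.
export APPTAINER_TMPDIR="/tmp/$shark_user/.apptainer-tmp"
export APPTAINER_CACHEDIR="/tmp/$shark_user/.apptainer-cache"
mkdir -p "$APPTAINER_TMPDIR" "$APPTAINER_CACHEDIR"
chmod 700 "$APPTAINER_TMPDIR" "$APPTAINER_CACHEDIR"

# Where the image is kept permanently; adjust to an /export share if you have one.
IMAGEDIR="${IMAGEDIR:-$HOME}"
LOCAL_IMAGE="/tmp/$shark_user/image.sif"

if ! command -v apptainer >/dev/null 2>&1; then
    echo "apptainer not found; load the module first" >&2
elif [ -f "$IMAGEDIR/image.sif" ]; then
    # Reuse the stored image by copying it to local storage.
    cp "$IMAGEDIR/image.sif" "$LOCAL_IMAGE"
    apptainer run "$LOCAL_IMAGE"
else
    # Build locally, then keep a copy for later jobs.
    apptainer build "$LOCAL_IMAGE" docker://sylabsio/lolcow
    cp "$LOCAL_IMAGE" "$IMAGEDIR/image.sif"
    apptainer run "$LOCAL_IMAGE"
fi
```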


For GPU tasks, remember to use the --nv argument to apptainer run or apptainer exec. SLURM should take the appropriate steps to set the environment variables for the allocated GPU resources, but double check that your task is using the correct GPU. You can always add APPTAINERENV_CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES in front of the command that runs the apptainer image.