Your first MPI job

About this tutorial

This tutorial will guide you through running a very simple Hello-World-type job with OpenMPI.

What you will learn

  • Setting up the batch script for an OpenMPI job

  • Loading the necessary modules

  • Submitting your job

  • Monitoring your job

  • Collecting information about your job

What this example will not cover

  • Using other MPI compilers

  • Writing code for MPI

  • Optimizing your MPI code

What you should know before starting

OpenMPI on ALICE and SHARK

There are various versions of OpenMPI available on ALICE and SHARK. You can get an overview by running the following command:


ALICE

module -r avail ^OpenMPI

Various modules on ALICE have been built with OpenMPI. When you load one of these modules, the version of OpenMPI that was used to build it will be loaded automatically.


SHARK

module avail /mpi/

For this tutorial, we will be using OpenMPI 4.1.1.

Preparations

Log in to ALICE if you have not done so yet.

Before you set up your job or submit it, it is always best to have a look at the current job load on the cluster and what partitions are available to you.
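As a minimal sketch of such a check, the standard Slurm commands sinfo and squeue can be used. The partition name is an assumption (the ALICE testing partition; use short on SHARK), and the guard is only there so that the snippet prints a hint instead of failing where Slurm is not installed.

```shell
# ASSUMPTION: "testing" is the ALICE test partition; use "short" on SHARK.
PARTITION="testing"

if command -v sinfo >/dev/null 2>&1; then
    # One summary line per partition: availability, time limit, node counts.
    sinfo -s
    # Jobs currently queued or running in the chosen partition.
    squeue -p "$PARTITION"
else
    echo "Slurm commands not found; run this on a cluster login node." >&2
fi
```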

It also helps to run some short, resource-friendly tests to check that your setup works and that your batch file is correct. The “testing” partition on ALICE or the “short” partition on SHARK can be used for this purpose. The examples in this tutorial are safe to use on those partitions.

Here, we will assume that you have already created a directory called user_guide_tutorials in your $HOME from the previous tutorials. For this job, let's create a sub-directory and change into it:

mkdir -p $HOME/user_guide_tutorials/first_MPI_job
cd $HOME/user_guide_tutorials/first_MPI_job

We will first create the MPI program and then write the Slurm batch file.

MPI program

This is a very basic Hello-World type of MPI program. Each process prints its rank and the name of the node it is running on. We will name this file helloworld_mpi.c.
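A minimal sketch of such a program is given below, written to disk with a shell here-document so that it can be pasted directly into the terminal. The exact wording of the printed message and the variable names are our own choices for illustration; only standard MPI calls are used.

```shell
# Write the example source file. The quoted 'EOF' keeps the shell from
# expanding anything inside the C code.
cat > helloworld_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, name_len;
    char node_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                        /* start the MPI runtime   */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);          /* rank of this process    */
    MPI_Comm_size(MPI_COMM_WORLD, &size);          /* total number of ranks   */
    MPI_Get_processor_name(node_name, &name_len);  /* node we are running on  */

    printf("Hello world from rank %d of %d on node %s\n",
           rank, size, node_name);

    MPI_Finalize();                                /* shut down MPI cleanly   */
    return 0;
}
EOF
```

You can of course also create the file with your favourite editor instead; the here-document is only a convenience.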

Next, we load a version of OpenMPI and then we use mpicc to compile our program:
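A minimal sketch of these two steps, which should apply to either cluster once the module name is filled in. The module name below is a placeholder and an assumption: module names differ between ALICE and SHARK, so take the exact name from the output of the module avail commands shown earlier. The checks around the commands only make the snippet print a hint instead of failing where modules or mpicc are missing.

```shell
# ASSUMPTION: placeholder module name. Replace it with an exact OpenMPI 4.1.1
# module name reported by the "module avail" commands shown earlier.
OPENMPI_MODULE="OpenMPI/4.1.1"

# "module" only exists on systems with environment modules, so check first.
if command -v module >/dev/null 2>&1; then
    module load "$OPENMPI_MODULE"
fi

# mpicc is the compiler wrapper that adds the MPI include and library flags.
if command -v mpicc >/dev/null 2>&1 && [ -f helloworld_mpi.c ]; then
    mpicc -o helloworld_mpi helloworld_mpi.c
else
    echo "mpicc or helloworld_mpi.c not found; nothing compiled." >&2
fi
```

If the compilation succeeds, the directory contains an executable named helloworld_mpi.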


ALICE


SHARK


Slurm batch file

The Slurm batch script helloworld_mpi.slurm for our MPI example program looks like this:


ALICE


SHARK


where you should replace <your-email-address> with your e-mail address. Here, we have requested two nodes to run 10 tasks. The tasks will be distributed automatically over the two nodes.

The output from our MPI program will go into the Slurm output file. This is fine for this small example, but it is not the best approach in general, because all processes running in parallel have to write to the same file.
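A batch script consistent with this description could be written as follows, again using a here-document to create the file. Only the two nodes, the 10 tasks, the e-mail placeholder, the testing partition and the test_openmpi job name (which gives the output file name used later in this tutorial) come from the tutorial itself. The time limit, the memory request, the module name and the use of srun as the launcher are assumptions for illustration.

```shell
# Create the batch file. The quoted 'EOF' prevents the shell from expanding
# $(hostname) now; it is evaluated later, when the job runs.
cat > helloworld_mpi.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=test_openmpi
#SBATCH --output=%x_%j.out
#SBATCH --mail-user=<your-email-address>
#SBATCH --mail-type=ALL
# ASSUMPTION: "testing" is the ALICE partition; use "short" on SHARK.
#SBATCH --partition=testing
#SBATCH --nodes=2
#SBATCH --ntasks=10
# ASSUMPTION: illustrative time and memory limits; adjust after a test run.
#SBATCH --time=00:05:00
#SBATCH --mem-per-cpu=100M

# ASSUMPTION: placeholder module name; use the one you compiled with.
module load OpenMPI/4.1.1

echo "#### Starting MPI test on $(hostname)"
# srun starts one copy of the program per requested task.
srun ./helloworld_mpi
echo "#### Finished MPI test"
EOF
```

In the --output pattern, %x expands to the job name and %j to the job ID, so the output file is named test_openmpi_<jobid>.out.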

The resources set in the batch script have been determined after running the job at least once with more conservative estimates. In this configuration, it is fine to run the job on the testing partition.

Job submission

Let us submit this MPI job to Slurm:
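A minimal sketch of the submission, followed by a simple way to monitor the job. It assumes the batch file is named helloworld_mpi.slurm as above; the guard only prints a hint when the snippet is run on a machine without Slurm.

```shell
BATCH_FILE="helloworld_mpi.slurm"

if command -v sbatch >/dev/null 2>&1; then
    # sbatch replies with a line of the form "Submitted batch job <jobid>".
    sbatch "$BATCH_FILE"
    # List your own pending and running jobs to follow the job's state.
    squeue -u "$USER"
else
    echo "sbatch not found; run this on a cluster login node." >&2
fi
```

Make a note of the job ID, because the commands for inspecting or cancelling the job need it.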

Immediately after you have submitted this job, you should see something like this:

Job output

In the directory where you launched your job, there should be a new file created by Slurm: test_openmpi_<jobid>.out. It contains all the output from your job that would normally have been written to the command line. Check the file for any error messages. The content of the file should look something like this:

Because this is a parallel job, the output lines from the different processes can appear out of order.

You can get a quick overview of the resources actually used by your job by running:
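One common way to do this with Slurm is the seff helper script, with sacct as a fallback. That seff is installed on the cluster is an assumption, and <jobid> is a placeholder for the number reported when you submitted the job.

```shell
# ASSUMPTION: placeholder; replace with the job ID reported by sbatch.
JOBID="<jobid>"

if command -v seff >/dev/null 2>&1; then
    # seff summarises the CPU and memory efficiency of a finished job.
    seff "$JOBID"
elif command -v sacct >/dev/null 2>&1; then
    # Fallback: query the Slurm accounting records for similar information.
    sacct -j "$JOBID" --format=JobID,JobName,Elapsed,MaxRSS,State
else
    echo "seff and sacct not found; run this on a cluster login node." >&2
fi
```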

It might look something like this:

Cancelling your job

If you need to cancel your job, you can do so with:
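A minimal sketch using the standard Slurm command scancel, where <jobid> is a placeholder for the ID of the job you want to stop:

```shell
# ASSUMPTION: placeholder; replace with the job ID reported by sbatch.
JOBID="<jobid>"

if command -v scancel >/dev/null 2>&1; then
    # Ask Slurm to cancel the job, whether it is pending or running.
    scancel "$JOBID"
else
    echo "scancel not found; run this on a cluster login node." >&2
fi
```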