This section provides additional information on different aspects of using R on ALICE and SHARK.

For a basic example of submitting an R job on both clusters, please see Your first R job.

For background information on installing R packages yourself, please see Installing R packages.

For further reading, check out some of the references here: https://pubappslu.atlassian.net/wiki/spaces/HPCWIKI/pages/37749488/References+and+further+reading#R

Running an R script on the command line

There are several ways to launch an R script on the command line:

  1. Rscript yourfile.R

  2. R CMD BATCH yourfile.R

  3. R --no-save < yourfile.R

The first approach (the Rscript command) writes the output to stdout. The second approach (the R CMD BATCH command) writes its output to a file, in this case yourfile.Rout. The third approach redirects the contents of yourfile.R to the standard input of the R executable. Note that with the third approach you must specify one of the following flags: --save, --no-save or --vanilla. Be careful with --vanilla, because it also tells R not to read your user profile and environment files.
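As a minimal sketch of the three invocations above, the following shell snippet writes a one-line R script and runs it each way. The file name hello_demo.R is a hypothetical example, and the snippet assumes that an R module has already been loaded; if Rscript is not found, it only prints a hint.

```shell
#!/bin/bash
# Hypothetical demo script; any R script can be used instead.
script="hello_demo.R"
printf 'cat("hello from R\\n")\n' > "$script"

# Without an explicit output file, R CMD BATCH strips the .R
# extension from the script name and appends .Rout.
rout="${script%.R}.Rout"

if command -v Rscript >/dev/null 2>&1; then
    Rscript "$script"              # 1. output appears on stdout
    R CMD BATCH "$script"          # 2. output is written to $rout
    R --no-save < "$script"        # 3. script is read from stdin
    echo "Batch output was written to: $rout"
else
    echo "Rscript not found; load an R module first"
fi
```

Inside a Slurm batch file, the Rscript form is usually the most convenient because its output ends up in the file given by --output.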

Using R with OpenMPI

In addition to the examples for running R in parallel in Your first R job, we provide here a basic HelloWorld example for using R with OpenMPI.

The R script

We will use the following R script, saved in a file called test_r_mpi.R:

Code Block
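library(Rmpi)

# Rank of this process and total number of processes in communicator 0
id <- mpi.comm.rank(comm = 0)
np <- mpi.comm.size(comm = 0)
# Name of the node this process is running on
hostname <- mpi.get.processor.name()

msg <- sprintf("Hello world from process %03d of %03d, on host %s\n", id, np, hostname)
cat(msg)

# Wait for all processes to reach this point, then shut down MPI
mpi.barrier(comm = 0)
mpi.finalize()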

The Slurm batch file

Next, we need a Slurm batch file, which we call test_r_mpi.slurm, to run the R script as a batch job:


ALICE

Code Block
#!/bin/bash

#SBATCH --job-name=test_r_mpi       # Job name
#SBATCH --output=%x_%j.out         # Output file name
#SBATCH --partition=testing               # Partition
#SBATCH --time=00:05:00                 # Time limit
#SBATCH --nodes=2                       # Number of nodes
#SBATCH --ntasks-per-node=4             # MPI processes per node

module purge
module load slurm
module add R/4.0.5-foss-2020b

srun Rscript test_r_mpi.R

SHARK

Code Block
#!/bin/bash

#SBATCH --job-name=test_r_mpi       # Job name
#SBATCH --output=%x_%j.out         # Output file name
#SBATCH --partition=short               # Partition
#SBATCH --time=00:05:00                 # Time limit
#SBATCH --nodes=2                       # Number of nodes
#SBATCH --ntasks-per-node=4             # MPI processes per node

module purge
module load slurm
module add statistical/R/4.1.2/gcc.8.3.1
module add library/mpi/openmpi/4.1.1/gcc-8.3.1

srun Rscript test_r_mpi.R

After the job above has run, Slurm will have created a file called test_r_mpi_<job_id>.out whose content will look something like this:

Code Block
[me@<login_node> ]$ cat test_r_mpi_<job_id>.out 
Hello world from process 000 of 008, on host res-hpc-gpu01
Hello world from process 001 of 008, on host res-hpc-gpu01
Hello world from process 002 of 008, on host res-hpc-gpu01
Hello world from process 003 of 008, on host res-hpc-gpu01
Hello world from process 004 of 008, on host res-hpc-gpu02
Hello world from process 005 of 008, on host res-hpc-gpu02
Hello world from process 006 of 008, on host res-hpc-gpu02
Hello world from process 007 of 008, on host res-hpc-gpu02
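The eight lines above follow from the batch file: 2 nodes times 4 tasks per node gives 8 MPI processes. As a rough sanity check, one could count the greeting lines in the output file and compare the count with that product. The helper name count_ranks is hypothetical, and the snippet generates a small stand-in output file so that it runs without a real job.

```shell
#!/bin/bash
nodes=2              # matches --nodes in the batch file
tasks_per_node=4     # matches --ntasks-per-node in the batch file
expected=$(( nodes * tasks_per_node ))

# Hypothetical helper: count the greeting lines in a job output file.
count_ranks() {
    grep -c '^Hello world from process' "$1"
}

# Stand-in for test_r_mpi_<job_id>.out so the sketch is self-contained.
outfile=$(mktemp)
for i in $(seq 0 $(( expected - 1 ))); do
    printf 'Hello world from process %03d of %03d, on host demo-node\n' \
        "$i" "$expected" >> "$outfile"
done

found=$(count_ranks "$outfile")
if [ "$found" -eq "$expected" ]; then
    echo "All ${expected} MPI processes reported in"
else
    echo "Expected ${expected} processes but found ${found}"
fi

rm -f "$outfile"
```

For a real job, pass the actual output file to count_ranks instead of the generated stand-in.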

Running R interactively

Note

Running R interactively is fine as an exercise or a quick test. The recommended way to run R is in batch mode.


ALICE

Code Block
[me@nodelogin02 ~]$ salloc --ntasks 1 --mem=1G --time=00:05:00 --partition=testing
salloc: Pending job allocation 656147
salloc: job 656147 queued and waiting for resources
salloc: job 656147 has been allocated resources
salloc: Granted job allocation 656147
salloc: Waiting for resource configuration
salloc: Nodes nodelogin01 are ready for job
[me@nodelogin02 ~]$ ssh nodelogin01
Last login: Tue Aug  9 15:35:06 2022 from p-cfer-016105.infra.leidenuniv.nl
[me@nodelogin01 ~]$ module load R/4.0.5-foss-2020b
[me@nodelogin01 ~]$ R

R version 4.0.5 (2021-03-31) -- "Shake and Throw"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> q()
Save workspace image? [y/n/c]: n
[me@nodelogin01 ~]$ exit
logout
Connection to nodelogin01 closed.
[me@nodelogin02 ~]$ exit
exit
salloc: Relinquishing job allocation 656147
salloc: Job allocation 656147 has been revoked.

SHARK

Code Block
[me@res-hpc-lo02 ~]$ salloc --ntasks 1 --mem=1G --time=00:05:00 --partition=short
salloc: Granted job allocation 11441259
salloc: Waiting for resource configuration
salloc: Nodes res-hpc-gpu09 are ready for job
[me@res-hpc-gpu09 ~]$ module load statistical/R/4.1.2/gcc.8.3.1
[me@res-hpc-gpu09 ~]$ R

R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> q()
Save workspace image? [y/n/c]: n
[me@res-hpc-gpu09 ~]$ exit
exit
salloc: Relinquishing job allocation 11441259
salloc: Job allocation 11441259 has been revoked.

RStudio

RStudio is an Integrated Development Environment (IDE) for R. It includes a console and a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, debugging, history and workspace management. For more information, see the RStudio webpage.

RStudio is installed on both clusters and can be invoked on a login node as follows:


ALICE

Code Block
 module load R/4.4.0-gfbf-2023a
 module load RStudio-Desktop/2024.04.2+764
 rstudio

Note that you also need to load a version of R. Here, we are loading R/4.4.0.


SHARK

Code Block
module load statistical/RStudio/1.3.959/gcc-8.3.1
rstudio

Note that you also need to load a version of R.


RStudio cannot be executed from a Slurm job submitted with sbatch, but you can use it by running an interactive job.

Interactive jobs for RStudio

Interactive jobs can be submitted to the queue with the Slurm command salloc, which takes the same options as Slurm batch files.

Since interactive jobs also go into the queue, it can take some time until your job runs depending on the load on the cluster. Therefore, it is best to submit the interactive job from a screen or tmux session.
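A minimal sketch of that workflow, assuming that tmux or screen is installed on the login node (the session name r_interactive is a hypothetical choice). The snippet only prints the commands for starting and reattaching to a named session; the salloc command would then be run inside that session.

```shell
#!/bin/bash
session="r_interactive"   # hypothetical session name

if command -v tmux >/dev/null 2>&1; then
    start_cmd="tmux new-session -s ${session}"
    resume_cmd="tmux attach -t ${session}"
elif command -v screen >/dev/null 2>&1; then
    start_cmd="screen -S ${session}"
    resume_cmd="screen -r ${session}"
else
    start_cmd=""
    resume_cmd=""
fi

if [ -n "$start_cmd" ]; then
    echo "Start a session with:  ${start_cmd}"
    echo "Reattach later with:   ${resume_cmd}"
else
    echo "Neither tmux nor screen was found"
fi
```

If the connection to the login node drops while the job is still queued, reattaching to the session brings back the shell in which salloc is waiting.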

Here is an example of a salloc command:

Code Block
  salloc --ntasks=1 --cpus-per-task=1 --mem-per-cpu=10GB --partition=<partition_name> --time=00:05:00 --x11 --pty bash

where <partition_name> needs to be replaced by a valid partition. The option --x11 is important because it forwards X11 from the compute node on which the job is running. The job uses a rather short running time because it is intended for testing how to launch RStudio.
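To avoid retyping the options, the command above could be wrapped in a small shell function that takes the partition as an argument. This is only a sketch: the function name salloc_cmd is hypothetical, the partition names testing (ALICE) and short (SHARK) are taken from the batch examples earlier on this page, and the function prints the command rather than running it.

```shell
#!/bin/bash
# Hypothetical helper: print the salloc command for a given partition.
salloc_cmd() {
    local partition="$1"
    local time_limit="${2:-00:05:00}"
    if [ -z "$partition" ]; then
        echo "usage: salloc_cmd <partition_name> [time]" >&2
        return 1
    fi
    echo "salloc --ntasks=1 --cpus-per-task=1 --mem-per-cpu=10GB" \
         "--partition=${partition} --time=${time_limit} --x11 --pty bash"
}

salloc_cmd testing            # ALICE
salloc_cmd short 00:10:00     # SHARK, with a longer time limit
```

Printing the command first makes it easy to check the options before copying the line into the terminal.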

Note that the above command will already log you into the compute node that was assigned to you.

Once your interactive job is running, you can launch RStudio in the following way:


ALICE

Code Block
module load R/4.4.0-gfbf-2023a
module load RStudio-Desktop/2024.04.2+764
export XDG_RUNTIME_DIR=/tmp/runtime-$USER
rstudio

The last step will launch RStudio on the compute node that has been assigned to you.

Please note that you can safely ignore the following error message when starting rstudio: Failed to connect to the bus:
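The export in the ALICE example only sets the variable. As a sketch, assuming the directory does not already exist on the compute node, one could create it and restrict it to the owner before launching RStudio. This is not a documented requirement of the cluster, and the fallback to id -un when USER is unset is an addition made here.

```shell
#!/bin/bash
# Fall back to `id -un` in case USER is not set in the job environment.
user_name="${USER:-$(id -un)}"
export XDG_RUNTIME_DIR="/tmp/runtime-${user_name}"

# Create the directory if needed and make it accessible to the owner only.
mkdir -p "$XDG_RUNTIME_DIR"
chmod 700 "$XDG_RUNTIME_DIR"

echo "XDG_RUNTIME_DIR is set to $XDG_RUNTIME_DIR"
```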


SHARK

Code Block
module purge
module add statistical/RStudio/1.3.959/gcc-8.3.1
rstudio

RStudio on the OOD (Open OnDemand portal)

Note

OOD is only available to SHARK users.

You can also start an RStudio server on the OOD portal.