This section provides additional information on different aspects of using R on ALICE and SHARK.

For a basic example of submitting an R job on both clusters, please see Your first R job.

For background information on installing R packages yourself, please see Installing R packages.

For further reading, check out some of the references here: https://pubappslu.atlassian.net/wiki/spaces/HPCWIKI/pages/37749488/References+and+further+reading#R

Running an R script on the command line

There are several ways to launch an R script on the command line:

  1. Rscript yourfile.R

  2. R CMD BATCH yourfile.R

  3. R --no-save < yourfile.R

The first approach (the Rscript command) writes its output to stdout. The second approach (the R CMD BATCH command) writes its output to a file (in this case yourfile.Rout). The third approach redirects the contents of yourfile.R to the standard input of the R executable. Note that with the third approach you must specify one of the following flags: --save, --no-save or --vanilla. Be careful with the option --vanilla, because it also tells R not to read your user profile and environment files.
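The difference between the three approaches is easiest to see by running all of them on the same small script. The following is a minimal sketch, assuming an R module has already been loaded; the content of yourfile.R and the log file names rscript.log and redirect.log are made up for illustration.

```shell
#!/bin/bash

# Work in a scratch directory so nothing in your current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A tiny stand-in for yourfile.R (hypothetical content).
cat > yourfile.R <<'EOF'
x <- c(1, 2, 3)
print(sum(x))
EOF

if command -v Rscript >/dev/null 2>&1; then
    # 1. Rscript writes to stdout; redirect it yourself if you want a file.
    Rscript yourfile.R > rscript.log

    # 2. R CMD BATCH writes to yourfile.Rout by default.
    R CMD BATCH yourfile.R

    # 3. Input redirection; one of --save, --no-save or --vanilla is required.
    R --no-save < yourfile.R > redirect.log

    # List the three output files that were produced.
    ls -1 rscript.log yourfile.Rout redirect.log
else
    echo "R is not available in this shell; load an R module first" >&2
fi
```

In a Slurm batch job, the stdout of the first and third approach ends up in the file given by --output unless you redirect it as done here.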

Using R with OpenMPI

In addition to the examples for running R in parallel in Your first R job, we provide here a basic Hello World example for using R with OpenMPI.

The R script

We will use the following R script, saved in a file called test_r_mpi.R:

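library(Rmpi)

# Rank of this process and total number of processes in communicator 0
id <- mpi.comm.rank(comm = 0)
np <- mpi.comm.size(comm = 0)
# Name of the node this process is running on
hostname <- mpi.get.processor.name()

msg <- sprintf("Hello world from process %03d of %03d, on host %s\n", id, np, hostname)
cat(msg)

# Wait until every process has reached this point, then shut down MPI
mpi.barrier(comm = 0)
mpi.finalize()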

The Slurm batch file

Next, we need a Slurm batch file, which we call test_r_mpi.slurm, to run the R script as a batch job:


ALICE

#!/bin/bash

#SBATCH --job-name=test_r_mpi       # Job name
#SBATCH --output=%x_%j.out         # Output file name
#SBATCH --partition=testing               # Partition
#SBATCH --time=00:05:00                 # Time limit
#SBATCH --nodes=2                       # Number of nodes
#SBATCH --ntasks-per-node=4             # MPI processes per node

module purge
module load slurm
module add R/4.0.5-foss-2020b

srun Rscript test_r_mpi.R

SHARK

#!/bin/bash

#SBATCH --job-name=test_r_mpi       # Job name
#SBATCH --output=%x_%j.out         # Output file name
#SBATCH --partition=short               # Partition
#SBATCH --time=00:05:00                 # Time limit
#SBATCH --nodes=2                       # Number of nodes
#SBATCH --ntasks-per-node=4             # MPI processes per node

module purge
module load slurm
module add statistical/R/4.1.2/gcc.8.3.1
module add library/mpi/openmpi/4.1.1/gcc-8.3.1

srun Rscript test_r_mpi.R
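The batch file is submitted with sbatch. The following is a minimal sketch, assuming you are on a login node and test_r_mpi.slurm is in the current directory. The helper parse_job_id is a hypothetical name introduced here; it relies on the --parsable option of sbatch, which prints the job ID, followed by a semicolon and the cluster name on multi-cluster setups.

```shell
#!/bin/bash

# Hypothetical helper: keep only the job ID from "sbatch --parsable" output,
# which is either "<job_id>" or "<job_id>;<cluster_name>".
parse_job_id() {
    printf '%s\n' "${1%%;*}"
}

if command -v sbatch >/dev/null 2>&1 && [ -f test_r_mpi.slurm ]; then
    job_id=$(parse_job_id "$(sbatch --parsable test_r_mpi.slurm)")
    echo "Submitted job ${job_id}"
    # Show the state of the job while it is pending or running.
    squeue --jobs="${job_id}"
else
    echo "sbatch or test_r_mpi.slurm not found; run this on a login node next to the batch file" >&2
fi
```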

After running the job above, Slurm will have created a file called test_r_mpi_<job_id>.out whose content will look something like this:

[me@<login_node> ]$ cat test_r_mpi_<job_id>.out 
Hello world from process 000 of 008, on host res-hpc-gpu01
Hello world from process 001 of 008, on host res-hpc-gpu01
Hello world from process 002 of 008, on host res-hpc-gpu01
Hello world from process 003 of 008, on host res-hpc-gpu01
Hello world from process 004 of 008, on host res-hpc-gpu02
Hello world from process 005 of 008, on host res-hpc-gpu02
Hello world from process 006 of 008, on host res-hpc-gpu02
Hello world from process 007 of 008, on host res-hpc-gpu02
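The number of lines follows from the batch file: 2 nodes with 4 MPI processes each give 8 processes in total. A quick way to confirm that every process reported in is sketched below. It uses the sample output from the listing above as a stand-in; in practice you would point the variable out at your own test_r_mpi_<job_id>.out file.

```shell
#!/bin/bash

# Values from the batch file: --nodes=2 and --ntasks-per-node=4.
nodes=2
tasks_per_node=4
expected=$((nodes * tasks_per_node))

# Sample output copied from the listing above, written to a temporary file.
out=$(mktemp)
cat > "$out" <<'EOF'
Hello world from process 000 of 008, on host res-hpc-gpu01
Hello world from process 001 of 008, on host res-hpc-gpu01
Hello world from process 002 of 008, on host res-hpc-gpu01
Hello world from process 003 of 008, on host res-hpc-gpu01
Hello world from process 004 of 008, on host res-hpc-gpu02
Hello world from process 005 of 008, on host res-hpc-gpu02
Hello world from process 006 of 008, on host res-hpc-gpu02
Hello world from process 007 of 008, on host res-hpc-gpu02
EOF

# Count the processes that reported in.
reported=$(grep -c '^Hello world from process' "$out")

if [ "$reported" -eq "$expected" ]; then
    echo "OK: ${reported} of ${expected} processes reported"
else
    echo "Mismatch: ${reported} of ${expected} processes reported" >&2
fi

# Processes per host: the host name is the last field of each line.
awk '{print $NF}' "$out" | sort | uniq -c
```

The lines in a real output file are not guaranteed to appear in rank order, which is why the check counts lines instead of comparing them one by one.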

Running R interactively

You can run R interactively as an exercise or for testing. For anything else, the recommended way is to run R in batch mode.


ALICE

[me@nodelogin02 ~]$ salloc --ntasks 1 --mem=1G --time=00:05:00 --partition=testing
salloc: Pending job allocation 656147
salloc: job 656147 queued and waiting for resources
salloc: job 656147 has been allocated resources
salloc: Granted job allocation 656147
salloc: Waiting for resource configuration
salloc: Nodes nodelogin01 are ready for job
[me@nodelogin02 ~]$ ssh nodelogin01
Last login: Tue Aug  9 15:35:06 2022 from p-cfer-016105.infra.leidenuniv.nl
[me@nodelogin01 ~]$ module load R/4.0.5-foss-2020b
[me@nodelogin01 ~]$ R

R version 4.0.5 (2021-03-31) -- "Shake and Throw"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> q()
Save workspace image? [y/n/c]: n
[me@nodelogin01 ~]$ exit
logout
Connection to nodelogin01 closed.
[me@nodelogin02 ~]$ exit
exit
salloc: Relinquishing job allocation 656147
salloc: Job allocation 656147 has been revoked.

SHARK

[me@res-hpc-lo02 ~]$ salloc --ntasks 1 --mem=1G --time=00:05:00 --partition=short
salloc: Granted job allocation 11441259
salloc: Waiting for resource configuration
salloc: Nodes res-hpc-gpu09 are ready for job
[me@res-hpc-gpu09 ~]$ module load statistical/R/4.1.2/gcc.8.3.1
[me@res-hpc-gpu09 ~]$ R

R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> q()
Save workspace image? [y/n/c]: n
[me@res-hpc-gpu09 ~]$ exit
exit
salloc: Relinquishing job allocation 11441259
salloc: Job allocation 11441259 has been revoked.

RStudio

RStudio is an Integrated Development Environment (IDE) for R. It includes a console and a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, debugging, history and workspace management. For more information, see the RStudio webpage.

RStudio is installed on both clusters and can be invoked on a login node as follows:


ALICE

 module load RStudio/1.2.5033
 rstudio

Note that you also need to load a version of R.


SHARK

module load statistical/RStudio/1.3.959/gcc-8.3.1
rstudio

RStudio cannot be executed from a Slurm job submitted with sbatch, but you can use it by running an interactive job.

Interactive jobs for RStudio

Interactive jobs can be submitted to the queue with the Slurm command salloc, which takes the same options as Slurm batch files.

Since interactive jobs also go into the queue, it can take some time until your job runs depending on the load on the cluster. Therefore, it is best to submit the interactive job from a screen or tmux session.

Here is an example of a salloc command:

  salloc --ntasks=1 --cpus-per-task=1 --mem-per-cpu=10GB --partition=<partition_name> --time=00:05:00 --x11 --pty bash

where <partition_name> needs to be replaced by a valid partition. The option --x11 is important because it forwards X11 from the compute node on which the job is running. The job uses a rather short time limit because it is only intended for testing how to launch RStudio.
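X11 forwarding from the compute node can only work if your session on the login node has a display itself. The following sketch checks this before you request the allocation. It assumes that you connect to the login node with ssh; the function have_display and the variable PARTITION are hypothetical names introduced here, and the partition testing is only a default for illustration. The script prints the salloc command instead of running it.

```shell
#!/bin/bash

# Succeeds only if an X11 display is available in the current shell.
have_display() {
    [ -n "${DISPLAY:-}" ]
}

# Assumption: PARTITION is a hypothetical variable; "testing" is only a default.
partition="${PARTITION:-testing}"

if have_display; then
    echo "DISPLAY is set to ${DISPLAY}"
else
    echo "DISPLAY is not set; log in again with 'ssh -X' or 'ssh -Y' before using --x11" >&2
fi

# The salloc command from the text, with the partition filled in (printed, not executed).
salloc_cmd="salloc --ntasks=1 --cpus-per-task=1 --mem-per-cpu=10GB --partition=${partition} --time=00:05:00 --x11 --pty bash"
echo "$salloc_cmd"
```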

Once your interactive job is running, you can launch RStudio in the following way:


ALICE

module load R/4.2.1-foss-2022a
module load RStudio/1.2.5033
export XDG_RUNTIME_DIR=/tmp/runtime-<your_alice_user_name>
srun rstudio

where you should replace <your_alice_user_name> by your username on ALICE. The last step will launch RStudio on the compute node that has been assigned to you.
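If you prefer not to type your username by hand, the runtime directory can be derived from the account you are logged in with. This is a minimal sketch under two assumptions that are not stated above: that the directory may not exist yet on the compute node, and that it should be private to you (mode 700), which is what Qt-based applications usually expect of a runtime directory.

```shell
#!/bin/bash

# Derive the directory name from the current account instead of typing it.
user_name=$(id -un)
export XDG_RUNTIME_DIR="/tmp/runtime-${user_name}"

# Assumption: create the directory if it is missing and make it private.
mkdir -p "$XDG_RUNTIME_DIR"
chmod 700 "$XDG_RUNTIME_DIR"

echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}"
```

After this, srun rstudio can be run in the same shell as shown above.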


SHARK

module purge
module add statistical/RStudio/1.3.959/gcc-8.3.1
rstudio

RStudio on the OOD (Open OnDemand) portal

OOD is only available to SHARK users.

You can also start an RStudio server on the OOD portal.
