Running an R script on the command line
There are several ways to launch an R script on the command line:
Rscript yourfile.R
R CMD BATCH yourfile.R
R --no-save < yourfile.R
The first approach (using the Rscript command) writes the output to stdout. The second approach (using the R CMD BATCH command) writes its output to a file (in this case yourfile.Rout). The third approach redirects the contents of yourfile.R to the standard input of the R executable. Note that with the third approach you must specify one of the following flags: --save, --no-save or --vanilla. Be careful with the option --vanilla, because it also tells R not to read your user profile and environment.
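To make the difference in output destinations concrete, here is a minimal bash sketch. The function names rout_name and run_r_three_ways and the capture files rscript_stdout.txt and stdin_stdout.txt are hypothetical and chosen only for illustration. rout_name assumes the default naming of R CMD BATCH, where the directory and a trailing .R are stripped and .Rout is appended.

```shell
# Hypothetical helper: predict the default output file of `R CMD BATCH`.
# Assumes the default rule: drop the directory and a trailing ".R",
# then append ".Rout" (the file is written to the current directory).
rout_name() {
    echo "$(basename "$1" .R).Rout"
}

# Hypothetical helper: run one script with each of the three methods.
# The two *.txt file names are made up for this example.
run_r_three_ways() {
    local script="$1"
    # 1) Rscript prints to stdout; capture it in a file here.
    Rscript "$script" > rscript_stdout.txt
    # 2) R CMD BATCH writes its own output file, see rout_name.
    R CMD BATCH "$script"
    # 3) Feed the script on stdin; a save flag is required.
    R --no-save < "$script" > stdin_stdout.txt
}
```

For example, rout_name yourfile.R prints yourfile.Rout, which is where the output of the second method should end up.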
Using R with OpenMPI
In addition to the examples for running R in parallel in Your first R job, we provide here a basic HelloWorld example for using R with OpenMPI.
The R script
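We will use the following R script, saved in a file called test_r_mpi.R:
library(Rmpi)

# Rank of this process and total number of processes in communicator 0
id <- mpi.comm.rank(comm = 0)
np <- mpi.comm.size(comm = 0)

# Name of the node this process is running on
hostname <- mpi.get.processor.name()

msg <- sprintf("Hello world from process %03d of %03d, on host %s\n", id, np, hostname)
cat(msg)

# Wait for all processes, then shut down MPI
mpi.barrier(comm = 0)
mpi.finalize()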
The Slurm batch file
Now we need a Slurm batch file, which we call test_r_mpi.slurm, to run the R script as a batch job:
ALICE
#!/bin/bash
#SBATCH --job-name=test_r_mpi     # Job name
#SBATCH --output=%x_%j.out        # Output file name
#SBATCH --partition=testing       # Partition
#SBATCH --time=00:05:00           # Time limit
#SBATCH --nodes=2                 # Number of nodes
#SBATCH --ntasks-per-node=4       # MPI processes per node

module purge
module load slurm
module add R/4.0.5-foss-2020b

srun Rscript test_r_mpi.R
SHARK
#!/bin/bash
#SBATCH --job-name=test_r_mpi     # Job name
#SBATCH --output=%x_%j.out        # Output file name
#SBATCH --partition=short         # Partition
#SBATCH --time=00:05:00           # Time limit
#SBATCH --nodes=2                 # Number of nodes
#SBATCH --ntasks-per-node=4       # MPI processes per node

module purge
module load slurm
module add statistical/R/4.1.2/gcc.8.3.1
module add library/mpi/openmpi/4.1.1/gcc-8.3.1

srun Rscript test_r_mpi.R
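The batch file is submitted with sbatch. As a small sketch of how the %x_%j.out pattern turns into a file name, the hypothetical bash helper below (submit_and_predict_output is not a Slurm command) submits a batch file with sbatch --parsable, which prints only the job ID, optionally followed by a semicolon and the cluster name, and then prints the expected output file name. The job name has to be passed in by hand and must match the --job-name line of the batch file.

```shell
# Hypothetical helper: submit a batch file and print the output file
# name that the "%x_%j.out" pattern should produce.
#   $1 = batch file, $2 = job name used in "#SBATCH --job-name=..."
submit_and_predict_output() {
    local batch_file="$1" job_name="$2" job_id
    # --parsable prints "<job_id>" or "<job_id>;<cluster>"
    job_id="$(sbatch --parsable "$batch_file")" || return 1
    job_id="${job_id%%;*}"    # drop a trailing ";<cluster>" if present
    echo "${job_name}_${job_id}.out"
}
```

A call such as submit_and_predict_output test_r_mpi.slurm test_r_mpi would then print the name of the file to look for once the job has finished.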
After the job has run, Slurm will have created a file called test_r_mpi_<job_id>.out whose content will look something like this:
[me@<login_node> ]$ cat test_r_mpi_<job_id>.out
Hello world from process 000 of 008, on host res-hpc-gpu01
Hello world from process 001 of 008, on host res-hpc-gpu01
Hello world from process 002 of 008, on host res-hpc-gpu01
Hello world from process 003 of 008, on host res-hpc-gpu01
Hello world from process 004 of 008, on host res-hpc-gpu02
Hello world from process 005 of 008, on host res-hpc-gpu02
Hello world from process 006 of 008, on host res-hpc-gpu02
Hello world from process 007 of 008, on host res-hpc-gpu02
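One quick way to check that the processes were spread over the nodes as requested (2 nodes with 4 tasks each) is to count the greeting lines per host. The following bash sketch does this with awk. The function name ranks_per_host is hypothetical, and it assumes the line format produced by the R script above, where the host name is the last word on each line.

```shell
# Hypothetical helper: count how many MPI processes reported from each host.
# Expects lines of the form
#   "Hello world from process NNN of NNN, on host <hostname>"
ranks_per_host() {
    awk '/^Hello world from process/ { count[$NF]++ }
         END { for (h in count) print h, count[h] }' "$1" | sort
}
```

Called as ranks_per_host test_r_mpi_<job_id>.out, it prints one line per host with the number of processes that ran there.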
Running R interactively
You can also run R interactively, for example as an exercise or a quick test. For actual work, the recommended way is to run R in batch mode.
ALICE
[me@nodelogin02 ~]$ salloc --ntasks 1 --mem=1G --time=00:05:00 --partition=testing
salloc: Pending job allocation 656147
salloc: job 656147 queued and waiting for resources
salloc: job 656147 has been allocated resources
salloc: Granted job allocation 656147
salloc: Waiting for resource configuration
salloc: Nodes nodelogin01 are ready for job
[me@nodelogin02 ~]$ ssh nodelogin01
Last login: Tue Aug 9 15:35:06 2022 from p-cfer-016105.infra.leidenuniv.nl
[me@nodelogin01 ~]$ module load R/4.0.5-foss-2020b
[me@nodelogin01 ~]$ R

R version 4.0.5 (2021-03-31) -- "Shake and Throw"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> q()
Save workspace image? [y/n/c]: n
[me@nodelogin01 ~]$ exit
logout
Connection to nodelogin01 closed.
[me@nodelogin02 ~]$ exit
exit
salloc: Relinquishing job allocation 656147
salloc: Job allocation 656147 has been revoked.
SHARK
[me@res-hpc-lo02 ~]$ salloc --ntasks 1 --mem=1G --time=00:05:00 --partition=short
salloc: Granted job allocation 11441259
salloc: Waiting for resource configuration
salloc: Nodes res-hpc-gpu09 are ready for job
[me@res-hpc-gpu09 ~]$ module load statistical/R/4.1.2/gcc.8.3.1
[me@res-hpc-gpu09 ~]$ R

R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> q()
Save workspace image? [y/n/c]: n
[me@res-hpc-gpu09 ~]$ exit
exit
salloc: Relinquishing job allocation 11441259
salloc: Job allocation 11441259 has been revoked.
RStudio
RStudio is an Integrated Development Environment (IDE) for R. It includes a console and a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, debugging, history and workspace management. For more information, see the RStudio webpage.
RStudio is installed on both clusters and can be invoked on a login node as follows:
ALICE
module load RStudio/1.2.5033
rstudio
Note that you also need to load a version of R.
SHARK
module load statistical/RStudio/1.3.959/gcc-8.3.1
rstudio
RStudio cannot be executed from a Slurm job submitted with sbatch, but you can use it by running an interactive job.
Interactive jobs for RStudio
Interactive jobs can be submitted to the queue with the Slurm command salloc, which takes the same options as Slurm batch files.
Since interactive jobs also go into the queue, it can take some time until your job runs depending on the load on the cluster. Therefore, it is best to submit the interactive job from a screen or tmux session.
Here is an example of a salloc command:
salloc --ntasks=1 --cpus-per-task=1 --mem-per-cpu=10GB --partition=<partition_name> --time=00:05:00 --x11 --pty bash
where <partition_name> needs to be replaced by a valid partition. The option --x11 is important for forwarding X11 from the compute node on which the job is running. The job uses a rather short running time because it is intended only for testing how to launch RStudio.
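X11 forwarding with --x11 can only pass on a display that your login session already has, which normally means you logged in with X11 forwarding enabled (for example ssh -X). The bash sketch below is a hypothetical pre-flight check (x11_ready is not a Slurm command) that tests whether the DISPLAY variable is set before you request the interactive job.

```shell
# Hypothetical pre-flight check: --x11 needs a display in the submitting
# shell, so verify that DISPLAY is set before calling salloc.
x11_ready() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; log in with X11 forwarding (e.g. ssh -X) first." >&2
        return 1
    fi
    echo "X11 display available: ${DISPLAY}"
}

# Possible usage (not executed here):
#   x11_ready && salloc <options as in the example above>
```

If the check fails, requesting the job with --x11 is unlikely to work, so it is worth fixing the login session first.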
Once your interactive job is running, you can launch RStudio in the following way:
ALICE
module load R/3.6.2-fosscuda-2019b
module load RStudio/1.2.5033
export XDG_RUNTIME_DIR=/tmp/runtime-<your_alice_user_name>
srun rstudio
where you should replace <your_alice_user_name> with your username on ALICE. The last step will launch RStudio on the compute node that has been assigned to you.
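The export line above only sets the variable. As a hedged sketch, the bash function below also creates the directory and restricts it to your own user, on the assumption that it may not exist yet on the compute node. The name setup_runtime_dir is hypothetical, and the default path simply follows the /tmp/runtime-<username> pattern from the commands above, using $USER for the username.

```shell
# Hypothetical helper: create a private runtime directory and export it.
# Assumption: the directory may not exist yet on the compute node.
#   $1 = directory (optional, defaults to /tmp/runtime-$USER)
setup_runtime_dir() {
    local dir="${1:-/tmp/runtime-${USER}}"
    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1    # readable and writable by the owner only
    export XDG_RUNTIME_DIR="$dir"
}
```

Running setup_runtime_dir before srun rstudio would then take the place of the plain export line.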
SHARK
module purge
module add statistical/RStudio/1.3.959/gcc-8.3.1
rstudio
RStudio on the OOD (OpenOnDemand portal)
OOD is only available to SHARK users.
You can also start an RStudio server on the OOD portal.