It is possible to run Jupyter notebooks (in a Jupyter server session) interactively from a regular Slurm batch job, i.e., without using an interactive job.

If you want to access the web interface of the Jupyter server from your local workstation, you need an SSH tunnel to the port on which the web interface is served on the compute node. This can be achieved with a few simple settings in the batch job and in the SSH config on your local workstation.


Setting up the batch job

First, we need to set up the batch job. Log in to ALICE or SHARK and, depending on the cluster that you work on, create a Slurm batch file like one of the following:


ALICE

In this example, we will make use of the existing JupyterLab module on ALICE. This module will start a Jupyter server in which you can run notebooks.

If you need a different version or additional packages, you can always create your own virtual environment and install everything in there.
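As a rough sketch, creating such an environment could look as follows. The Python module name and the environment path are only examples (check module avail Python for what is actually installed on ALICE) and are not prescribed by this page:

Code Block
# Sketch only: create a personal virtual environment with JupyterLab.
# The Python module name is an example; pick a concrete version from `module avail Python`.
module load ALICE/default
module load Python
python3 -m venv "$HOME/envs/jupyterlab"
source "$HOME/envs/jupyterlab/bin/activate"
pip install --upgrade pip jupyterlab

In the batch script below you would then replace the module load JupyterLab line by sourcing this environment (source $HOME/envs/jupyterlab/bin/activate).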

The sbatch settings used here were chosen for demonstration purposes only.

Code Block
#!/bin/bash

#SBATCH --job-name=jupyter_notebook
#SBATCH --mem=4G
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=cpu-short
#SBATCH --time=01:00:00
#SBATCH --output=%x_%j.out

unset XDG_RUNTIME_DIR
module load ALICE/default
module load JupyterLab

echo "Running the notebook on $(hostname)"
IPADDR=$(hostname -i)

port=8989

SALT="$(head /dev/urandom | tr -dc 'A-Z-a-z-0-9{}[]=+.:-_' | head -c 16;echo;)"
password="$(head /dev/urandom | tr -dc 'A-Z-a-z-0-9{}[]=+.:-_' | head -c 16;echo;)"
PASSWORD_SHA="$(echo -n "${password}${SALT}" | openssl dgst -sha256 | awk '{print $NF}')"
echo "-------------------------"
echo "Log in using the password: $password"
echo "-------------------------"

# the jupyter server config file
export CONFIG_FILE="${PWD}/config.py"

(
umask 077
cat > "${CONFIG_FILE}" << EOL
c.ServerApp.ip = '${IPADDR}'
c.ServerApp.port = ${port}
c.ServerApp.port_retries = 1
c.ServerApp.password = u'sha256:${SALT}:${PASSWORD_SHA}'
c.ServerApp.base_url = '/node/$(hostname)/${port}/'
c.ServerApp.open_browser = False
c.ServerApp.allow_origin = '*'
c.ServerApp.root_dir = '${HOME}'
c.ServerApp.disable_check_xsrf = True
EOL
)

echo "#### Starting the JupyterLab server" 
set -x
jupyter lab --config="${CONFIG_FILE}"
echo "#### Terminated notebookJupyterLab server. Done"


SHARK

Info

On SHARK, Jupyter notebooks are also available through the Open OnDemand portal, which is usually easier to use than the approach described here.

In this example, we will make use of the existing JupyterLab module on SHARK. This module will start a Jupyter server in which you can run notebooks. Of course, you can also create your own virtual environment (or use conda) and install the jupyterlab package in it yourself.

The sbatch settings used here were chosen for demonstration purposes only.

Code Block
#!/bin/bash

#SBATCH --job-name=jupyter_notebook
#SBATCH --mem=8G
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=short
#SBATCH --time=00:15:00
#SBATCH --output=%x_%j.out

unset XDG_RUNTIME_DIR
module load tools/jupyterlab/4.3.1

echo "Running the notebook on $(hostname)"
IPADDR=$(hostname -i)

port=8989

SALT="$(head /dev/urandom | tr -dc 'A-Z-a-z-0-9{}[]=+.:-_' | head -c 16;echo;)"
password="$(head /dev/urandom | tr -dc 'A-Z-a-z-0-9{}[]=+.:-_' | head -c 16;echo;)"
PASSWORD_SHA="$(echo -n "${password}${SALT}" | openssl dgst -sha256 | awk '{print $NF}')"
echo "-------------------------"
echo "Log in using the password: $password"
echo "-------------------------"

# the jupyter server config file
export CONFIG_FILE="${PWD}/config.py"

(
umask 077
cat > "${CONFIG_FILE}" << EOL
c.ServerApp.ip = '${IPADDR}'
c.ServerApp.port = ${port}
c.ServerApp.port_retries = 0
c.ServerApp.password = u'sha256:${SALT}:${PASSWORD_SHA}'
c.ServerApp.base_url = '/node/$(hostname)/${port}/'
c.ServerApp.open_browser = False
c.ServerApp.allow_origin = '*'
c.ServerApp.root_dir = '${HOME}'
c.ServerApp.disable_check_xsrf = True
EOL
)

echo "#### Starting the notebookJupyterLab server" srun
jupyterset notebook --ip=$IPADDR --no-browser-x
jupyter lab --port=8989config="${CONFIG_FILE}"
echo "#### Terminated JupyterLab notebookserver. Done"

Port 8989, which is used above, is just an example; the server will switch to a different port if it is not available.
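If you want to reduce the chance of a port clash, you could also let the batch script pick a random high port instead of hard-coding 8989. This is only a sketch and not required; the range below is arbitrary, and the LocalForward setting on your workstation (see further down) has to match whatever port ends up being used:

Code Block
# Sketch only: choose a random port between 20000 and 40000 instead of port=8989.
# The port that is actually used appears in the URL printed to the Slurm output file.
port=$(shuf -i 20000-40000 -n 1)
echo "Using port ${port} on $(hostname)"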

Save the batch file, for example as jupyter_notebook.slurm, and submit it with sbatch jupyter_notebook.slurm.

Open the output file, which in our example will be named something like jupyter_notebook_<jobid>.out, and take note of the password, the port that is used and the name of the node that the job is running on. Do not specify the node in the batch script, but let Slurm take care of which node is used.
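For example, you could pull the relevant lines out of the output file like this; the grep patterns match the echo statements in the scripts above:

Code Block
# replace <jobid> by the ID reported by sbatch
grep "Running the notebook on" jupyter_notebook_<jobid>.out    # node the job runs on
grep "Log in using the password" jupyter_notebook_<jobid>.out  # password for the web interface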

Accessing the Jupyter notebook on your local workstation

On your local computer, set up the connection to the port of the web interface. You can, for instance, add the following lines to your ~/.ssh/config file:

Note

Port 8989 used here is just an example. If a notebook server is already running on this port on the node, Jupyter will start on another port; you can see this in the Slurm output file. In that case, adjust the port in your SSH config entry accordingly.


ALICE

Code Block
Host alice-notebook
   HostName login1.alice.universiteitleiden.nl
   LocalForward 8989 <node_name>:8989
   ProxyJump <username>@ssh-gw.alice.universiteitleiden.nl:22
   User <username>
   ServerAliveInterval 60

where <username> should be replaced by your ALICE username and <node_name> by the name of the node that your job is running on.
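If you prefer not to edit your ~/.ssh/config, the same tunnel can be opened ad hoc with a single command. This is only a sketch using the same placeholders as above; it relies on OpenSSH's -J jump-host option:

Code Block
ssh -L 8989:<node_name>:8989 -J <username>@ssh-gw.alice.universiteitleiden.nl <username>@login1.alice.universiteitleiden.nl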


SHARK

Code Block
Host shark-notebook
   HostName res-hpc-lo02.researchlumc.nl
   LocalForward 8989 <node_name>:8989
   ProxyJump <username>@res-ssh-alg01.researchlumc.nl:22
   User <username>
   ServerAliveInterval 60

where <username> should be replaced by your SHARK username and <node_name> by the name of the node that your job is running on.

If you work from within the LUMC network, you do not need the line with the ProxyJump setting.
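In that case the entry would simply look like this:

Code Block
Host shark-notebook
   HostName res-hpc-lo02.researchlumc.nl
   LocalForward 8989 <node_name>:8989
   User <username>
   ServerAliveInterval 60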


You can always look up the node that your job is running on with squeue, with scontrol show job <job_id>, or by checking the output file that contains the password (assuming you kept the line that prints the hostname).
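For example, using the standard Slurm commands and the job name from the scripts above:

Code Block
squeue -u $USER -n jupyter_notebook            # the NODELIST column shows the node
scontrol show job <job_id> | grep -i nodelist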

Then open a local terminal and run the following (use shark-notebook instead of alice-notebook if you set up the SHARK configuration):

Code Block
 ssh alice-notebook

Afterwards, open your browser and point it to

Code Block
http://localhost:8989/node/<node_name>/8989/lab

where you need to replace <node_name> by the name of the compute node that the job is running on. Alternatively, the URL is also listed in the Slurm output file.

The JupyterLab server will ask you for an access token or password; use the password printed in your Slurm output file. After entering it, you can use the JupyterLab server to run notebooks or even access your files from the browser on your local workstation.

Note

Always shut down the server or cancel the job when you are done and the job has not yet reached its time limit. DO NOT leave idle jobs running.
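You can cancel the job from a login node with the standard Slurm commands, for example:

Code Block
scancel <job_id>
# or cancel by job name if this is your only jupyter_notebook job:
scancel -u $USER -n jupyter_notebook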