Running Jupyter notebooks in an interactive session

It is possible to run Jupyter notebooks interactively (in a Jupyter server session) from a Slurm batch job, i.e., without using an interactive job.

If you want to access the Jupyter server web interface from your local workstation, you need a tunnel to the port on which the Jupyter server serves its web interface. This can be achieved with a few simple settings in the batch job and in the ssh config on your local workstation.

Setting up the batch job

First, we need to set up the batch job. Log in to ALICE or SHARK and create a Slurm batch file like one of the following, depending on the cluster that you work on:


ALICE

In this example, we will make use of the existing JupyterLab module on ALICE. This module will start a Jupyter server in which you can run notebooks.

If you need additional packages, you can always create your own virtual environment and install everything in there.

The sbatch settings used here were chosen for demonstration purposes only.

#!/bin/bash
#SBATCH --job-name=jupyter_notebook
#SBATCH --mem=4G
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=cpu-short
#SBATCH --time=01:00:00
#SBATCH --output=%x_%j.out

unset XDG_RUNTIME_DIR
module load ALICE/default
module load JupyterLab

echo "Running the notebook on $(hostname)"
IPADDR=$(hostname -i)
port=8989
SALT="$(head /dev/urandom | tr -dc 'A-Z-a-z-0-9{}[]=+.:-_' | head -c 16;echo;)"
password="$(head /dev/urandom | tr -dc 'A-Z-a-z-0-9{}[]=+.:-_' | head -c 16;echo;)"
PASSWORD_SHA="$(echo -n "${password}${SALT}" | openssl dgst -sha256 | awk '{print $NF}')"

echo "-------------------------"
echo "Log in using the password: $password"
echo "-------------------------"

# the jupyter server config file
export CONFIG_FILE="${PWD}/config.py"

(
umask 077
cat > "${CONFIG_FILE}" << EOL
c.ServerApp.ip = '${IPADDR}'
c.ServerApp.port = ${port}
c.ServerApp.port_retries = 1
c.ServerApp.password = u'sha256:${SALT}:${PASSWORD_SHA}'
c.ServerApp.base_url = '/node/$(hostname)/${port}/'
c.ServerApp.open_browser = False
c.ServerApp.allow_origin = '*'
c.ServerApp.root_dir = '${HOME}'
c.ServerApp.disable_check_xsrf = True
EOL
)

echo "#### Starting the JupyterLab server"
set -x
jupyter lab --config="${CONFIG_FILE}"
echo "#### Terminated JupyterLab server. Done"

 


SHARK

On SHARK, Jupyter notebooks are also available through the Open OnDemand portal, which is most likely easier to use than the approach described here.

In this example, we will make use of the existing JupyterLab module on SHARK. This module will start a Jupyter server in which you can run notebooks.

The sbatch settings used here were chosen for demonstration purposes only.

#!/bin/bash
#SBATCH --job-name=jupyter_notebook
#SBATCH --mem=8G
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=short
#SBATCH --time=00:15:00
#SBATCH --output=%x_%j.out

unset XDG_RUNTIME_DIR
# load the JupyterLab module
module load tools/jupyterlab/4.3.1

echo "Running the notebook on $(hostname)"
IPADDR=$(hostname -i)
port=8989
SALT="$(head /dev/urandom | tr -dc 'A-Z-a-z-0-9{}[]=+.:-_' | head -c 16;echo;)"
password="$(head /dev/urandom | tr -dc 'A-Z-a-z-0-9{}[]=+.:-_' | head -c 16;echo;)"
PASSWORD_SHA="$(echo -n "${password}${SALT}" | openssl dgst -sha256 | awk '{print $NF}')"

echo "-------------------------"
echo "Log in using the password: $password"
echo "-------------------------"

# the jupyter server config file
export CONFIG_FILE="${PWD}/config.py"

(
umask 077
cat > "${CONFIG_FILE}" << EOL
c.ServerApp.ip = '${IPADDR}'
c.ServerApp.port = ${port}
c.ServerApp.port_retries = 0
c.ServerApp.password = u'sha256:${SALT}:${PASSWORD_SHA}'
c.ServerApp.base_url = '/node/$(hostname)/${port}/'
c.ServerApp.open_browser = False
c.ServerApp.allow_origin = '*'
c.ServerApp.root_dir = '${HOME}'
c.ServerApp.disable_check_xsrf = True
EOL
)

echo "#### Starting the JupyterLab server"
set -x
jupyter lab --config="${CONFIG_FILE}"
echo "#### Terminated JupyterLab server. Done"

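Port 8989, which is used above, is just an example. If the port is not available, the server tries other ports only as many times as the c.ServerApp.port_retries setting allows, so check the Slurm output file for the port that is actually used.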

Save the batch file, for example as jupyter_notebook.slurm, and submit it with sbatch jupyter_notebook.slurm.

Open the output file, which in our example will be named something like jupyter_notebook_<jobid>.out, and take note of the password, the port that is used, and the name of the node that the job is running on. Do not specify the node in the batch script; let Slurm decide which node to use.
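The node name and password can also be pulled out of the output file with a small helper. The following is a minimal sketch, not part of the cluster software: the function name jupyter_job_info is hypothetical, and it assumes that the two echo lines from the batch scripts above ("Running the notebook on ..." and "Log in using the password: ...") are unchanged.

```shell
#!/bin/bash
# Hypothetical helper: print the node name and the password that the
# batch script wrote to the Slurm output file.
# Assumes the echo lines of the batch script above are unchanged.
jupyter_job_info() {
    local outfile="$1"
    local node password
    # Keep only the text after the fixed prefix; use the first match.
    node=$(sed -n 's/^Running the notebook on //p' "$outfile" | head -n 1)
    password=$(sed -n 's/^Log in using the password: //p' "$outfile" | head -n 1)
    printf 'node=%s\n' "$node"
    printf 'password=%s\n' "$password"
}

# Example call (replace <jobid> with your job ID):
# jupyter_job_info jupyter_notebook_<jobid>.out
```

The port is not extracted here because it is printed by the Jupyter server itself; look for the server URL further down in the same file.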

Accessing the Jupyter notebook on your local workstation

On your local computer, set up the connection to the port of the web interface. For instance, you can add the following lines to your ~/.ssh/config file:

Port 8989, which is used here, is just an example. If a notebook is already running on this port on the same node, Jupyter may launch the server on another port, which you can see in the Slurm output file. In that case, adjust the port in your ssh config accordingly.


ALICE

Host alice-notebook
    HostName login1.alice.universiteitleiden.nl
    LocalForward 8989 <node_name>:8989
    ProxyJump <username>@ssh-gw.alice.universiteitleiden.nl:22
    User <username>
    ServerAliveInterval 60

where <username> should be replaced by your ALICE user name and <node_name> by the name of the node that your job is running on.
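If you prefer not to edit ~/.ssh/config for a short session, the same tunnel can be expressed as a single ssh command. The sketch below only prints that command. The function name alice_tunnel_cmd is hypothetical, and it assumes the host names and settings of the alice-notebook entry above.

```shell
#!/bin/bash
# Hypothetical helper: print a one-off ssh command that opens the same
# tunnel as the alice-notebook config entry above.
#   -N  do not run a remote command, only forward the port
#   -L  local port forward (LocalForward)
#   -J  jump host (ProxyJump)
alice_tunnel_cmd() {
    local user="$1" node="$2" port="${3:-8989}"
    printf 'ssh -N -o ServerAliveInterval=60 -L %s:%s:%s -J %s@ssh-gw.alice.universiteitleiden.nl:22 %s@login1.alice.universiteitleiden.nl\n' \
        "$port" "$node" "$port" "$user" "$user"
}

# Example call:
# alice_tunnel_cmd <username> <node_name> 8989
```

Run the printed command in a local terminal and keep it open while you work; the tunnel closes when you stop the command with Ctrl+C.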


SHARK

where <username> should be replaced by your SHARK user name and <node_name> by the name of the node that your job is running on.

If you work from within the LUMC network, you do not need the line with the ProxyJump setting.


You can always look up the node that your job is running on with squeue or scontrol show job <job_id>, or by looking in the output file that contains the password (assuming that you kept the line that prints the hostname).
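As a minimal sketch of the squeue variant, the helper below prints only the node list of one job. The function name job_node is hypothetical; the squeue options used are -h (no header), -j (job ID) and -o (output format), where %N is the format field for the allocated nodes.

```shell
#!/bin/bash
# Hypothetical helper: print the node(s) allocated to a job.
# Must be run on the cluster, where the squeue command is available.
job_node() {
    local job_id="$1"
    squeue -h -j "$job_id" -o '%N'
}

# Example call:
# job_node <job_id>
```

The output is empty while the job is still pending, because no node has been allocated yet.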

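Then open a local terminal and start the tunnel by connecting to the host alias from your ssh config, for example with ssh alice-notebook for the ALICE entry above.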

Afterwards, open your browser and point it to

where you need to replace <node_name> by the name of the compute node on which the job is running. Alternatively, the URL is also listed in the Slurm output file.
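The URL can be assembled from the node name and the port. The sketch below is an assumption derived from the settings on this page, not an official address: the path mirrors the c.ServerApp.base_url setting ('/node/<node_name>/<port>/'), and the local port is taken to be the same as the remote port, as in the LocalForward example. The function name jupyter_url is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build the browser URL for the tunnelled server.
# Assumptions: the path follows c.ServerApp.base_url from the batch script
# and the local port equals the remote port (as in the LocalForward line).
jupyter_url() {
    local node="$1" port="${2:-8989}"
    printf 'http://localhost:%s/node/%s/%s/\n' "$port" "$node" "$port"
}

# Example call:
# jupyter_url <node_name> 8989
```

The node name must match exactly what the batch script printed with $(hostname), since that string is what ends up in base_url. If in doubt, use the URL from the Slurm output file.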

The JupyterLab server will ask you for an access token, which is the password printed in your Slurm output file. After entering the password, you can use the JupyterLab server to run notebooks or even access your files from the browser on your local workstation.

Always shut down the server or cancel the job when you are done and the job has not yet reached its time limit. DO NOT leave idle jobs running.
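Cancelling the job can be done with scancel <job_id>. As a minimal sketch, the helper below cancels all of your jobs that carry the job name used in the batch scripts above. The function name stop_jupyter_jobs is hypothetical; it relies on the squeue options -h (no header), -u (user), -n (job name) and -o (output format, %i is the job ID).

```shell
#!/bin/bash
# Hypothetical helper: cancel all of your jobs with the given name
# (default: the job name from the batch scripts above).
# Must be run on the cluster, where squeue and scancel are available.
stop_jupyter_jobs() {
    local name="${1:-jupyter_notebook}"
    local job_id
    for job_id in $(squeue -h -u "$USER" -n "$name" -o '%i'); do
        echo "Cancelling job ${job_id}"
        scancel "$job_id"
    done
}

# Example call:
# stop_jupyter_jobs
```

Note that this cancels every matching job, so use a distinct job name if you run several Jupyter sessions and only want to stop one of them.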