It is possible to run Jupyter notebooks interactively with a Slurm batch job, i.e., not using an interactive job.
If you want to access the Jupyter notebook web interface from your local workstation, you must make sure that you have a tunnel available to the port on which the web interface serves the Jupyter notebook output. This can be achieved with a few simple settings in the batch job and the ssh config on your local workstation.
Setting up the batch job
First, we need to set up the batch job. Log in to ALICE or SHARK and create a Slurm batch file like one of the following, depending on the cluster that you work on:
ALICE
In this example, we will make use of an existing module on ALICE. If you need a different version or additional packages, you can always create your own virtual environment and install everything in there.
The sbatch settings used here were chosen for demonstration purposes only.
#!/bin/bash
#SBATCH --job-name=jupyter_notebook
#SBATCH --mem=8G
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=cpu-short
#SBATCH --time=01:00:00
#SBATCH --output=%x_%j.out

unset XDG_RUNTIME_DIR
module load IPython/7.7.0-foss-2019a-Python-3.7.2

echo "Running the notebook on $(hostname)"
IPADDR=$(hostname -i)

echo "#### Starting notebook"
srun jupyter notebook --ip=$IPADDR --no-browser --port=8989
echo "#### Terminated notebook. Done"
SHARK
On SHARK, Jupyter notebooks are also available through the Open OnDemand portal, which is most likely easier to use than the approach described here.
In this example, we assume that you have created a virtual environment and installed the jupyterlab package in it. Of course, you can also use conda.
The sbatch settings used here were chosen for demonstration purposes only.
#!/bin/bash
#SBATCH --job-name=jupyter_notebook
#SBATCH --mem=8G
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=short
#SBATCH --time=00:15:00
#SBATCH --output=%x_%j.out

unset XDG_RUNTIME_DIR
module load system/python/3.10.2

# Replace the path to your virtual environment as needed.
# A virtual environment is activated by sourcing its bin/activate script.
echo "#### Sourcing virtual env"
source /path/to/your/virtualenv/bin/activate
echo "#### ... Done"

echo "#### Running the notebook on $(hostname)"
IPADDR=$(hostname -i)

echo "#### Starting notebook"
srun jupyter notebook --ip=$IPADDR --no-browser --port=8989
echo "#### Terminated notebook. Done"
Save the batch file, for example as jupyter_notebook.slurm, and submit it with sbatch jupyter_notebook.slurm.
Open the output file, which in our example will be named something like jupyter_notebook_<jobid>.out, and take note of the notebook token, the port that is used, and the name of the node that the job is running on. Do not specify the node in the batch script; let Slurm decide which node to use.
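The three values can also be pulled out of the output file on the command line. The following is a minimal sketch, not part of the original instructions: the function names are hypothetical, and it assumes that Jupyter writes a URL of the form http://<host>:<port>/...?token=<hex token> to the output file and that the batch script prints the "Running the notebook on ..." line shown above. The exact log format can differ between Jupyter versions, so check the file by eye if nothing is printed.

```shell
# Hypothetical helpers for reading a Slurm output file written by the batch
# scripts above. Each takes the path to the output file as its only argument.

# Print the first URL in the file that carries a token.
notebook_url() {
    grep -oE 'https?://[^[:space:]]+token=[0-9a-f]+' "$1" | head -n 1
}

# Print the port from that URL.
notebook_port() {
    notebook_url "$1" | sed -E 's#^https?://[^/:]+:([0-9]+).*#\1#'
}

# Print the token from that URL.
notebook_token() {
    notebook_url "$1" | sed -E 's#.*token=([0-9a-f]+).*#\1#'
}

# Print the node name from the "Running the notebook on <node>" line.
notebook_node() {
    sed -n 's/.*Running the notebook on //p' "$1" | head -n 1
}
```

For example, notebook_port jupyter_notebook_<jobid>.out prints the port to use in the ssh config on your local workstation.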
Accessing the Jupyter notebook on your local workstation
On your local computer, set up the connection to the port of the web interface. You can, for instance, add the lines shown below for your cluster to your ~/.ssh/config file.
Port 8989 is used here only as an example. If another notebook is already running on the same node and port, Jupyter will launch the notebook on a different port, which you can see in the Slurm output file. In that case, adjust the port in your ssh config accordingly.
ALICE
Host alice-notebook
    HostName login1.alice.universiteitleiden.nl
    LocalForward 8989 <node_name>:8989
    ProxyJump <username>@ssh-gw.alice.universiteitleiden.nl:22
    User <username>
    ServerAliveInterval 60
where <username> should be replaced by your ALICE user name and <node_name> by the name of the node that your job is running on.
SHARK
Host shark-notebook
    HostName res-hpc-lo02.researchlumc.nl
    LocalForward 8989 <node_name>:8989
    ProxyJump <username>@res-ssh-alg01.researchlumc.nl:22
    User <username>
    ServerAliveInterval 60
where <username> should be replaced by your SHARK user name and <node_name> by the name of the node that your job is running on.
If you work from within the LUMC network, you do not need the line with the ProxyJump setting.
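Because the node and possibly the port change from job to job, it can be convenient to pass the forward on the command line with ssh -L instead of editing the config file each time. The sketch below is one way to do this; the helper name tunnel_cmd is hypothetical, and it assumes that a Host alias such as alice-notebook or shark-notebook already exists in your ~/.ssh/config. It only prints the command so that you can inspect it before running it.

```shell
# Hypothetical helper: print an ssh command that forwards a local port to the
# same port on the compute node, going through a Host alias from ~/.ssh/config.
tunnel_cmd() {
    node=$1                            # node the job runs on
    port=$2                            # port reported in the Slurm output file
    host_alias=${3:-alice-notebook}    # Host alias from ~/.ssh/config
    # -N opens the tunnel without starting a remote shell.
    printf 'ssh -N -L %s:%s:%s %s\n' "$port" "$node" "$port" "$host_alias"
}

# Example with made-up values: a notebook on node001 that ended up on port 8990.
tunnel_cmd node001 8990
```

Any LocalForward line in the config entry remains active alongside the forward given on the command line.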
You can always look up the node that your job is running on with squeue or scontrol show job <job_id>, or by looking into the output file that contains the token (assuming that you included the line that prints the hostname).
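As a small sketch of the scontrol route, the function below filters the node name out of the scontrol output. The name job_node is hypothetical, and the filter assumes that the output contains a line beginning with NodeList=, as scontrol show job normally prints for a running job.

```shell
# Hypothetical helper: read `scontrol show job <job_id>` output on stdin and
# print the value of the NodeList field. Lines such as ReqNodeList= and
# ExcNodeList= are skipped because they do not begin with NodeList=.
job_node() {
    sed -nE 's/^[[:space:]]*NodeList=([^[:space:]]+).*/\1/p' | head -n 1
}

# Usage on the cluster (replace <job_id> with your job ID):
#   scontrol show job <job_id> | job_node
#
# Alternative with squeue, printing only the node list of your own jobs
# named jupyter_notebook:
#   squeue -u "$USER" -n jupyter_notebook -h -o '%N'
```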
Then open a local terminal and run the following, using shark-notebook instead if you work on SHARK:
ssh alice-notebook
Afterwards, open your browser and point it to
http://localhost:8989/?token=<token from the Slurm output file>
and voilà, you can use Jupyter notebooks on the cluster from your local workstation.