JupyterHub FirecRESTSpawner¶

In most setups, running JupyterHub at an HPC center requires installing the service on premises and giving it workload scheduler admin keys and sudo access.

Using FirecREST as an abstraction layer over the workload scheduler avoids these issues: the service can be installed on a cloud provider, no admin credentials need to be stored in the service, and the same setup works for systems with different schedulers and identity providers.

This tutorial explains how to run JupyterHub with FirecRESTSpawner using the Docker demo of FirecREST API v2.

FirecRESTSpawner is a tool for launching Jupyter Notebook servers from JupyterHub on HPC clusters through FirecREST. It supports both version 1 and version 2 of the API. It can be deployed on Kubernetes as part of JupyterHub and configured to target different systems.

In this tutorial, we will set up a simplified environment on a local machine, including:

a Docker Compose deployment of FirecREST, a single-node Slurm cluster, and a Keycloak server that serves as the identity provider for authentication
a local installation of JupyterHub, configured to launch notebooks on the Slurm cluster

This deployment not only demonstrates the use case but also serves as a platform for testing and developing FirecRESTSpawner.

Requirements¶

For this tutorial you will need:

a recent installation of Docker, which includes the docker compose command (or the older docker-compose command line tool)
a Python installation (version 3.9 or higher)
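The requirements above can be checked from a terminal before starting. The following is a minimal sketch, not part of the tutorial files: the helper name version_ge is hypothetical, and it assumes that the Python interpreter is available as python3.

```shell
#!/usr/bin/env bash
# Sketch: check the tutorial prerequisites (Docker Compose and Python >= 3.9).

# Return 0 if version "$1" (MAJOR.MINOR[.PATCH]) is at least "$2" (MAJOR.MINOR).
version_ge() {
    local have_major=${1%%.*} have_minor=${1#*.}
    local want_major=${2%%.*} want_minor=${2#*.}
    have_minor=${have_minor%%.*}
    want_minor=${want_minor%%.*}
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
    else
        [ "$have_minor" -ge "$want_minor" ]
    fi
}

# Docker Compose: either the `docker compose` plugin or the older standalone tool.
if docker compose version >/dev/null 2>&1; then
    echo "docker compose: found"
elif command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose (standalone): found"
else
    echo "docker compose: NOT found"
fi

# Python: assumes the interpreter is installed as python3.
if command -v python3 >/dev/null 2>&1; then
    py_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if version_ge "$py_version" 3.9; then
        echo "python3 $py_version: OK"
    else
        echo "python3 $py_version: too old, need 3.9 or higher"
    fi
else
    echo "python3: NOT found"
fi
```

The script only reports what it finds and does not install or change anything.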

Deployment of FirecREST and Slurm cluster¶

This tutorial builds on the Docker demo of FirecREST. We will use the small docker-compose-jhub.yaml file to override some settings in the FirecREST demo. This can be done by passing both files to the docker compose command.

First, we clone the FirecREST repository

Cloning FirecREST official repo

git clone https://github.com/eth-cscs/firecrest-v2.git

and then launch the deployment

Launch docker compose

cd firecrest-v2
export JHUB_DOCKERFILE_DIR=./docs/use_cases/jupyterhub/
docker compose -f docker-compose.yml -f $JHUB_DOCKERFILE_DIR/docker-compose-jhub.yaml up

This step takes a few minutes. In the meantime, we can install JupyterHub in a local virtual environment.
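Since the deployment needs some time to come up, it can be handy to poll it from a second terminal instead of watching the logs. The helper below is a sketch under assumptions: wait_for_http is a hypothetical name, curl must be installed, and any HTTP response from the given URL is treated as "the service is listening" (it does not check that FirecREST is fully configured).

```shell
#!/usr/bin/env bash
# Sketch: poll a URL until it answers with any HTTP response.
#   $1: URL to poll
#   $2: maximum number of attempts (default 60)
#   $3: seconds to sleep between attempts (default 5)
wait_for_http() {
    local url=$1 attempts=${2:-60} delay=${3:-5}
    local i=0
    while [ "$i" -lt "$attempts" ]; do
        # Without -f, curl exits 0 on any HTTP status, so a 404 still means "up".
        if curl -s -o /dev/null --max-time 2 "$url"; then
            echo "up: $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out waiting for $url" >&2
    return 1
}
```

For this deployment, where FirecREST listens on port 8000, a call could look like wait_for_http http://localhost:8000 60 5.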

Install JupyterHub and FirecRESTSpawner¶

An easy way to install JupyterHub is via Miniconda. We need to download the Miniconda installer for our platform and install it using the following command (replace <arch> with the architecture of your workstation, for instance Linux-x86_64 or MacOSX-arm64):

Install Miniconda

bash Miniconda3-latest-<arch>.sh -p /path/to/mc-jhub -b

Here we use -p to pass the absolute path to the install directory and -b to run the installer in batch mode, which skips the interactive prompts and assumes that you accept the license terms.
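If you are unsure which value to use for <arch>, it can be derived from the output of uname. The following sketch uses a hypothetical helper, miniconda_arch, and assumes that the installer files follow the naming pattern Miniconda3-latest-<OS>-<machine>.sh, with Darwin mapped to MacOSX and the machine name passed through unchanged.

```shell
#!/usr/bin/env bash
# Sketch: map `uname -s` and `uname -m` output to the <arch> part of the
# Miniconda installer file name.
#   $1: operating system name, as printed by `uname -s`
#   $2: machine name, as printed by `uname -m`
miniconda_arch() {
    local os=$1 machine=$2 os_part
    case "$os" in
        Linux)  os_part=Linux ;;
        Darwin) os_part=MacOSX ;;
        *) echo "unsupported OS: $os" >&2; return 1 ;;
    esac
    echo "${os_part}-${machine}"
}

# Print the installer name for the current workstation.
if arch=$(miniconda_arch "$(uname -s)" "$(uname -m)"); then
    echo "Miniconda3-latest-${arch}.sh"
fi
```

Check the printed name against the list of installers on the Miniconda download page before using it.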

We can activate our conda base environment and install configurable-http-proxy, JupyterHub and FirecRESTSpawner:

Install JupyterHub using FirecRESTSpawner

. /path/to/mc-jhub/bin/activate
conda install -y configurable-http-proxy
pip install --no-cache jupyterhub==4.1.6 pyfirecrest==3.0.1 SQLAlchemy==1.4.52 oauthenticator==16.3.1 python-hostlist==1.23.0
git clone https://github.com/eth-cscs/firecrestspawner.git
cd firecrestspawner
git checkout fcv2
pip install --no-cache .

Back to the deployment¶

Once the images have been built, you can check that all containers are running

Check containers are running

docker compose -p firecrest-v2 ps --format 'table {{.ID}}\t{{.Name}}\t{{.State}}'

That should show something like this

Example

CONTAINER ID   NAME                       STATE
fd8b1575bf18   firecrest-v2-firecrest-1   running
4a6e4c3d089d   firecrest-v2-keycloak-1    running
36f8d98f3b67   firecrest-v2-minio-1       running
85de3b4afe95   firecrest-v2-slurm-1       running
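To spot a container that failed to start without reading the table by eye, the output can be filtered. This is a small sketch: not_running is a hypothetical helper that expects the three-column table produced by the command above, and the sample table in the script is made up for illustration (the exited state is not real output).

```shell
#!/usr/bin/env bash
# Sketch: print the NAME of every container whose STATE column is not "running".
# Reads the three-column table (ID, NAME, STATE) on stdin and skips the header.
not_running() {
    awk 'NR > 1 && NF >= 3 && $3 != "running" { print $2 }'
}

# Illustrative input only: the "exited" state below is invented for the demo.
sample='CONTAINER ID   NAME                       STATE
fd8b1575bf18   firecrest-v2-firecrest-1   running
36f8d98f3b67   firecrest-v2-minio-1       exited'

printf '%s\n' "$sample" | not_running   # prints firecrest-v2-minio-1
```

With the real deployment, pipe the docker compose ps command shown above into not_running. An empty result means that all listed containers are running.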

When we are done with the tutorial, the deployment can be shut down by pressing Ctrl+C and then running:

Shut down docker containers

cd firecrest-v2
docker compose -f docker-compose.yml -f /path/to/tutorial/docker-compose-jhub.yaml down

Setting up the authorization¶

A requirement for running JupyterHub with FirecRESTSpawner is an authenticator that prompts users for their login and password in exchange for an OIDC/OAuth2 access token. That token is then passed to the spawner, allowing users to authenticate with FirecREST when submitting, stopping or polling jobs. For this purpose, we will use an Authorization Code Flow client, which for demonstration purposes is already created in our demo environment.

You can check the configuration of this client by visiting the Clients page in Keycloak (username: admin and password: admin2) within the kcrealm realm.

In that view, you'll find the client jhub-client listed on the "Clients" tab of the side panel.

Launching JupyterHub¶

The configuration file provided in this tutorial has all the settings needed for using JupyterHub with our deployment.

Depending on the platform and Docker setup, you may need to adjust a few lines in the configuration to set the correct host IP address for the Docker bridge network. On most Linux systems, you can find this address with ip addr show docker0. It's typically 172.17.0.1. If JupyterHub gets a timeout when launching a notebook, you can try replacing the two instances of host.docker.internal in the configuration with that IP address.
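The replacement described above can be scripted. The following is a sketch under assumptions: first_ipv4 and replace_docker_host are hypothetical helpers, the output file name jupyterhub-config-local.py is made up, and the script is meant to be run from the directory that contains jupyterhub-config.py. It writes a modified copy and leaves the original configuration untouched.

```shell
#!/usr/bin/env bash
# Sketch: point the JupyterHub configuration at the Docker bridge IP address.

# Extract the first IPv4 address from `ip -4 addr show <iface>` output on stdin.
first_ipv4() {
    awk '$1 == "inet" { split($2, a, "/"); print a[1]; exit }'
}

# Copy config file $1 to $3, replacing host.docker.internal with the address $2.
replace_docker_host() {
    sed "s/host\.docker\.internal/$2/g" "$1" > "$3"
}

# Only act when the `ip` tool, the docker0 interface and the config file exist.
if command -v ip >/dev/null 2>&1 && [ -f jupyterhub-config.py ]; then
    bridge_ip=$(ip -4 addr show docker0 2>/dev/null | first_ipv4)
    if [ -n "$bridge_ip" ]; then
        # jupyterhub-config-local.py is a hypothetical name for the modified copy.
        replace_docker_host jupyterhub-config.py "$bridge_ip" jupyterhub-config-local.py
        echo "wrote jupyterhub-config-local.py using $bridge_ip"
    fi
fi
```

If you use the modified copy, pass its name to the --config option when starting JupyterHub.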

Now we can run JupyterHub with

Run JupyterHub locally

cd $JHUB_DOCKERFILE_DIR
source /path/to/mc-jhub/bin/activate
source env.sh
jupyterhub --config jupyterhub-config.py --port 8003 --ip 127.0.0.1

Here we source the file env.sh, which defines environment variables needed by the spawner (more information can be found here). We use port 8003 for JupyterHub because the default port, 8000, is already used by FirecREST in the deployment. The IP address 127.0.0.1 is necessary to allow JupyterLab to connect back to JupyterHub.

JupyterHub should be accessible in the browser at http://localhost:8003 and it should be possible to launch notebooks on the Slurm cluster.
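Before opening the browser, you can confirm from the command line that the hub is answering. This sketch assumes that curl is installed and queries the root of the JupyterHub REST API, which reports the hub version. The helper name hub_version is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: query the root of the JupyterHub REST API, which returns version info.
#   $1: base URL of the hub (default http://localhost:8003, as in this tutorial)
hub_version() {
    local base_url=${1:-http://localhost:8003}
    # -f makes curl fail on HTTP errors, -L follows redirects.
    curl -sfL --max-time 5 "$base_url/hub/api/"
}

if info=$(hub_version); then
    echo "JupyterHub is up: $info"
else
    echo "JupyterHub is not reachable at http://localhost:8003 yet"
fi
```

This only shows that the hub process is serving requests. Launching a notebook still depends on the FirecREST deployment and on the spawner configuration.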

To access the interface use the following credentials:

Username: fireuser
Password: password