Demo: Notebook as a Service

Integration of JupyterHub (JHub) & Jupyter Enterprise Gateway (JEG)

LAB Objective:

Demonstrate the steps to realize Jupyter notebook as a service using Jupyter Enterprise Gateway (JEG) and JupyterHub (JHub), separating the Jupyter server (the backend of JupyterLab) from the (computation) kernels.

LAB Overview:

JupyterHub (JHub) spawns a per-user Jupyter server pod in the jupyterhub namespace. Each server is pointed at Jupyter Enterprise Gateway (JEG) in the enterprise-gateway namespace via JUPYTER_GATEWAY_URL, so kernels run as separate pods launched by JEG rather than inside the user's server pod. NFS-backed volumes give the user's server pod and its kernel pods a common home directory.

LAB Steps:

  1. Set up an NFS server
    Assume an NFS server (IP: 172.17.0.1) exports a share at the path /home/nfs_share (a minimal setup sketch is shown below).
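
    A minimal sketch of such an export on Ubuntu; the nfs-kernel-server package, the export options, and the /home/nfs_share/claim subdirectory (used by the PV in step 4) are assumptions to adapt to your environment:

    # On the NFS server host
    sudo apt-get install -y nfs-kernel-server
    sudo mkdir -p /home/nfs_share/claim      # subdirectory referenced by the PV in step 4
    echo "/home/nfs_share *(rw,sync,no_subtree_check,no_root_squash)" | sudo tee -a /etc/exports
    sudo exportfs -ra                        # re-export everything in /etc/exports
    showmount -e localhost                   # verify the share is exported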

  2. Set up a single-node MicroK8s cluster on an Ubuntu machine (a sketch is shown below)
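
    A minimal sketch, assuming installation via snap. The add-ons listed here, and the ssd-hostpath / nfs-client storage classes referenced by the values files in later steps, are assumptions that depend on how your cluster provides storage and LoadBalancer IPs:

    sudo snap install microk8s --classic
    microk8s status --wait-ready
    microk8s enable dns hostpath-storage
    microk8s enable metallb                  # prompts for an IP range; provides the LoadBalancer IP for the JEG service
    alias kubectl='microk8s kubectl'         # or install kubectl/helm separately
    alias helm='microk8s helm3'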

  3. Create namespaces

    kubectl create namespace enterprise-gateway
    kubectl create namespace jupyterhub
    
  4. Create PV & PVCs
    Use yaml file jhub_pvc.yaml to create a PersistentVolume nfs-pv backed by the NFS share and a PersistentVolumeClaim jhub-claim in namespace jupyterhub:

    Contents of jhub_pvc.yaml

    kind: PersistentVolume
    apiVersion: v1
    metadata:
      name: nfs-pv
      labels:
        app: mission-control
        type: nfs-volume
    spec:
      capacity:
        storage: 15Gi
      accessModes:
      - ReadWriteMany
      nfs:
        server: 172.17.0.1
        path: "/home/nfs_share/claim"
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: jhub-claim
      namespace: jupyterhub
      labels:
        app: mission-control
        type: nfs-volume
    spec:
      storageClassName: ""
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 15Gi
    

    Use yaml file kernelspecs_pvc.yaml to create PVC kernelspecs-pvc in namespace enterprise-gateway: a 20 MiB NFS share allocated from the nfs-client storage class to store kernelspecs. Apply commands for both files follow the YAML below.

    Contents of kernelspecs_pvc.yaml

    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: kernelspecs-pvc 
      namespace: enterprise-gateway
    spec:
      storageClassName: "nfs-client"
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 20Mi
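
    Apply both files and confirm the claims bind (file names as used above):

    kubectl apply -f jhub_pvc.yaml
    kubectl apply -f kernelspecs_pvc.yaml
    kubectl get pv nfs-pv
    kubectl get pvc jhub-claim -n jupyterhub
    kubectl get pvc kernelspecs-pvc -n enterprise-gateway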
    
  5. Deploy JEG to namespace enterprise-gateway

    git clone https://github.com/jupyter-server/enterprise_gateway
    mkdir eg
    helm template --output-dir ./eg enterprise-gateway enterprise_gateway/etc/kubernetes/helm/enterprise-gateway -n enterprise-gateway -f jeg_customized_values.yaml
    kubectl apply -f ./eg/enterprise-gateway/templates/
    

    Copy the kernelspecs, kernel-launcher scripts, and j2 templates to the NFS share that corresponds to PVC kernelspecs-pvc (a copy sketch follows the tree below). After copying, the NFS share's file/directory structure looks like:

    .
    └── python_kubernetes
        ├── kernel.json
        └── scripts
            ├── kernel-pod.yaml.j2
            └── launch_kubernetes.py
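
    A sketch of one way to perform the copy, assuming the nfs-client provisioner created a subdirectory for kernelspecs-pvc under the same export and that kernel.json, launch_kubernetes.py, and kernel-pod.yaml.j2 have been collected locally into ./python_kubernetes; the provisioned directory name is illustrative:

    # On the NFS server: locate the directory provisioned for kernelspecs-pvc
    ls /home/nfs_share                           # e.g. enterprise-gateway-kernelspecs-pvc-pvc-<uid>
    KSPEC_DIR=/home/nfs_share/<provisioned-dir>  # substitute the real directory name
    cp -r ./python_kubernetes "$KSPEC_DIR"/
    find "$KSPEC_DIR" -type f                    # should match the tree shown above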
    

    jeg_customized_values.yaml file contents:

    mirrorWorkingDirs: true
    service:
      type: "LoadBalancer"
      ports:
        # The primary port on which Enterprise Gateway is servicing requests.
        - name: "http"
          port: 8888
          targetPort: 8888
        # The port on which Enterprise Gateway will receive kernel connection info responses.
        - name: "http-response"
          port: 8877
          targetPort: 8877
    deployment:
      # Update CPU/Memory as needed
      resources:
        limits:
          cpu: 1
          memory: 5Gi
        requests:
          cpu: 1
          memory: 2Gi
      # Update to deploy multiple replicas of EG.
      replicas: 1
      # Give Enterprise Gateway some time to shut down gracefully
      terminationGracePeriodSeconds: 60
    kernelspecsPvc:
      enabled: true
      name: kernelspecs-pvc 
    kip:
      enabled: false 
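
    After applying the templates and copying the kernelspecs, verify that JEG is running and reachable (the external IP comes from your LoadBalancer setup, e.g. MetalLB on MicroK8s):

    kubectl get pods -n enterprise-gateway
    kubectl get svc enterprise-gateway -n enterprise-gateway    # note the EXTERNAL-IP and port 8888
    # List the kernelspecs JEG picked up from the kernelspecs-pvc share
    curl http://<EXTERNAL-IP>:8888/api/kernelspecs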
    
    
  6. Deploy JHub with Helm to namespace jupyterhub

    helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
    helm repo update
    helm install jhub jupyterhub/jupyterhub -f jhub_customized_values.yaml --version=2.0.0 -n jupyterhub 
    

    jhub_customized_values.yaml file contents:

    hub:
      db:
        pvc:
          accessModes:
            - ReadWriteOnce
          storage: 1Gi
          storageClassName: ssd-hostpath
    singleuser:
      storage:
        type: static
        static:
          pvcName: jhub-claim
          subPath: "{username}"
        capacity: 5Gi
      networkPolicy:
        enabled: false
      image:
        name: docker.io/jupyter/base-notebook
        tag: hub-3.0.0
      cmd: 
        - "sh"
        - "-c"
        - "env \"KERNEL_PATH=/home/nfs_share/claim/$JUPYTERHUB_USER\" jupyterhub-singleuser" 
    
      defaultUrl: "/lab"
      extraEnv:
        JUPYTER_GATEWAY_URL: "http://enterprise-gateway.enterprise-gateway:8888"
        JUPYTERHUB_SINGLEUSER_APP: "jupyter_server.serverapp.ServerApp"
    
    

    Explanation of jhub_customized_values.yaml customization:

    - singleuser.extraEnv.JUPYTER_GATEWAY_URL points each single-user Jupyter server at the JEG service (service enterprise-gateway in namespace enterprise-gateway), so kernels are launched by JEG instead of locally in the server pod.
    - JUPYTERHUB_SINGLEUSER_APP selects jupyter_server's ServerApp as the single-user application, which honors the gateway settings above.
    - singleuser.cmd wraps jupyterhub-singleuser with env so that KERNEL_PATH=/home/nfs_share/claim/$JUPYTERHUB_USER is set and passed on to JEG when kernels are launched (see step 7).
    - singleuser.storage mounts the static PVC jhub-claim with subPath "{username}", giving each user a per-user directory on the NFS share as the home directory.
    - hub.db.pvc keeps the hub database on a 1Gi ssd-hostpath volume.
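
    To verify the JHub deployment, check that the hub and proxy pods are running and open the proxy-public service in a browser (this sketch assumes the chart's default dummy authenticator, where any username works):

    kubectl get pods -n jupyterhub
    kubectl get svc proxy-public -n jupyterhub    # service exposing JupyterHub on port 80
    # Log in, start a server, and open a "Python on Kubernetes" notebook;
    # then look for the kernel pod launched by JEG:
    kubectl get pods -A | grep -i kernel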

  7. Customize the kernel.json file in the kernelspecs' NFS share
    Customize the python_kubernetes/kernel.json file in the kernelspecs' NFS share to include the environment variables KERNEL_VOLUME_MOUNTS and KERNEL_VOLUMES. These variables are read by the python_kubernetes/scripts/launch_kubernetes.py script to render the kernel pod yaml file, which includes a mount of an NFS share at the path specified by the environment variable KERNEL_PATH.
    (When JHub launches a server for a user that connects to JEG, KERNEL_PATH is one of the environment variables passed to JEG. Because KERNEL_VOLUMES in python_kubernetes/kernel.json references the variable KERNEL_PATH, the kernel pod yaml file prepared by JEG will include a mount entry for the NFS share at the path specified by KERNEL_PATH.)
    This gives a user's Jupyter server pod and kernel pod a common NFS share mapped to their home directories (/home/jovyan). An illustrative rendered fragment follows the kernel.json contents below.

    Contents of python_kubernetes/kernel.json

    {
      "language": "python",
      "display_name": "Python on Kubernetes",
      "metadata": {
        "process_proxy": {
          "class_name": "enterprise_gateway.services.processproxies.k8s.KubernetesProcessProxy",
          "config": {
            "image_name": "docker.io/elyra/kernel-py:dev"
          }
        },
        "debugger": true
      },
      "env": {
        "KERNEL_VOLUME_MOUNTS": "[{name: 'nfs-mount',mountPath: '/home/jovyan'}]",
        "KERNEL_VOLUMES": "[{name: 'nfs-mount',nfs: {server: '172.17.0.1', path: $KERNEL_PATH}}]"
      },
      "argv": [
        "python",
        "/usr/local/share/jupyter/kernels/python_kubernetes/scripts/launch_kubernetes.py",
        "--RemoteProcessProxy.kernel-id",
        "{kernel_id}",
        "--RemoteProcessProxy.port-range",
        "{port_range}",
        "--RemoteProcessProxy.response-address",
        "{response_address}",
        "--RemoteProcessProxy.public-key",
        "{public_key}"
      ]
    }
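
    For illustration, with KERNEL_PATH=/home/nfs_share/claim/alice (the user name is hypothetical), the volume section that launch_kubernetes.py renders into the kernel pod from kernel-pod.yaml.j2 would look roughly like:

    # Illustrative fragment of the rendered kernel pod spec (not a file used in this lab)
    spec:
      containers:
        - name: kernel-container            # container name is illustrative
          image: docker.io/elyra/kernel-py:dev
          volumeMounts:
            - name: nfs-mount
              mountPath: /home/jovyan
      volumes:
        - name: nfs-mount
          nfs:
            server: 172.17.0.1
            path: /home/nfs_share/claim/alice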
    

Screenshots: PV, PVC, and Services.
