
Configure SSH Server

Introduction

With the SSH bastion server, users can connect to their Jupyter notebooks directly over SSH.

By setting up an SSH bastion server and exposing its TCP service, users can SSH into their Jupyter notebooks after toggling the Enable SSH Server option on the spawner page.

The bastion server also fetches users' public keys from the Jupyter pods and caches them to speed up SSH authorization.

The bastion server pod enforces strict network policies and can only reach Jupyter notebook pods that have SSH Server enabled.

This is a practical feature with many uses, for example:

  • Port-forward services to your machine

  • Connect to your Jupyter notebook workspace in your favorite editor
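For instance, a single SSH invocation can do both at once. The login format and domain below are placeholders, not part of this guide — check how your deployment names notebook users:

```shell
# Open a shell in the notebook workspace through the bastion (TCP 2222 by
# default), while forwarding the notebook's port 5000 to your machine.
# <primehub-domain> and the jupyter-<username> login format are assumptions —
# adjust them to your deployment.
ssh -i ~/.ssh/id_rsa -p 2222 -L 5000:localhost:5000 jupyter-<username>@<primehub-domain>
```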

Installation

Enable SSH Bastion Server Feature

This feature is enabled by default in the PrimeHub CE version; if you're using the EE version, you'll need to enable it manually.

In your helm_override/primehub.yaml, simply add the following section and run helmfile apply.

sshBastionServer:
  enabled: true

You may need to restart the hub pod manually to reload the config.
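Putting the two steps together looks like this; the deployment and namespace names are typical defaults, not guaranteed by this guide:

```shell
# Apply the override
helmfile apply

# Restart the hub pod so it reloads the config
# (deployment/namespace names are assumptions — verify with `kubectl get deploy -n hub`)
kubectl rollout restart deployment hub -n hub
```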

Configuration

The SSH bastion server accesses the Kubernetes API to get pod info. The default API port is 6443.

If your Kubernetes API listens on another port, you need to specify it in helm_override/primehub.yaml. For instance, microk8s uses 16443 as its default API port. The configuration will look like this:

sshBastionServer:
  enabled: true
  netpol:
    kubeApiPort: 16443

To obtain the Kubernetes API info, run the following command.

kubectl get services kubernetes -o custom-columns=NAME:.metadata.name,IP:.spec.clusterIP,PORT:.spec.ports[0].targetPort

Allow SSH connection

You'll need to allow external SSH connections to your ingress/load balancer.

Don't forget to allow the TCP port (2222 by default) on your firewall or security group.

Here are some setup examples for different environments:

On-Premises (NGINX Ingress)

For the NGINX Ingress Controller, you'll have to edit the tcp-services ConfigMap to expose TCP port 2222 and route it to the SSH bastion server. For more detail about exposing TCP services with the NGINX Ingress Controller, see its official documentation.

$ kubectl edit -n ingress-nginx tcp-services
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
data:
  "2222": hub/ssh-bastion-server:2222

Google Kubernetes Engine

If you're using the GCE Ingress Controller, note that it doesn't support TCP proxying. You need to specify the annotation kubernetes.io/ingress.class: "nginx" in all ingresses to target the NGINX ingress controller instead:

metadata:
  name: foo
  annotations:
    kubernetes.io/ingress.class: "nginx"

After that, you'll have to edit the firewall to expose TCP port 2222: go to the Firewall page in the Google Cloud Platform console, then create the firewall rule.

FAQ

What's the logic of the SSH key cache mechanism?

  • More than 15 minutes since the last cache update → the cache is invalid; the whole cache is fully rebuilt upon the next SSH connection.

  • More than 2 minutes since the last cache update → if the incoming SSH key is not in the cache, the bastion fetches the public-key APIs of the pods that are not yet cached.

  • Cache updated within the last 2 minutes → the incoming SSH key is checked against the local cache only.
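The tiers above can be sketched as a small decision function — a simplified illustration of the described behavior, not the actual bastion-server code:

```python
import time

FULL_REBUILD_AFTER = 15 * 60   # seconds: cache older than this is invalid
PARTIAL_FETCH_AFTER = 2 * 60   # seconds: cache older than this allows per-pod fetches

def cache_action(last_update, now=None):
    """Decide how an incoming SSH key is checked against the key cache."""
    now = time.time() if now is None else now
    age = now - last_update
    if age >= FULL_REBUILD_AFTER:
        return "full-rebuild"   # invalidate and rebuild the whole cache
    if age >= PARTIAL_FETCH_AFTER:
        return "partial-fetch"  # fetch keys only for pods missing from the cache
    return "cache-only"         # trust the local cache alone
```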

How do I refresh the SSH key cache manually?

If you can reach your Kubernetes cluster, you can run the following commands to refresh the SSH key cache manually.

$ POD_NAME=$(kubectl get pod -n hub --selector=ssh-bastion-server/bastion=true -o jsonpath='{.items[*].metadata.name}')
$ kubectl exec -it -n hub $POD_NAME -- bash
root@ssh-bastion-server:/# cd /etc/ssh
root@ssh-bastion-server:/# python update_authorized_keys.py full
root@ssh-bastion-server:/# exit