
PrimeHub Store


Last updated 2 years ago

PrimeHub store is the central storage of PrimeHub, and its backend is object storage. We use MinIO as the object storage solution. MinIO can run as a standalone object store server or as a gateway to popular cloud object storage services (e.g. AWS S3, Google Cloud Storage, Azure Blob Storage).

Unlike user volumes, group volumes, and data volumes, which are designed for storing users' data, PrimeHub store is designed for storing both system and user data.

Features

  1. Central Storage for PrimeHub

  2. Supports S3-compatible REST API

  3. Allows pods to mount PrimeHub store through PVC

Non-Goal

  1. Does not provide a new volume type to connect to object storage.

Use cases

  1. Log Persistence: Currently, job submission logs are retrieved directly from the pod. With the PrimeHub store, we can collect the logs and store them in the PrimeHub store, so they remain accessible even after the pod is deleted.

  2. PrimeHub File System (PHFS): new shared storage for groups.

Design

MinIO

  1. MinIO Standalone

  2. MinIO Gateway for S3

  3. MinIO Gateway for GCS

MinIO hides the implementation differences between persistence backends and provides a consistent way for other PrimeHub components to access the PrimeHub store.

GraphQL and PrimeHub services

GraphQL and other PrimeHub services use MinIO's S3-compatible REST API to access the PrimeHub store.
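As a rough illustration of this access path, the sketch below shows how a service might point a generic S3 client at a MinIO endpoint and list objects under a prefix. The endpoint URL, credentials, bucket name, and all helper names are hypothetical and not taken from PrimeHub. boto3 is used only as one example of an S3-compatible client; it is a third-party package and is imported lazily.

```python
from typing import Any, Dict, List


def s3_client_kwargs(endpoint_url: str, access_key: str, secret_key: str) -> Dict[str, Any]:
    """Collect the keyword arguments for an S3 client pointed at a MinIO endpoint."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,  # assumption: address of the MinIO service
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_store_client(endpoint_url: str, access_key: str, secret_key: str):
    """Create a boto3 S3 client for the store (requires boto3 to be installed)."""
    import boto3  # imported here so the helpers above work without boto3

    return boto3.client(**s3_client_kwargs(endpoint_url, access_key, secret_key))


def list_keys(client, bucket: str, prefix: str) -> List[str]:
    """Return the object keys under a prefix (first page of results only)."""
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [item["Key"] for item in response.get("Contents", [])]
```

With a real client, a call such as `list_keys(client, "<bucket>", "logs/")` would return the keys stored under the `logs` folder. The bucket name depends on the installation and is left as a placeholder here.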

csi-rclone

When PrimeHub is installed, a PVC provisioned by csi-rclone is also created. Usually, the PVC name is primehub-store.
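To make the PVC-based access concrete, here is a minimal sketch that builds a Kubernetes Pod manifest mounting that PVC. Only the claim name `primehub-store` comes from the text above; the pod name, image, mount path, and the function name are assumptions chosen for illustration.

```python
import json
from typing import Any, Dict


def pod_with_store(pod_name: str, image: str, mount_path: str,
                   claim_name: str = "primehub-store") -> Dict[str, Any]:
    """Build a Pod manifest that mounts the store PVC at mount_path."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [{
                "name": "main",
                "image": image,
                # The container sees the store as an ordinary directory.
                "volumeMounts": [{"name": "store", "mountPath": mount_path}],
            }],
            "volumes": [{
                "name": "store",
                # Refers to the csi-rclone provisioned PVC by its claim name.
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical values; the output can be saved and applied with kubectl.
    print(json.dumps(pod_with_store("store-demo", "busybox", "/store"), indent=2))
```

The PVC must live in the same namespace as the pod, so a manifest like this would only work where the `primehub-store` claim actually exists.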

Folder Structure in the bucket

The folder structure of the PrimeHub store is defined as follows.

Each top-level folder corresponds to a feature built on the PrimeHub store.

.
├── <feature1>
├── <feature2>
├── logs        # Log Persistence
└── groups      # PHFS
    ├── phusers
    ├── <group1>
    └── <group2>
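The layout above can be mirrored by small helpers that build object keys. This is only a sketch: the function names are hypothetical, and nothing is assumed about the structure below `logs` or below each group folder beyond what the tree shows.

```python
import posixpath

LOGS_PREFIX = "logs/"      # top-level folder used by Log Persistence
GROUPS_PREFIX = "groups/"  # top-level folder used by PHFS


def group_prefix(group: str) -> str:
    """Return the key prefix for one group's folder, e.g. 'groups/phusers/'."""
    return posixpath.join("groups", group) + "/"


def group_key(group: str, relative_path: str) -> str:
    """Return the object key for a file inside a group's folder.

    Rejects empty paths and paths that would escape the group folder.
    """
    clean = posixpath.normpath(relative_path).lstrip("/")
    if clean in ("", ".") or clean == ".." or clean.startswith("../"):
        raise ValueError(f"invalid path inside group folder: {relative_path!r}")
    return posixpath.join("groups", group, clean)
```

For example, `group_key("phusers", "data/train.csv")` yields `groups/phusers/data/train.csv`, which keeps every group's objects under its own prefix.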

MinIO is the key component of the PrimeHub store. We support the three persistence backends listed in the MinIO section above.

csi-rclone implements a Container Storage Interface (CSI) plugin that allows using rclone mount as a storage backend. We use it to mount the PrimeHub store in user pods, so users can access PrimeHub store data through the file system.
