Configure PrimeHub Store


Last updated 2 years ago

PrimeHub Store is the central storage for PrimeHub files. Many features rely on the PrimeHub store to persist, transfer, and load data.

PrimeHub store uses MinIO as the backend and keeps its data in a single bucket. To enable the PrimeHub store, set store.enabled to true.

| Path | Description | Default Value |
| --- | --- | --- |
| store.enabled | Whether the PrimeHub store is enabled | false |
| store.accessKey | The access key for the PrimeHub store | AKIAIOSFODNN7EXAMPLE |
| store.secretKey | The secret key for the PrimeHub store | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
| store.bucket | The bucket name the PrimeHub store uses | primehub |
| minio.* | The MinIO configuration | Please see the chart configuration |
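
As a minimal sketch of how the paths above fit together in a values file, the fragment below enables the store and sets its credentials and bucket. The bracketed key values are placeholders introduced for illustration; only the paths themselves come from the table.

```yaml
# Sketch only: enable the PrimeHub store with custom credentials.
# Replace the bracketed placeholders with your own values.
store:
  enabled: true
  accessKey: "[put-your-access-key-here]"
  secretKey: "[put-your-secret-key-here]"
  bucket: "primehub"
```

Overriding accessKey and secretKey avoids running with the example default keys.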

Configure MinIO

MinIO is installed when the PrimeHub store is enabled. By default, MinIO data is stored in a PVC. The following storage options are available:

  • Standalone mode: Store data in a Kubernetes PVC

  • AWS S3 gateway: Store data in AWS S3 and use MinIO as the gateway

  • Google Cloud Storage gateway: Store data in Google Cloud Storage and use MinIO as the gateway

Standalone mode

If minio.persistence.enabled is true, MinIO operates in standalone mode and one PVC is created. Here is an example for standalone mode:

store:
  bucket: "primehub"

minio:
  persistence:
    enabled: true
    storageClass: "gp2"
    accessMode: ReadWriteOnce
    size: 1024Gi

When MinIO is installed, the bucket is created automatically.

AWS S3 Gateway

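Following the MinIO S3 Gateway documentation, prepare the AWS S3 bucket before installation:

  • Choose an existing bucket or create a bucket from the Amazon S3 console

  • Create an IAM user and get its accessKey and secretKey

  • Attach AWS S3 permission policies to the user

Here is an example for AWS S3: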

store:
  bucket: "the-bucket-your-created"

minio:
  s3gateway:
    enabled: true
    accessKey: "[put-your-access-key-id-here]"
    secretKey: "[put-your-secret-access-key-here]"

MinIO can also use the S3 gateway to connect to Ceph RGW. Here is an example for connecting to Ceph RGW deployed by Rook:

store:
  bucket: "primehub"

minio:
  s3gateway:
    enabled: true
    serviceEndpoint: "http://rook-ceph-rgw-object-store.rook"
    accessKey: "[put-your-access-key-id-here]"
    secretKey: "[put-your-secret-access-key-here]"

Google Cloud Storage Gateway

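Following the MinIO GCS Gateway documentation, prepare the GCS bucket before installation:

  • Choose an existing bucket or create a bucket from the Google Cloud Storage console

  • Create a service account

  • Generate a JSON key file

Here is an example for GCS: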

store:
  bucket: "the-bucket-your-created"

minio:
  gcsgateway:
    enabled: true
    projectId: "[your-project-id]"
    gcsKeyJson: "[the-content-of-your-json-key-file]"

Access the MinIO UI

You can expose the MinIO web UI at http://${PRIMEHUB_DOMAIN}/minio by enabling the ingress:

minio:
  ingress:
    enabled: true
    maxBodySize: "8192m"

Enabling the ingress exposes the MinIO object browser at the /minio path. If you upload a large file and see the message 413 Request Entity Too Large, increase the value of maxBodySize.

However, the ingress only gives access to the MinIO UI. To operate on objects with an AWS S3 compatible library from outside the Kubernetes cluster, use port-forward:

kubectl -n hub port-forward service/primehub-minio 9000

Configure PHFS

PHFS (PrimeHub File System) is the group sharing space built on the PrimeHub store. Group data is stored under mybucket/groups/<group>. It is also a fundamental building block for group resources.

By default, if the PrimeHub store is enabled, PHFS is enabled as well. To disable PHFS manually, set store.phfs.enabled to false.

| Path | Description | Default Value |
| --- | --- | --- |
| store.phfs.enabled | Whether PHFS is enabled | true |
| rclone.* | The rclone configuration | Please see the chart configuration |

Configuration:

store:
  enabled: true
  phfs:
    enabled: true
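
For the opposite case, the fragment below is a minimal sketch that keeps the PrimeHub store enabled while turning PHFS off. It uses only the store.enabled and store.phfs.enabled paths described in this section.

```yaml
# Sketch only: keep the store, disable the PHFS group sharing space.
store:
  enabled: true
  phfs:
    enabled: false
```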

The following components are installed if PHFS is enabled:

  1. csi-rclone: A CSI implementation for mounting S3-compatible object storage.

  2. primehub store PVC: csi-rclone-provisioned PVC for PrimeHub store. We use it for mounting the MinIO bucket on the user's pod.

Note for MicroK8s

Because the default kubelet path for MicroK8s is not /var/lib/kubelet, configure the kubelet path as follows:

rclone:
  kubeletPath: '/var/snap/microk8s/common/var/lib/kubelet'

Configure Log Persistence

Log persistence stores logs in the PrimeHub store under mybucket/logs. Currently, only job logs can be persisted.

By default, if the PrimeHub store is enabled, log persistence is enabled as well. To disable log persistence manually, set store.logPersistence.enabled to false.

| Path | Description | Default Value |
| --- | --- | --- |
| store.logPersistence.enabled | Whether log persistence is enabled | true |
| fluentd.* | The fluentd configuration | Please see the chart configuration |
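
As an illustration of the store.logPersistence.enabled path, the fragment below is a minimal sketch that keeps the PrimeHub store enabled but turns log persistence off.

```yaml
# Sketch only: keep the store, disable log persistence.
store:
  enabled: true
  logPersistence:
    enabled: false
```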

The following components are installed if log persistence is enabled:

  1. fluentd: The collector that gathers container logs and uploads them to the PrimeHub store.
