Configure Job Submission

Installation

For PrimeHub EE, job submission is enabled by default. The following sections describe the advanced configuration options for job submission.

Job Settings

| Path | Description | Default Value |
| --- | --- | --- |
| jobSubmission.workingDirSize | The size of the ephemeral storage for the job working directory. The unit format is defined in the Kubernetes documentation. | 5Gi |
| jobSubmission.defaultActiveDeadlineSeconds | Default timeout (in seconds) for a running job. | 86400 |
| jobSubmission.defaultTTLSecondsAfterFinished | Default TTL (in seconds) after which the pod of a finished job is deleted. | 604800 |
| jobSubmission.nodeSelector | The default node selector for the underlying pod. | {} |
| jobSubmission.affinity | The default affinity setting for the underlying pod. | {} |
| jobSubmission.tolerations | The default tolerations setting for the underlying pod. | [] |
| jobSubmission.jobTTLSeconds | How long a job is kept in PrimeHub after it finishes (succeeded, failed, or cancelled). The default is 30 days. Zero means unlimited. | 2592000 |
| jobSubmission.jobLimit | The maximum total number of jobs; the oldest job is removed when the limit is exceeded. Zero means unlimited. | 4000 |

Example:

jobSubmission:
  workingDirSize: '5Gi'
  defaultActiveDeadlineSeconds: 86400
  defaultTTLSecondsAfterFinished: 604800
  nodeSelector: {}
  affinity: {}
  tolerations: []
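
These values map onto standard fields of the Kubernetes Job that PrimeHub creates for each submitted job. The sketch below is illustrative only: the Kubernetes field names are real, but the actual manifest generated by PrimeHub contains more than is shown here, and the exact way workingDirSize is enforced may differ.

# Illustrative sketch: where the jobSubmission defaults roughly land on the
# Kubernetes Job created for a submitted job (not the exact manifest).
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job                  # hypothetical job name
spec:
  activeDeadlineSeconds: 86400       # jobSubmission.defaultActiveDeadlineSeconds
  ttlSecondsAfterFinished: 604800    # jobSubmission.defaultTTLSecondsAfterFinished
  template:
    spec:
      restartPolicy: Never
      nodeSelector: {}               # jobSubmission.nodeSelector
      affinity: {}                   # jobSubmission.affinity
      tolerations: []                # jobSubmission.tolerations
      containers:
        - name: main
          image: example/job-image   # hypothetical image
          volumeMounts:
            - name: workingdir
              mountPath: /workingdir # hypothetical mount path
      volumes:
        - name: workingdir
          emptyDir:
            sizeLimit: 5Gi           # jobSubmission.workingDirSize (ephemeral storage)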

Job Artifacts

| Path | Description | Default Value |
| --- | --- | --- |
| jobSubmission.artifact.enabled | Whether the job artifact feature is enabled. | true |
| jobSubmission.artifact.limitSizeMb | The total size (in MB) of artifacts a job can upload. | 100 |
| jobSubmission.artifact.limitFiles | The total number of files a job can upload. | 1000 |
| jobSubmission.artifact.retentionSeconds | How long (in seconds) the artifacts are preserved. | 604800 |

Example:

# The job artifact feature requires PrimeHub Store and PHFS
store:
  enabled: true
  phfs:
    enabled: true
jobSubmission:
  artifact:
    enabled: true
    limitSizeMb: 100
    limitFiles: 1000
    retentionSeconds: 604800
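
The same keys can be overridden to change the artifact policy. The snippet below is a hypothetical example that raises the upload limits and keeps artifacts for 30 days; the numbers are illustrative, not recommendations.

jobSubmission:
  artifact:
    enabled: true
    limitSizeMb: 500           # example: up to 500 MB of artifacts per job
    limitFiles: 2000           # example: up to 2000 files per job
    retentionSeconds: 2592000  # example: keep artifacts for 30 days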

Log Persistence

By default, the job submission log is persisted for 7 days (configured by jobSubmission.defaultTTLSecondsAfterFinished); the log is removed once the underlying pod is deleted. The log persistence feature allows logs to be uploaded to the PrimeHub store so that they are kept beyond the pod's lifetime.

| Path | Description | Default Value |
| --- | --- | --- |
| store.enabled | Whether the PrimeHub store is enabled. | false |
| store.logPersistence.enabled | Whether log persistence is enabled. | true |
| fluentd.flushAtShutdown | Flush the buffer when fluentd shuts down. See the flush_at_shutdown setting in the fluentd buffer documentation. | false |
| fluentd.flushInterval | The flush interval. See the flush_interval setting in the fluentd buffer documentation. | 3600s |
| fluentd.chunkLimitSize | The maximum size of each chunk. See the chunk_limit_size setting in the fluentd buffer documentation. | "256m" |
| fluentd.storeAs | The log format stored in the store; txt and gzip are supported. See the store_as setting in the fluentd s3 plugin documentation. | txt |
| fluentd.* | The other fluentd settings. See the chart configuration. | |

Example:
store:
  enabled: true
  logPersistence:
    enabled: true
fluentd:
  # Buffer configuration: https://docs.fluentd.org/configuration/buffer-section
  flushAtShutdown: false
  flushInterval: "3600s"
  chunkLimitSize: "256m"
  # S3 Configuration: https://docs.fluentd.org/output/s3
  storeAs: "txt"
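
As an illustrative variant using only the settings documented above, logs can be stored compressed and flushed more often; gzip is the other supported store_as format, and the interval and chunk size here are example values.

store:
  enabled: true
  logPersistence:
    enabled: true
fluentd:
  flushAtShutdown: true     # flush any remaining buffered logs on shutdown
  flushInterval: "900s"     # example: flush every 15 minutes instead of every hour
  chunkLimitSize: "64m"     # example: smaller chunks are uploaded more frequently
  storeAs: "gzip"           # store logs compressed; txt and gzip are supported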