1 - MLOps Introduction and Scoping the Project

What is MLOps?

Machine Learning Operations, also called MLOps, is the practice of training and deploying models into a production environment. Before we use the MLOps method to build a production model service, we need to define the steps that take our machine learning model into production.

MLOps Pipeline Introduction

In the MLOps pipeline, we implement the following stages:

1. Scoping

Plan and check that the project or product scope is suitable for a machine learning model. You can use the following form to evaluate your project.

An example project scope table is shown in the showcase section below.

2. Data Engineering

Build the data processing pipeline, which we can call the DataOps pipeline. It includes the following steps (a minimal sketch follows the list):

  • E: Data Extraction - Collect the raw data.

  • T: Transformation - Transform the data into a dataset and label it.

  • L: Loading - Load the dataset into a database or file system.

  • P: Profiling - Check the data profiling content. We can use a dashboard to show data insights and keep refining the data ETL pipeline.

  • A: Assertion - Check the data quality and assertion status. We must ensure the dataset's quality before using it.

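Below is a minimal sketch of this DataOps flow in Python. It assumes the raw screw images live under a hypothetical raw_data/ folder whose subfolder names are the labels; the folder layout and file names are illustrative assumptions, not part of PrimeHub itself.

```python
from pathlib import Path
import csv

RAW_DIR = Path("raw_data")            # hypothetical folder of raw screw images
MANIFEST = Path("dataset/manifest.csv")

# E: extract - collect the raw image files
images = sorted(RAW_DIR.rglob("*.png"))

# T: transform - derive a label for each image from its parent folder name
records = [{"path": str(p), "label": p.parent.name} for p in images]

# L: load - write the labeled dataset manifest to the file system
MANIFEST.parent.mkdir(parents=True, exist_ok=True)
with MANIFEST.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["path", "label"])
    writer.writeheader()
    writer.writerows(records)

# P: profile - summarize the dataset so we can inspect it on a dashboard
counts = {}
for r in records:
    counts[r["label"]] = counts.get(r["label"], 0) + 1
print(f"{len(records)} images across {len(counts)} classes: {counts}")

# A: assert - fail fast if the dataset does not meet basic quality checks
assert records, "dataset is empty"
assert all(r["label"] for r in records), "every image needs a label"
```
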
3. Model Engineering

Here are the steps to build up a modeling pipeline; a minimal sketch follows the list.

  1. Model Building: Initialize the model pipeline and callback methods.

  2. Model Training: Train the machine learning or deep learning model on the data. At the same time, we need to define the measurements and track the model's performance, for example, with a confusion matrix or accuracy metrics.

  3. Model Evaluation: Check whether the model is ready for deployment. For example, if the accuracy is greater than 0.95, we can promote the new model.

  4. Model Registry: Register the model and put it into the queue for production.

  5. Model Deployment: Package the model as a Docker image and deploy it to the cloud or to edge devices as an endpoint. The packaged service can be an API server or a serverless function.

  6. Model Monitoring: Monitor the model service, which covers two parts:

    1. Monitor the infrastructure where we deploy: e.g., load, usage, storage, and health.

    2. Monitor the model's performance: Track model and data performance to decide whether retraining is needed.

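Below is a minimal sketch of steps 1-4 with TensorFlow/Keras (the framework used in this showcase). The dataset directory, image size, two-class layout, and the 0.95 accuracy gate are illustrative assumptions, and "registration" here is simply saving the model to disk; PrimeHub's model management and deployment are covered in the later parts of this tutorial.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)        # illustrative input size
DATA_DIR = "dataset/train"   # hypothetical folder: one subfolder per class

# 1. Model building: initialize the model pipeline and callbacks
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu",
                           input_shape=IMG_SIZE + (3,)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation="softmax"),  # e.g. good / defect
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
callbacks = [tf.keras.callbacks.EarlyStopping(patience=3)]

# 2. Model training: track accuracy and loss while training
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    DATA_DIR, validation_split=0.2, subset="training", seed=42,
    image_size=IMG_SIZE, batch_size=32)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    DATA_DIR, validation_split=0.2, subset="validation", seed=42,
    image_size=IMG_SIZE, batch_size=32)
train_ds = train_ds.map(lambda x, y: (x / 255.0, y))  # normalize pixels
val_ds = val_ds.map(lambda x, y: (x / 255.0, y))
model.fit(train_ds, validation_data=val_ds, epochs=10, callbacks=callbacks)

# 3. Model evaluation: gate promotion on the agreed metric
loss, accuracy = model.evaluate(val_ds)

# 4. Model registry: keep the candidate only if it passes the gate
if accuracy > 0.95:
    model.save("models/screw-classifier")  # hand off to registry/deployment
```
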
The scope of the showcase: Screw defect detection

In this demo showcase, we use the following project table to define the project scope and check the key information:

Variable              | Content
----------------------|-----------------------------------------------------------------------
Project Name          | Screw defect detection
Dataset               | Screw dataset
Dataset Resource Link | https://www.kaggle.com/datasets/ruruamour/screw-dataset
Model                 | Image Classification Model
Code language         | Python 3.7 + TensorFlow 2.5
Project Goal          | Use the computer vision recognition method to analyze the screw images.
Measurement           | Accuracy, Loss
Delivery              | 1. Metrics information 2. Deployment service 3. Application service UI

Next Section

In the next part, we will use the PrimeHub platform to train and manage the model.

InfuseAI provides PipeRider, an open-source data assertion and profiling toolkit that helps data professionals check data quality. Visit here for the details.
