
2 - Train and Manage the Model


Last updated 2 years ago

Introduction

In this tutorial, we will do the following.

  1. Train our model using the labeled data.

  2. Use MLflow to manage the model parameters, metrics, and artifact files.

What is MLflow?

MLflow is an open-source platform for managing the machine learning lifecycle, including training parameters, metrics, and artifacts. Data scientists can review this information on the platform to understand how a model is performing and how to improve it. Find out more on the MLflow website and in the MLflow documentation.

PrimeHub integrates MLflow as its model management function, so you can view run results directly from the Models page. See the PrimeHub model documentation for details.

Step 1: Create the MLflow Server

To track our experiments, we must first install MLflow, which is available as part of PrimeHub Apps. Follow the Create an MLflow server guide to install it.

Step 2: Train the model

1. Create a Jupyter Notebook

PrimeHub UI → User portal → Notebook → Choose the instance type and Jupyter image → Start the Notebook

Variable        Value
Instance Type   CPU 2
Image           TensorFlow 2.5

When the notebook has started, you will see the My Server Information page:

2. Download the tutorial project code

Jupyter Notebook → Create terminal → Run the following commands

$ cd <group-name>
$ git clone https://github.com/InfuseAI/primehub-screw-detection.git
$ cd primehub-screw-detection

Open the Notebook and modify the following variables:

# Kaggle connection
kaggle_username = "<kaggle-username>"
kaggle_key = "<kaggle-key>"

# Label data file
label_data_file_path = "project-6-at-2022-09-19-04-17-b9f72b54.json"
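If you prefer not to hard-code credentials in the Notebook, the official Kaggle tooling can also read them from environment variables. A minimal sketch (the placeholder values are, as in the snippet above, stand-ins for your own credentials):

```python
import os

# Hypothetical placeholder values; substitute your own Kaggle credentials.
kaggle_username = "<kaggle-username>"
kaggle_key = "<kaggle-key>"

# The Kaggle CLI/API falls back to these environment variables
# when no ~/.kaggle/kaggle.json file is present.
os.environ["KAGGLE_USERNAME"] = kaggle_username
os.environ["KAGGLE_KEY"] = kaggle_key
```

This keeps the secret key out of the saved Notebook output.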

3. Run the Notebook and train the model

Step 3: Use MLflow to manage the model

The training code includes the following lines. With them, MLflow records the parameters, metrics, and artifact files of each run, so we can manage the model through the MLflow platform.

import mlflow
mlflow.set_experiment("tutorial_screw_train")
mlflow.tensorflow.autolog()

We can then go back to the PrimeHub user portal, open the MLflow UI, and see the run results on the MLflow server.

PrimeHub User Portal → Models → MLflow UI → (in MLflow) Experiments → tutorial_screw_train

Step 4: Change the parameter for tuning the model

Once we can successfully train the machine learning model, we run further experiments with different parameters to tune it for the best performance.

  1. Change the base_learning_rate variable value and run the Notebook again:

    base_learning_rate = 0.01 → base_learning_rate = 0.05

  2. You will then see the second experiment result in the MLflow platform.
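The comparison this sets up amounts to picking the run whose validation metric is best, given the params and metrics MLflow recorded for each run. A stdlib-only sketch with illustrative numbers (the metric name and values are assumptions, not results from the actual experiments):

```python
# Illustrative run records, shaped like the params/metrics MLflow tracks.
runs = [
    {"params": {"base_learning_rate": 0.01}, "metrics": {"val_accuracy": 0.91}},
    {"params": {"base_learning_rate": 0.05}, "metrics": {"val_accuracy": 0.94}},
]

# Pick the run with the highest validation accuracy.
best = max(runs, key=lambda r: r["metrics"]["val_accuracy"])
print(best["params"]["base_learning_rate"])
```

The MLflow UI performs this comparison visually when you select multiple runs and click Compare.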

Conclusion

In this tutorial we have done the following.

  1. Installed MLflow from PrimeHub Apps

  2. Connected MLflow to PrimeHub in the PrimeHub settings

  3. Trained our model in a PrimeHub Notebook

  4. Checked the model management content in the MLflow platform

  5. Used different parameters to tune the model

Next Section

In the next tutorial, we will analyze the two sets of results, manage the trained models, and deploy the best model to the cloud.

More information about obtaining your Kaggle username and key can be found in the Kaggle API documentation.

Note: If you want to run the training Notebook as a background job, PrimeHub also supports submitting a Jupyter Notebook as a job. Please see the advanced tutorial for details.
