
Labeling the data


Last updated 2 years ago

Introduction

The model built in this tutorial classifies a photograph of a screw as a good or bad example. In this section, you will label the photographs of screws that will be used to train that model.

Data labeling is a critical part of model development for machine learning. By using a well-defined dataset, data scientists can train effective models.

In this tutorial, we will:

  1. Use Label Studio, an open-source data labeling tool, to label the data used to train a model. Label Studio is available as part of PrimeHub Apps, a convenient way to integrate third-party apps into your ML workflow.

  2. Configure the Label Studio project settings and label the images.

  3. Export the labeling output as a JSON file.

What is Label Studio?

Label Studio is an open-source, web-based data labeling platform. You can use it to label images, video, text, and audio for machine learning, and it makes annotating data files straightforward. For more detail, see the Label Studio website and the Label Studio documentation.

Label Studio is easy to start on the PrimeHub platform. Follow the documentation to learn how to create the Label Studio app in PrimeHub Apps.

Prerequisites

  1. Create a group and enable a shared volume for storage.

  2. Download the images ZIP file and unzip it.
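If you prefer to unzip the downloaded archive from a script rather than by hand, the following is a minimal sketch using only the Python standard library. The archive name, folder layout, and file names are hypothetical; the demo builds a throwaway archive so the snippet runs on its own, and you would pass the path of the real ZIP file instead.

```python
import tempfile
import zipfile
from pathlib import Path

# File extensions treated as images (an assumption; adjust to the real dataset).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def extract_images(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted image paths found."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(
        p for p in dest.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


# Demo with a throwaway archive; the names below are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "screws.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("screws/good_001.png", b"placeholder")
        archive.writestr("screws/bad_001.png", b"placeholder")
        archive.writestr("screws/README.txt", b"not an image")

    images = extract_images(demo_zip, Path(tmp) / "unzipped")
    image_names = [p.name for p in images]
    print(image_names)
```

Listing the extracted images before uploading is a quick way to confirm that the archive unpacked completely.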

End-to-end Steps

Step 1: Create a Label Studio PrimeHub App.

PrimeHub user portal → Apps → + Application → Label Studio → Fill in the information:

| Variable | Value |
| --- | --- |
| Name | label studio - screw |
| InstanceTypes | CPU 1 |

Step 2: Log into Label Studio.

Tip: The account and password are in the environment variables.

Step 3: Create a Label Studio project.

Click Create → Fill in the information:

| Variable | Value |
| --- | --- |
| Project name | Screw Defect Detection |
| Data Import | Upload the unzipped screw images. |
| Labeling Setup | Custom Template → Fill in the code below. |

    <View>
        <Image name="image" value="$image"/>
        <Choices name="choice" toName="image">
            <Choice value="good"/>
            <Choice value="bad"/>
        </Choices>
    </View>
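Before pasting the template into Label Studio, it can help to check that it is well-formed XML and that the Choices block points at the Image tag. The sketch below does this with Python's standard XML parser; the `check_label_config` helper is a hypothetical name for illustration, not part of Label Studio, and it checks only the structure used in this tutorial.

```python
import xml.etree.ElementTree as ET

# The labeling template from this step, copied verbatim.
LABEL_CONFIG = """
<View>
    <Image name="image" value="$image"/>
    <Choices name="choice" toName="image">
        <Choice value="good"/>
        <Choice value="bad"/>
    </Choices>
</View>
"""


def check_label_config(config):
    """Parse the template and return its choice labels.

    Raises xml.etree.ElementTree.ParseError if the XML is malformed, and
    ValueError if a Choices block refers to a missing Image tag.
    """
    root = ET.fromstring(config.strip())
    image_names = {image.get("name") for image in root.iter("Image")}
    labels = []
    for choices in root.iter("Choices"):
        target = choices.get("toName")
        if target not in image_names:
            raise ValueError(f"Choices refers to unknown tag: {target!r}")
        labels.extend(choice.get("value") for choice in choices.iter("Choice"))
    return labels


labels = check_label_config(LABEL_CONFIG)
print(labels)  # ['good', 'bad']
```

A typo such as an unclosed tag or a mismatched `toName` then surfaces as an error before the project is created.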

Step 4: Label the images.

In the screw project, click Label All Tasks to start labeling.

For each image, either click the good or bad checkbox or press the keyboard number 1 for good or 2 for bad, and then click the Submit button to proceed to the next image.

Step 5: Export the labeling output.

You can use the export UI to download the JSON-formatted files to your local computer.
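Once the export is downloaded, a quick sanity check is to count how many images received each label. The sketch below assumes the exported JSON is a list of tasks, each with an `annotations` list whose `result` entries carry the selected choices under `value.choices`; inspect your own file to confirm this layout, since it may differ between Label Studio versions. The sample data and image paths are made up for illustration.

```python
import json
from collections import Counter


def count_labels(tasks):
    """Count the choice labels across all annotations in an exported task list."""
    counts = Counter()
    for task in tasks:
        for annotation in task.get("annotations", []):
            for result in annotation.get("result", []):
                for choice in result.get("value", {}).get("choices", []):
                    counts[choice] += 1
    return counts


# Made-up sample in the assumed export layout (not real tutorial data).
SAMPLE_EXPORT = json.loads("""
[
  {"data": {"image": "/data/upload/good_001.png"},
   "annotations": [{"result": [{"value": {"choices": ["good"]}}]}]},
  {"data": {"image": "/data/upload/good_002.png"},
   "annotations": [{"result": [{"value": {"choices": ["good"]}}]}]},
  {"data": {"image": "/data/upload/bad_001.png"},
   "annotations": [{"result": [{"value": {"choices": ["bad"]}}]}]}
]
""")

label_counts = count_labels(SAMPLE_EXPORT)
print(dict(label_counts))  # {'good': 2, 'bad': 1}
```

For a real export, load the downloaded file with `json.load` and pass the resulting list to `count_labels`. A heavily imbalanced count is worth noticing before training.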

Conclusion

In this tutorial, we have enabled a group volume, installed Label Studio via PrimeHub Apps, and labeled a set of images. Using the labeled dataset, we can move on to the next step.

Next Section

In the following tutorial, we will create a notebook to train the screw classification model and manage the model via MLflow.
