Semi‑Automatic Annotation with Label Studio and YOLOv5: Installation, Project Setup, and Model Training
This guide explains how to combine the open‑source labeling platform Label Studio with the YOLOv5 object‑detection model to achieve semi‑automatic annotation, covering installation of both tools, project creation, dataset configuration, and training a custom YOLOv5 model on your own data.
Introduction
Object detection is a popular area of machine learning, widely used in video surveillance, industrial inspection, and many other fields. Training a high‑quality model usually requires a large amount of labeled data, but manual labeling is time‑consuming, and annotators tend to miss objects when many appear in a single image. This article demonstrates how to combine the Label Studio annotation platform with the YOLOv5 detection model to achieve semi‑automatic labeling, reducing labor costs and improving labeling quality.
1. Overview
Label Studio
Label Studio is an open‑source data annotation platform that supports labeling of audio, text, image, video, and time‑series data, and can export annotations in formats required by various models. It allows multiple users to annotate simultaneously and can use a machine‑learning model as a backend for pre‑annotation.
YOLOv5
YOLO (You Only Look Once) is a grid‑based object detection algorithm in which each grid cell predicts the objects within its region. Its speed and accuracy have made it one of the best‑known detection algorithms. YOLOv5 is a family of detection architectures and models pretrained on the COCO dataset.
2. Environment Installation and Configuration
Label Studio Installation
Label Studio can be installed via several methods; the pip method is used here:
# Requires Python >=3.7 <=3.9
pip install label-studio
# Start the server at http://localhost:8080
label-studio
After installation, run label-studio in a terminal to start the service (default port 8080) and open the web interface.
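Since the pip install above is tied to a Python version range, it can help to check the interpreter first. The following is a minimal sketch; the function name python_supported is hypothetical, and the 3.7 to 3.9 bounds are simply the ones quoted in the install comment.

```python
import sys


def python_supported(version_info=sys.version_info):
    """Return True when the interpreter is within the 3.7 to 3.9 range quoted above."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and 7 <= minor <= 9


if __name__ == "__main__":
    if python_supported():
        print("Python version is within the range quoted for label-studio")
    else:
        print("Python version is outside the quoted range; consider a 3.7-3.9 virtual environment")
```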
YOLOv5 Installation
The YOLOv5 source code is available at https://github.com/ultralytics/yolov5 . Install it with:
git clone https://github.com/ultralytics/yolov5 # clone
cd yolov5
pip install -r requirements.txt # install
Label Studio ML Installation
Label Studio ML is an SDK that wraps machine‑learning code into a web server. Its repository is https://github.com/heartexlabs/label-studio-ml-backend . Install it as follows:
git clone https://github.com/heartexlabs/label-studio-ml-backend
cd label-studio-ml-backend
# Install label‑studio‑ml and its dependencies
pip install -U -e .
# Install example dependencies
pip install -r label_studio_ml/examples/requirements.txt
3. Creating and Managing a Label Studio Project
Click Create to start a new project.
Enter the project name and description.
Upload the images to be annotated via Data Import.
Select Labeling Setup and choose the Object Detection template.
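The template chosen in the last step is backed by an XML labeling configuration that can be edited in the Labeling Setup code view. As a rough sketch, the helper below generates such a configuration for a list of class names. The function name build_labeling_config and the example class names are hypothetical; the control names "label" and "image" are assumed to match the defaults of the bounding-box template, so check them against your project.

```python
import xml.etree.ElementTree as ET


def build_labeling_config(class_names):
    """Build a bounding-box labeling config (as an XML string) for the given classes."""
    view = ET.Element("View")
    # The image to annotate; "$image" refers to the "image" field of each imported task.
    ET.SubElement(view, "Image", name="image", value="$image")
    # Rectangle labels attached to the image element above.
    labels = ET.SubElement(view, "RectangleLabels", name="label", toName="image")
    for class_name in class_names:
        ET.SubElement(labels, "Label", value=class_name)
    return ET.tostring(view, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical class names for illustration only.
    print(build_labeling_config(["cat", "dog"]))
```

The class names used here should be the same ones later written to the names field of the dataset YAML, so that exported annotations and the trained model agree on the label set.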
4. Training Your Own YOLOv5 Model
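The YOLOv5 training command is shown below. The --data argument points to the dataset configuration file, --cfg selects the model architecture file, --weights '' trains from scratch instead of starting from pretrained weights, and --batch-size sets the batch size.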
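# One command per model size; larger models use smaller batch sizes
python train.py --data coco.yaml --cfg yolov5n.yaml --weights '' --batch-size 128
python train.py --data coco.yaml --cfg yolov5s.yaml --weights '' --batch-size 64
python train.py --data coco.yaml --cfg yolov5m.yaml --weights '' --batch-size 40
python train.py --data coco.yaml --cfg yolov5l.yaml --weights '' --batch-size 24
python train.py --data coco.yaml --cfg yolov5x.yaml --weights '' --batch-size 16
Modifying coco.yaml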
YOLOv5 supports several dataset formats; the COCO configuration file coco.yaml is used as an example. Set path to your dataset root, point train, val, and test at your image lists (.txt files of image paths, as in the example) or image directories, change nc to the number of classes in your dataset, and replace names with your class names.
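For a custom dataset, those edits amount to writing a small YAML file. The sketch below builds one as plain text so that nc is always derived from the class list. The function name build_dataset_yaml, the dataset paths, and the class names are all hypothetical placeholders. The stock coco.yaml that it mirrors follows for comparison.

```python
def build_dataset_yaml(root, train, val, names, test=None):
    """Return the text of a YOLOv5-style dataset YAML for a custom dataset."""
    lines = [
        f"path: {root}  # dataset root dir",
        f"train: {train}",
        f"val: {val}",
    ]
    if test:
        lines.append(f"test: {test}")
    # Derive nc from the class list so the two fields cannot drift apart.
    lines.append(f"nc: {len(names)}  # number of classes")
    lines.append(f"names: {list(names)!r}  # class names")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical layout: images live under <root>/images/train and <root>/images/val.
    text = build_dataset_yaml(
        root="../datasets/my_dataset",
        train="images/train",
        val="images/val",
        names=["cat", "dog"],
    )
    print(text)
```

The printed text can be saved as, for example, my_dataset.yaml and passed to train.py through --data.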
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# COCO 2017 dataset http://cocodataset.org by Microsoft
# Example usage: python train.py --data coco.yaml
path: ../datasets/coco # dataset root dir
train: train2017.txt # 118287 images
val: val2017.txt # 5000 images
test: test-dev2017.txt
nc: 80 # number of classes
names: ['person', 'bicycle', 'car', ... 'toothbrush'] # class names
Modifying yolov5n.yaml
Only the nc field needs to be changed to match the number of classes in your dataset.
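That single edit can also be scripted. The following sketch rewrites the top-level nc entry in the text of a model YAML file with a regular expression, leaving comments and all other lines untouched. The function name set_num_classes is hypothetical, and the sample text is a shortened stand-in for the real file, which follows.

```python
import re


def set_num_classes(yaml_text, num_classes):
    """Replace the value of the top-level 'nc:' entry and return the new text."""
    pattern = re.compile(r"^(nc:\s*)\d+", flags=re.MULTILINE)
    new_text, count = pattern.subn(
        lambda match: f"{match.group(1)}{num_classes}", yaml_text, count=1
    )
    if count == 0:
        raise ValueError("no top-level 'nc:' entry found")
    return new_text


if __name__ == "__main__":
    # Shortened stand-in for the start of a model YAML file.
    sample = "# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33\n"
    print(set_num_classes(sample, 2))
```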
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 80 # number of classes
depth_multiple: 0.33
width_multiple: 0.25
anchors:
- [10,13, 16,30, 33,23]
- [30,61, 62,45, 59,119]
- [116,90, 156,198, 373,326]
# YOLOv5 v6.0 backbone
backbone:
- [-1, 1, Conv, [64, 6, 2, 2]]
- [-1, 1, Conv, [128, 3, 2]]
- [-1, 3, C3, [128]]
- ...
# YOLOv5 v6.0 head
head:
- [-1, 1, Conv, [512, 1, 1]]
- ...
- [[17, 20, 23], 1, Detect, [nc, anchors]]
Conclusion
At this point the labeling project has been created in Label Studio and a YOLOv5 model has been trained on your custom dataset. In the next article we will show how to inherit from label_studio_ml.model.LabelStudioMLBase to build a backend prediction service.
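As a preview of what such a service has to hand back, the sketch below converts one detection box in pixel coordinates into a rectangle result in the percentage-based form Label Studio uses for bounding boxes. It is an illustration under assumptions, not the backend from the next article: the function name yolo_box_to_ls_result is hypothetical, and the from_name and to_name defaults ("label", "image") are assumed to match the labeling configuration of the project.

```python
def yolo_box_to_ls_result(xyxy, image_width, image_height, label, score,
                          from_name="label", to_name="image"):
    """Convert a pixel-space (x1, y1, x2, y2) box into a Label Studio rectangle result."""
    x1, y1, x2, y2 = xyxy
    return {
        "from_name": from_name,  # assumed name of the RectangleLabels control
        "to_name": to_name,      # assumed name of the Image element
        "type": "rectanglelabels",
        "original_width": image_width,
        "original_height": image_height,
        "value": {
            # Top-left corner and size, expressed as percentages of the image size.
            "x": 100.0 * x1 / image_width,
            "y": 100.0 * y1 / image_height,
            "width": 100.0 * (x2 - x1) / image_width,
            "height": 100.0 * (y2 - y1) / image_height,
            "rotation": 0,
            "rectanglelabels": [label],
        },
        "score": score,
    }


if __name__ == "__main__":
    # Hypothetical detection on a 400x200 image.
    print(yolo_box_to_ls_result((100, 50, 300, 150), 400, 200, "car", 0.9))
```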