Artificial Intelligence · 16 min read

YOLOv5 Tutorial: From YOLOv3 to YOLOv5, Code Walkthrough, Model Export (JIT & ONNX) and Usage

This article provides a comprehensive guide on YOLOv5, covering its background from YOLOv3, detailed code analysis of the model architecture, step‑by‑step instructions for running detect.py, configuring yolov5s.yaml, exporting the model to TorchScript JIT and ONNX formats, and practical inference examples using PyTorch and ONNX Runtime.

Python Programming Learning Circle

Introduction – The article starts with a brief history of the YOLO family, noting that YOLOv3 was the last official Darknet release before YOLOv4, and introduces Ultralytics' YOLOv5 as a PyTorch‑based implementation that combines features from YOLOv3‑SPP and YOLOv4.

YOLOv5 Overview – Ultralytics' contribution to the YOLO community is highlighted, including performance figures (e.g., YOLOv5‑x achieving 47.2 AP at 63 FPS on 736×736 images) and the lack of an official paper, prompting readers to explore the source code for details.

Code Structure – The repository layout is shown, with emphasis on the detect.py script, the models/yolo.py file, and the models/yolov5s.yaml configuration. Images of the directory tree and network diagrams are included.

Running detect.py – To run the detector, download yolov5s.pt into the weights folder and execute detect.py. The first run writes the annotated detection results to an output directory.

Network Architecture (models/yolo.py) – The article explains how to visualise the model with Netron after installing it via pip install netron. The sample code used to save the model is:

# Build the model from the YAML config and save the whole module for Netron
model = Model(opt.cfg).to(device)
torch.save(model, "m.pt")

Exporting to TorchScript JIT format is demonstrated with:

model = Model("models/yolov5s.yaml")
model.load_state_dict(torch.load("my_yolov5s.pt"))
model.eval()  # switch BatchNorm/Dropout to inference mode before tracing
script_model = torch.jit.trace(model, img_tensor[None])  # trace with a sample input
script_model.save("my_yolov5s.jit")
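The trace/save/load round trip above can be sanity-checked with any small module; here is a minimal, self-contained sketch using a toy network in place of the real Model class (which lives in the repository):

```python
import torch
import torch.nn as nn

# Toy stand-in for the real YOLOv5 Model: trace, save, reload, compare.
net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
net.eval()  # always trace in eval mode

img_tensor = torch.rand(3, 32, 32)
with torch.no_grad():
    traced = torch.jit.trace(net, img_tensor[None])
traced.save("toy.jit")

reloaded = torch.jit.load("toy.jit")
with torch.no_grad():
    out_eager = net(img_tensor[None])
    out_jit = reloaded(img_tensor[None])

# The traced module should reproduce the eager outputs on the same input.
assert torch.allclose(out_eager, out_jit)
```

The same pattern applies to the real model: trace with a representative input tensor, save, and the resulting .jit file can be loaded without the Python class definition.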

Configuration File (yolov5s.yaml) – The YAML defines parameters such as nc, depth_multiple, width_multiple, anchor boxes, backbone layers, and head layers. The article explains the meaning of entries like [-1, 1, Focus, [64, 3]] and how depth_multiple scales the number of repeats in each block while width_multiple scales the channel counts, which is how the s/m/l/x variants are derived from one config.
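The channel and repeat scaling can be reproduced in a few lines. YOLOv5's parse_model rounds scaled channel counts up to a multiple of 8 and clamps repeat counts at a minimum of 1; a sketch of that arithmetic for the yolov5s multipliers:

```python
import math

def make_divisible(x, divisor=8):
    # Round a scaled channel count up to the nearest multiple of `divisor`,
    # as YOLOv5's parse_model does when applying width_multiple.
    return math.ceil(x / divisor) * divisor

depth_multiple, width_multiple = 0.33, 0.50  # yolov5s

# [-1, 1, Focus, [64, 3]]: 64 output channels before width scaling
c2 = make_divisible(64 * width_multiple)  # 64 * 0.5 -> 32 channels in yolov5s
# A block listed with 9 repeats shrinks under depth_multiple, never below 1
n = max(round(9 * depth_multiple), 1)     # 9 * 0.33 -> 3 repeats
print(c2, n)  # 32 3
```

Swapping in the yolov5x multipliers (1.33, 1.25) grows the same config instead of shrinking it.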

Key Modules

• Conv – Custom convolution block (Conv‑BN‑Activation) mirroring YOLOv4's CBL structure.

class Conv(nn.Module):
    # Conv2d -> BatchNorm -> LeakyReLU, mirroring YOLOv4's CBL block
    def __init__(self, c1, c2, k=1, s=1, g=1, act=True):
        super().__init__()
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # 'same' padding
        self.conv = nn.Conv2d(c1, c2, k, s, p, groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.LeakyReLU(0.1, inplace=True) if act else nn.Identity()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))
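The p = k // 2 rule gives "same" padding for odd kernel sizes, so stride-1 convolutions preserve spatial resolution. A quick self-contained check with plain nn.Conv2d:

```python
import torch
import torch.nn as nn

# With stride 1 and padding k // 2, odd kernel sizes preserve spatial size
# ("same" padding) -- the property Conv's padding rule relies on.
x = torch.rand(1, 3, 64, 64)
for k in (1, 3, 5):
    conv = nn.Conv2d(3, 16, k, stride=1, padding=k // 2, bias=False)
    out = conv(x)
    assert out.shape == (1, 16, 64, 64)
```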

• Focus – Reduces spatial resolution while increasing channel depth by concatenating four sliced versions of the input.

class Focus(nn.Module):
    # Slice the input into four half-resolution copies, stack them on the
    # channel axis, then fuse with a single convolution
    def __init__(self, c1, c2, k=1):
        super(Focus, self).__init__()
        self.conv = Conv(c1 * 4, c2, k, 1)

    def forward(self, x):
        return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                                    x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
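The slicing itself is easy to verify in isolation: each of the four strided views picks one pixel of every 2×2 block, so concatenating them maps (B, C, H, W) to (B, 4C, H/2, W/2) without discarding anything:

```python
import torch

# Each strided slice picks one pixel of every 2x2 block, so stacking the
# four slices halves H and W while quadrupling the channel count.
x = torch.rand(1, 3, 64, 64)
y = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
               x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)
assert y.shape == (1, 12, 32, 32)
# Lossless rearrangement: every input element appears exactly once.
assert y.numel() == x.numel()
```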

• BottleneckCSP – Implements Cross‑Stage Partial (CSP) bottleneck with optional shortcut connections.

class BottleneckCSP(nn.Module):
    # Cross-Stage Partial bottleneck: split channels into a deep branch
    # (cv1 -> bottlenecks -> cv3) and a shortcut branch (cv2), then fuse
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
        self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
        self.cv4 = Conv(2 * c_, c2, 1, 1)
        self.bn = nn.BatchNorm2d(2 * c_)
        self.act = nn.LeakyReLU(0.1, inplace=True)
        self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)])

    def forward(self, x):
        y1 = self.cv3(self.m(self.cv1(x)))
        y2 = self.cv2(x)
        return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1))))
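The split/merge idea can be seen in isolation with plain nn.Conv2d layers standing in for the Conv and Bottleneck modules above (an illustrative sketch of the structure, not the repository code):

```python
import torch
import torch.nn as nn

# Minimal CSP-style split/merge: a "deep" branch processes half the channel
# budget, a 1x1 bypass carries the rest, and a fuse conv merges the two.
c1, c2, c_ = 64, 64, 32  # e = 0.5 -> hidden channels c_ = c2 // 2
deep = nn.Sequential(nn.Conv2d(c1, c_, 1, bias=False),
                     nn.Conv2d(c_, c_, 3, padding=1, bias=False))
bypass = nn.Conv2d(c1, c_, 1, bias=False)
fuse = nn.Conv2d(2 * c_, c2, 1, bias=False)

x = torch.rand(1, c1, 32, 32)
out = fuse(torch.cat((deep(x), bypass(x)), dim=1))
assert out.shape == (1, c2, 32, 32)
```

Keeping the bypass branch cheap is the point of CSP: only half the features pay for the stacked bottlenecks, which cuts computation and, per the CSPNet authors, reduces duplicated gradient information.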

Training Script (train.py) – The article shows how to load the pretrained model via torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True), save custom weights, and run inference on an example image (zidane.jpg) using standard PyTorch transforms and a non-max suppression utility.
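The non-max suppression step can be illustrated with a minimal greedy implementation. This is a simplified sketch, not the repository's utility (which, like torchvision.ops.nms, also handles confidence filtering and per-class suppression):

```python
import torch

def nms(boxes, scores, iou_thres=0.45):
    """Greedy NMS on (N, 4) xyxy boxes; returns indices of kept boxes."""
    order = scores.argsort(descending=True)  # highest score first
    keep = []
    while order.numel() > 0:
        i = order[0].item()
        keep.append(i)
        if order.numel() == 1:
            break
        rest = order[1:]
        # IoU of the top box against all remaining boxes
        xy1 = torch.maximum(boxes[i, :2], boxes[rest, :2])
        xy2 = torch.minimum(boxes[i, 2:], boxes[rest, 2:])
        inter = (xy2 - xy1).clamp(min=0).prod(dim=1)
        area_i = (boxes[i, 2:] - boxes[i, :2]).prod()
        area_r = (boxes[rest, 2:] - boxes[rest, :2]).prod(dim=1)
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thres]  # drop heavily overlapping boxes
    return keep

boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 10., 10.], [20., 20., 30., 30.]])
scores = torch.tensor([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # box 1 overlaps box 0 heavily -> [0, 2]
```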

Exporting to ONNX – After setting model.model[-1].export = True, the model is exported with:

torch.onnx.export(model, img_tensor[None], "my_yolov5s.onnx", verbose=True,
                  opset_version=11, input_names=['images'],
                  output_names=['output1', 'output2', 'output3'])

ONNX Runtime usage is demonstrated by installing onnxruntime, creating an inference session, and processing the same image to obtain detection boxes.

Conclusion – The guide equips readers with the knowledge to run YOLOv5, understand its architecture, modify configuration files, and export the model to formats suitable for deployment on various platforms.

Computer Vision · Object Detection · JIT · PyTorch · YOLOv5 · ONNX · Model Export
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
