
Understanding ResNet and Building It from Scratch with PyTorch

This article explains the motivation behind residual networks, describes the architecture of ResNet including residual blocks and skip connections, lists available Keras implementations, and provides a step‑by‑step PyTorch tutorial with complete code to construct and test ResNet‑50/101/152 models.

Python Programming Learning Circle

In recent years, breakthroughs in deep learning and computer vision have been driven by very deep convolutional neural networks, which achieve excellent results on image recognition and classification tasks.

As networks become deeper, however, they become harder to train: accuracy saturates and then degrades. Residual Networks (ResNet) were introduced to alleviate this degradation problem by adding residual (skip) connections.

What is ResNet?

ResNet, introduced in the 2015 paper "Deep Residual Learning for Image Recognition," is one of the most successful deep learning models. Its core component is the residual block, which adds a shortcut connection that bypasses one or more layers.

The shortcut connection reformulates the desired mapping H(x) as F(x) + x, so the stacked layers only need to learn the residual F(x) = H(x) − x. Because the input is added back through an identity path, gradients can flow more easily, mitigating the vanishing-gradient problem. When the input and output dimensions differ, the shortcut is either zero-padded or passed through a 1×1 convolution to match dimensions.
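The idea can be sketched in a few lines of PyTorch. This is a toy block for illustration only (the full bottleneck implementation follows later in the article); the class name and layer choices here are illustrative:

```python
import torch
import torch.nn as nn

class ToyResidualBlock(nn.Module):
    """Minimal residual block: output = relu(F(x) + x)."""
    def __init__(self, channels):
        super().__init__()
        # F(x): two 3x3 convolutions that preserve the spatial size
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        # The skip connection adds the input back before the final activation
        return self.relu(self.f(x) + x)

x = torch.randn(1, 16, 8, 8)
y = ToyResidualBlock(16)(x)
print(y.shape)  # torch.Size([1, 16, 8, 8]) -- same shape as the input
```

Because F(x) and x are added element-wise, this toy block only works when input and output shapes match; the projection case is handled below.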

ResNet Architecture

Typical ResNet architectures (e.g., ResNet‑34, ResNet‑50, ResNet‑101, ResNet‑152) consist of an initial convolutional layer followed by four stages of residual blocks. Each stage may down‑sample the feature map using stride‑2 convolutions.
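The names encode the depth: for the bottleneck variants, each block contributes three weighted layers, plus the initial convolution and the final fully connected layer. A quick sanity check (stage configurations are from the original paper):

```python
# Blocks per stage for the bottleneck ResNet variants (from the paper)
configs = {
    "ResNet-50":  [3, 4, 6, 3],
    "ResNet-101": [3, 4, 23, 3],
    "ResNet-152": [3, 8, 36, 3],
}

for name, blocks in configs.items():
    # 3 conv layers per bottleneck block + the stem conv + the fc layer
    depth = 3 * sum(blocks) + 2
    print(f"{name}: {depth} layers")
# ResNet-50: 50 layers
# ResNet-101: 101 layers
# ResNet-152: 152 layers
```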

Below are example diagrams from the original paper (images omitted for brevity).

ResNet in Keras

Keras provides several pretrained ResNet variants via tf.keras.applications:

ResNet50

ResNet50V2

ResNet101

ResNet101V2

ResNet152

ResNet152V2
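Loading one of these is a one-liner; a minimal sketch, assuming TensorFlow is installed:

```python
import tensorflow as tf

# weights=None gives a randomly initialized model (no download);
# use weights="imagenet" for the pretrained version
model = tf.keras.applications.ResNet50(weights=None)
print(model.count_params())  # roughly 25.6 million parameters
```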

Building ResNet from Scratch with PyTorch

The following PyTorch code defines a generic residual block and the full ResNet class, then creates specific functions for ResNet‑50, ResNet‑101, and ResNet‑152.

<code>import torch
import torch.nn as nn</code>
<code>class block(nn.Module):
    """Bottleneck residual block: 1x1 reduce -> 3x3 -> 1x1 expand (x4)."""
    def __init__(self, in_channels, intermediate_channels, identity_downsample=None, stride=1):
        super(block, self).__init__()
        self.expansion = 4  # the last 1x1 conv multiplies the channels by 4
        # 1x1 convolution: reduce channels
        self.conv1 = nn.Conv2d(in_channels, intermediate_channels, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn1 = nn.BatchNorm2d(intermediate_channels)
        # 3x3 convolution: the only layer that may down-sample (stride > 1)
        self.conv2 = nn.Conv2d(intermediate_channels, intermediate_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(intermediate_channels)
        # 1x1 convolution: expand channels by the expansion factor
        self.conv3 = nn.Conv2d(intermediate_channels, intermediate_channels * self.expansion, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn3 = nn.BatchNorm2d(intermediate_channels * self.expansion)
        self.relu = nn.ReLU()
        # Optional 1x1 projection so the identity matches the output shape
        self.identity_downsample = identity_downsample
        self.stride = stride

    def forward(self, x):
        identity = x.clone()
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu(x)
        x = self.conv3(x)
        x = self.bn3(x)
        # Project the identity when the residual branch changed its shape
        if self.identity_downsample is not None:
            identity = self.identity_downsample(identity)
        x += identity  # the skip connection
        x = self.relu(x)
        return x</code>
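The identity_downsample module passed into the block is exactly the 1×1 projection shortcut described earlier. Checked in isolation (a standalone sketch, not part of the model code):

```python
import torch
import torch.nn as nn

x = torch.randn(2, 64, 56, 56)
# When the residual branch outputs 256 channels at half resolution,
# the identity must be projected with a matching 1x1 stride-2 convolution
shortcut = nn.Sequential(
    nn.Conv2d(64, 256, kernel_size=1, stride=2, bias=False),
    nn.BatchNorm2d(256),
)
print(shortcut(x).shape)  # torch.Size([2, 256, 28, 28])
```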
<code>class ResNet(nn.Module):
    def __init__(self, block, layers, image_channels, num_classes):
        super(ResNet, self).__init__()
        self.in_channels = 64
        # Stem: 7x7 stride-2 convolution followed by a stride-2 max pool
        self.conv1 = nn.Conv2d(image_channels, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU()
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        # Four stages of residual blocks; stages 2-4 halve the spatial size
        self.layer1 = self._make_layer(block, layers[0], intermediate_channels=64, stride=1)
        self.layer2 = self._make_layer(block, layers[1], intermediate_channels=128, stride=2)
        self.layer3 = self._make_layer(block, layers[2], intermediate_channels=256, stride=2)
        self.layer4 = self._make_layer(block, layers[3], intermediate_channels=512, stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        # layer4 outputs 512 * expansion = 2048 channels
        self.fc = nn.Linear(512 * 4, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.avgpool(x)
        x = x.reshape(x.shape[0], -1)  # flatten to (batch, 2048)
        x = self.fc(x)
        return x

    def _make_layer(self, block, num_residual_blocks, intermediate_channels, stride):
        identity_downsample = None
        layers = []
        # A projection shortcut is needed whenever the stage changes the
        # spatial resolution (stride != 1) or the channel count
        if stride != 1 or self.in_channels != intermediate_channels * 4:
            identity_downsample = nn.Sequential(
                nn.Conv2d(self.in_channels, intermediate_channels * 4, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(intermediate_channels * 4),
            )
        # Only the first block of a stage down-samples and changes channels
        layers.append(block(self.in_channels, intermediate_channels, identity_downsample, stride))
        self.in_channels = intermediate_channels * 4
        # The remaining blocks keep the shape unchanged
        for i in range(num_residual_blocks - 1):
            layers.append(block(self.in_channels, intermediate_channels))
        return nn.Sequential(*layers)</code>
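The channel bookkeeping in _make_layer explains the nn.Linear(512 * 4, ...) input size: every stage ends with intermediate_channels * 4 channels, and each stride-2 stage halves the resolution. For a 224×224 input, the expected activation shapes (per the original paper) are:

```python
# Expected activation shapes for a 224x224 input, stage by stage
shapes = [
    ("conv1 (stride 2)",   (64, 112, 112)),
    ("maxpool (stride 2)", (64, 56, 56)),
    ("layer1",             (256, 56, 56)),
    ("layer2 (stride 2)",  (512, 28, 28)),
    ("layer3 (stride 2)",  (1024, 14, 14)),
    ("layer4 (stride 2)",  (2048, 7, 7)),
]
for name, (c, h, w) in shapes:
    print(f"{name}: {c} x {h} x {w}")
```

After the adaptive average pool collapses the final 7×7 map to 1×1, the 2048-dimensional vector feeds the fully connected layer.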
<code>def ResNet50(img_channel=3, num_classes=1000):
    return ResNet(block, [3, 4, 6, 3], img_channel, num_classes)

def ResNet101(img_channel=3, num_classes=1000):
    return ResNet(block, [3, 4, 23, 3], img_channel, num_classes)

def ResNet152(img_channel=3, num_classes=1000):
    return ResNet(block, [3, 8, 36, 3], img_channel, num_classes)</code>
<code>def test():
    net = ResNet101(img_channel=3, num_classes=1000)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Move the model and the input to the same device *before* the forward pass
    net = net.to(device)
    x = torch.randn(4, 3, 224, 224).to(device)
    y = net(x)
    print(y.size())

test()</code>

Running the test prints torch.Size([4, 1000]): one 1000-way logit vector for each of the four input images, confirming that the forward pass works.

Original source: Analytics Vidhya – Build ResNet from Scratch with Python

CNN · Deep Learning · PyTorch · ResNet · Residual Networks
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
