Understanding ResNet and Building It from Scratch with PyTorch
This article explains the motivation behind residual networks, describes the architecture of ResNet including residual blocks and skip connections, lists available Keras implementations, and provides a step‑by‑step PyTorch tutorial with complete code to construct and test ResNet‑50/101/152 models.
In recent years, breakthroughs in deep learning and computer vision have been driven by very deep convolutional neural networks, which achieve excellent results on image recognition and classification tasks.
As networks grow deeper, however, they become harder to train: accuracy saturates and then degrades, even on the training set. Residual Networks (ResNets) were introduced to alleviate this degradation problem by adding residual (skip) connections.
What is ResNet?
ResNet, introduced in the 2015 paper "Deep Residual Learning for Image Recognition," is one of the most successful deep learning models. Its core component is the residual block, which adds a shortcut connection that bypasses one or more layers.
Instead of forcing a stack of layers to learn a desired mapping H(x) directly, the shortcut lets them learn a residual F(x) = H(x) − x, so the block computes H(x) = F(x) + x. Because gradients can flow through the identity path unchanged, this mitigates the vanishing-gradient problem. When input and output dimensions differ, the shortcut is either zero-padded or passed through a 1×1 convolution to match them.
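As a minimal sketch of the H(x) = F(x) + x idea (assuming PyTorch is available; `MiniResidual` and its channel sizes are illustrative, not from the paper), a block with a 1×1-convolution shortcut for the dimension-changing case looks like:

```python
import torch
import torch.nn as nn

class MiniResidual(nn.Module):
    """Illustrative residual block: output = relu(F(x) + shortcut(x))."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # F(x): the residual branch the layers actually learn
        self.f = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # identity shortcut when shapes match, 1x1 projection otherwise
        self.shortcut = (nn.Identity() if in_ch == out_ch
                         else nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False))

    def forward(self, x):
        return torch.relu(self.f(x) + self.shortcut(x))

x = torch.randn(1, 8, 16, 16)
print(MiniResidual(8, 16)(x).shape)  # torch.Size([1, 16, 16, 16])
```

The full bottleneck block used by ResNet-50 and deeper variants follows the same pattern, just with three convolutions in the residual branch.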
ResNet Architecture
Typical ResNet architectures (e.g., ResNet-34, ResNet-50, ResNet-101, ResNet-152) share the same skeleton: an initial 7×7 convolution and max pool, followed by four stages of residual blocks. The first block of each stage after the first halves the spatial resolution with a stride-2 convolution while doubling the channel count.
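To make the down-sampling schedule concrete, the standard output-size formula ⌊(n + 2p − k)/s⌋ + 1 can trace a 224×224 input through the stem and the four stages (a plain-Python sketch; the stride pattern matches the PyTorch code later in this article):

```python
def conv_out(n, k, s, p):
    # floor((n + 2p - k) / s) + 1 for a square input of side n
    return (n + 2 * p - k) // s + 1

n = conv_out(224, 7, 2, 3)    # 7x7/2 stem conv: 224 -> 112
n = conv_out(n, 3, 2, 1)      # 3x3/2 max pool:  112 -> 56
for s in (1, 2, 2, 2):        # first-block stride of stages 1..4
    n = conv_out(n, 3, s, 1)  # 56 -> 56 -> 28 -> 14 -> 7
print(n)  # 7
```

The final 7×7 feature map is what the average-pool layer collapses to 1×1 before the fully connected classifier.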
Below are example diagrams from the original paper (images omitted for brevity).
ResNet in Keras
Keras provides several pretrained ResNet variants via tf.keras.applications:
ResNet50
ResNet50V2
ResNet101
ResNet101V2
ResNet152
ResNet152V2
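For reference, instantiating one of these variants is a one-liner (assuming TensorFlow is installed; `weights=None` builds a randomly initialized model rather than downloading the ImageNet weights):

```python
from tensorflow.keras.applications import ResNet50

# randomly initialized ResNet-50 with the default 1000-class head
model = ResNet50(weights=None)
print(model.output_shape)  # (None, 1000)
```

Passing `weights="imagenet"` instead would load the pretrained ImageNet weights.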
Building ResNet from Scratch with PyTorch
The following PyTorch code defines a generic residual block and the full ResNet class, then creates specific functions for ResNet‑50, ResNet‑101, and ResNet‑152.
<code>import torch
import torch.nn as nn</code>

<code>class block(nn.Module):
    def __init__(self, in_channels, intermediate_channels, identity_downsample=None, stride=1):
        super(block, self).__init__()
        # bottleneck design: 1x1 reduce, 3x3 process, 1x1 expand by a factor of 4
        self.expansion = 4
        self.conv1 = nn.Conv2d(in_channels, intermediate_channels, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn1 = nn.BatchNorm2d(intermediate_channels)
        self.conv2 = nn.Conv2d(intermediate_channels, intermediate_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(intermediate_channels)
        self.conv3 = nn.Conv2d(intermediate_channels, intermediate_channels * self.expansion, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn3 = nn.BatchNorm2d(intermediate_channels * self.expansion)
        self.relu = nn.ReLU()
        self.identity_downsample = identity_downsample
        self.stride = stride

    def forward(self, x):
        identity = x.clone()

        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu(x)
        x = self.conv3(x)
        x = self.bn3(x)

        # project the shortcut when the block changes resolution or channel count
        if self.identity_downsample is not None:
            identity = self.identity_downsample(identity)

        x += identity
        x = self.relu(x)
        return x</code>

<code>class ResNet(nn.Module):
    def __init__(self, block, layers, image_channels, num_classes):
        super(ResNet, self).__init__()
        self.in_channels = 64
        # stem: 7x7/2 conv + 3x3/2 max pool (224 -> 112 -> 56)
        self.conv1 = nn.Conv2d(image_channels, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU()
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        # four stages of residual blocks; stages 2-4 halve the resolution
        self.layer1 = self._make_layer(block, layers[0], intermediate_channels=64, stride=1)
        self.layer2 = self._make_layer(block, layers[1], intermediate_channels=128, stride=2)
        self.layer3 = self._make_layer(block, layers[2], intermediate_channels=256, stride=2)
        self.layer4 = self._make_layer(block, layers[3], intermediate_channels=512, stride=2)

        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512 * 4, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = x.reshape(x.shape[0], -1)
        x = self.fc(x)
        return x

    def _make_layer(self, block, num_residual_blocks, intermediate_channels, stride):
        identity_downsample = None
        layers = []

        # the first block of a stage may change resolution (stride != 1) or
        # channel count, so its shortcut needs a 1x1 projection to match
        if stride != 1 or self.in_channels != intermediate_channels * 4:
            identity_downsample = nn.Sequential(
                nn.Conv2d(self.in_channels, intermediate_channels * 4, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(intermediate_channels * 4),
            )

        layers.append(block(self.in_channels, intermediate_channels, identity_downsample, stride))
        self.in_channels = intermediate_channels * 4

        # remaining blocks keep the shape, so identity shortcuts suffice
        for _ in range(num_residual_blocks - 1):
            layers.append(block(self.in_channels, intermediate_channels))

        return nn.Sequential(*layers)</code>

<code>def ResNet50(img_channel=3, num_classes=1000):
    return ResNet(block, [3, 4, 6, 3], img_channel, num_classes)

def ResNet101(img_channel=3, num_classes=1000):
    return ResNet(block, [3, 4, 23, 3], img_channel, num_classes)

def ResNet152(img_channel=3, num_classes=1000):
    return ResNet(block, [3, 8, 36, 3], img_channel, num_classes)</code>

<code>def test():
    net = ResNet101(img_channel=3, num_classes=1000)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    net = net.to(device)
    x = torch.randn(4, 3, 224, 224).to(device)
    y = net(x)
    print(y.size())

test()</code>

Running the test prints torch.Size([4, 1000]): one row of 1,000 class logits for each of the four input images, confirming the model is wired correctly.
Original source: Analytics Vidhya – Build ResNet from Scratch with Python