
Build a Handwritten Digit Classifier with PyTorch: Step‑by‑Step Guide

This tutorial walks you through building a digit classifier using PyTorch and the MNIST dataset, covering environment setup, data loading, model construction, training, evaluation, and model persistence while explaining core deep‑learning concepts.


If you've ever wondered how to build and train deep learning models, PyTorch is one of the friendliest and most powerful frameworks you can use. In this article we will build a project and dive into PyTorch concepts, explaining the code line by line.

The project is a digit classifier using the famous MNIST dataset, training a simple neural network to recognize handwritten digits (0‑9). By the end you will understand PyTorch's core ideas and how they combine to make deep learning simple and efficient.

Why Choose PyTorch?

Before starting the project, let's discuss why PyTorch has become the framework of choice for many researchers and developers:

Dynamic Computation Graph: Unlike early versions of TensorFlow, PyTorch builds the computation graph at runtime, allowing you to debug models with standard Python tools.

Pythonic Syntax: PyTorch integrates seamlessly with Python, making it easy to learn and use for anyone familiar with the language.

Versatility: PyTorch scales from small prototypes to large production systems and runs on both CPU and GPU.

Strong Community: As an open-source project, its active community provides countless libraries and tutorials to help you get started.

Project: Digit Classification

In this project we will:

Build a simple feed‑forward neural network.

Train it on the MNIST dataset to classify handwritten digits.

Test the model on unseen data.

Save the trained model for reuse.

We will cover the whole process, from loading data to optimizing the model.

Step 1: Set Up the Environment

First, install PyTorch and torchvision (for handling the dataset) using pip:

<code>pip install torch torchvision</code>

Verify the installation:

<code>import torch
print(torch.__version__)</code>

If no errors occur, you are ready to go! :)
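While you are at it, you can also check whether PyTorch can see a GPU; the training code later in this guide picks CUDA automatically when it is available and falls back to CPU otherwise. A quick check:

```python
import torch

print(torch.__version__)
# True only if an NVIDIA GPU and a CUDA build of PyTorch are available
print("CUDA available:", torch.cuda.is_available())
```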

Step 2: Understand the Dataset

The MNIST dataset contains:

60,000 training images and 10,000 test images.

Each image is 28×28 pixels representing a handwritten digit.

Images are grayscale (single channel).

Labels are integers from 0 to 9.

Step 3: Load the MNIST Dataset

PyTorch provides the torchvision library for loading and preprocessing data.

Key Concepts

Transforms: Pre-processing steps applied to data (e.g., normalization, resizing).

DataLoader: A PyTorch class that batches and shuffles data to improve training efficiency.

Let's load and preprocess the MNIST dataset:

<code>from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Transform: convert images to tensors and normalize pixel values
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))  # normalize to (-1, 1)
])

train_dataset = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
test_dataset = datasets.MNIST(root='./data', train=False, transform=transform, download=True)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)
</code>

transforms.ToTensor() converts an image to a PyTorch tensor of shape (1, 28, 28) and scales pixel values to [0, 1].

transforms.Normalize((0.5,), (0.5,)) shifts the pixel values to [-1, 1] by subtracting the mean (0.5) and dividing by the standard deviation (0.5).

DataLoader automatically batches the data (64 images per batch) and shuffles the training set each epoch, so the model never sees the examples in a fixed order and each batch's gradient estimate stays representative of the whole dataset.
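To see exactly what the DataLoader hands to the training loop, here is a minimal sketch that uses random tensors as a stand-in for MNIST, so nothing needs to be downloaded:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for MNIST: 100 random "images" with integer labels (no download needed)
fake_images = torch.rand(100, 1, 28, 28)
fake_labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(fake_images, fake_labels)
loader = DataLoader(dataset, batch_size=64, shuffle=True)

# The first batch holds 64 samples; the last batch holds the remaining 36
images, labels = next(iter(loader))
print(images.shape)   # torch.Size([64, 1, 28, 28])
print(labels.shape)   # torch.Size([64])
```

The real MNIST loaders above yield batches with exactly the same shapes.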

Step 4: Build the Neural Network

The network processes data through multiple interconnected neurons. The architecture:

Input layer: Receives the flattened 28×28 input (784 values).

Hidden layers: Intermediate layers with weights, biases, and activation functions.

Output layer: Produces a score for each of the 10 digit classes (0-9).

Define the model in PyTorch:

<code>import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 128)  # input to hidden
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 64)       # hidden to hidden
        self.fc3 = nn.Linear(64, 10)        # hidden to output

    def forward(self, x):
        x = x.view(-1, 28 * 28)             # flatten the 28x28 image to 784 values
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        return self.fc3(x)                  # raw logits, one score per class
</code>

nn.Linear implements a fully-connected layer (y = Wx + b), and ReLU introduces non-linearity. Note that the network returns raw logits rather than probabilities: the CrossEntropyLoss we use in the next step applies log-softmax internally, so putting an explicit nn.Softmax before the loss would be redundant and would hurt training. If you need probabilities at inference time, apply torch.softmax(outputs, dim=1) to the logits.
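To make the y = Wx + b claim concrete, here is a small sketch that runs one fully-connected layer on a random batch and recomputes its output by hand:

```python
import torch
import torch.nn as nn

# A single fully-connected layer from 784 inputs to 128 hidden units
layer = nn.Linear(28 * 28, 128)
x = torch.rand(64, 28 * 28)              # a batch of 64 flattened "images"
y = layer(x)
print(y.shape)                           # torch.Size([64, 128])

# nn.Linear computes y = x @ W.T + b under the hood
manual = x @ layer.weight.T + layer.bias
print(torch.allclose(y, manual, atol=1e-6))   # True
```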

Step 5: Train the Model

Training requires a loss function and an optimizer:

Loss function: Measures the gap between predictions and true labels. We use CrossEntropyLoss, the standard choice for multi-class classification; it expects raw logits and applies log-softmax internally.

Optimizer: Updates model weights to minimize the loss. We use Adam, which combines momentum with per-parameter adaptive learning rates.
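As a sanity check on the loss: CrossEntropyLoss applied to raw logits is exactly log-softmax followed by negative log-likelihood, which a short sketch can confirm:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(4, 10)              # raw model outputs for a batch of 4 samples
labels = torch.randint(0, 10, (4,))

ce = nn.CrossEntropyLoss()(logits, labels)
# Equivalent computation: log-softmax, then negative log-likelihood
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)
print(torch.allclose(ce, manual, atol=1e-6))   # True
```

This is why the model should output logits, not probabilities.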

Training loop:

<code>import torch.optim as optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleNN().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
epochs = 5

for epoch in range(epochs):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"Epoch [{epoch+1}/{epochs}], Loss: {running_loss/len(train_loader):.4f}")
</code>

The core steps are forward propagation, loss computation, back‑propagation, and weight update.
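These four steps are independent of the model and dataset. Here is a minimal sketch that runs them on a toy linear model with random data (a stand-in for SimpleNN and MNIST), showing that the loss actually decreases:

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model = nn.Linear(4, 2)                  # a toy stand-in for SimpleNN
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)
x, y = torch.rand(8, 4), torch.randint(0, 2, (8,))

loss_before = criterion(model(x), y).item()
for _ in range(20):
    optimizer.zero_grad()                # 1. clear gradients from the last step
    loss = criterion(model(x), y)        # 2. forward pass + loss computation
    loss.backward()                      # 3. back-propagation
    optimizer.step()                     # 4. weight update
print(criterion(model(x), y).item() < loss_before)   # True: the loss went down
```

Forgetting optimizer.zero_grad() is a classic bug: gradients accumulate across iterations by default.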

Step 6: Test the Model

Evaluation on unseen data:

<code>model.eval()
correct = 0
total = 0
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print(f"Test Accuracy: {100 * correct / total:.2f}%")
</code>
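The torch.max(outputs, 1) call deserves a note: along dim=1 it returns both the maximum value and its index per row, and the index is the predicted class. A tiny illustration with made-up logits:

```python
import torch

# Two samples, three classes; the largest logit in each row wins
logits = torch.tensor([[0.1, 2.0, -1.0],
                       [1.5, 0.2,  0.3]])
values, predicted = torch.max(logits, 1)
print(predicted)          # tensor([1, 0])
```

torch.argmax(outputs, dim=1) does the same job when you only need the indices.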

Step 7: Save and Load the Model

Save the trained model for later use:

<code>torch.save(model.state_dict(), 'simple_nn.pth')
</code>

Load the saved model:

<code>model.load_state_dict(torch.load('simple_nn.pth', map_location=device))
model.eval()
</code>

map_location ensures the weights load onto the current device even if they were saved on a GPU, and eval() puts the model in inference mode.
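Because state_dict saves only the weights (not the architecture), loading requires constructing the model first. This round-trip can be verified on a toy module; the filename here is just an example:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
torch.save(model.state_dict(), 'tiny_model.pth')   # example filename

restored = nn.Linear(4, 2)                         # fresh, randomly initialized copy
restored.load_state_dict(torch.load('tiny_model.pth'))

x = torch.rand(3, 4)
print(torch.allclose(model(x), restored(x)))       # True: same weights, same outputs
```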

Understanding Core Concepts

Tensors: PyTorch's data structure, similar to NumPy arrays but optimized for GPU computation.

Autograd: Automatic differentiation for back-propagation.

Modules: Reusable components such as layers (nn.Linear) and activation functions (nn.ReLU).

DataLoader: Efficient batching and shuffling of data.

Device Management: Seamless switching between CPU and GPU.
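Autograd is the piece that made loss.backward() work in the training loop above. A one-variable sketch shows the idea:

```python
import torch

# requires_grad=True tells autograd to track every operation on x
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
y.backward()                 # back-propagate: computes dy/dx
print(x.grad)                # tensor(6.), since d(x^2)/dx = 2x = 6 at x = 3
```

In the training loop, the same mechanism fills in .grad for every model parameter, and optimizer.step() uses those gradients to update the weights.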

Conclusion

This project covers the fundamental steps to understand PyTorch. By building this simple digit classifier you have learned how to:

Load and preprocess data.

Build and train a neural network.

Evaluate and save the model.

Mastering these basics provides a solid foundation for exploring more advanced topics such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), or Transformers.

Neural Network · Python · Deep Learning · PyTorch · MNIST
Written by

Code Mala Tang

Read source code together, write articles together, and enjoy spicy hot pot together.
