
Background Replacement Using Deep Learning in Python

This article presents a Python implementation that leverages deep learning models such as MobileNetV2 and ResNet to replace image backgrounds. It covers the project structure, required dependencies, dataset preparation, the core code for homographic alignment and model definition, and usage instructions for processing single images with custom backgrounds.


Project Overview

The repository provides a Python project for replacing the background of a portrait image using deep learning models. It includes model files, a requirements list, and scripts for loading data, performing homographic alignment, and refining the matte.

Project Structure

The model directory stores pretrained model weights (download link provided in the original article). The requirements.txt file lists necessary packages such as kornia==0.4.1, tensorboard==2.3.0, torch==1.7.0, torchvision==0.8.1, tqdm==4.51.0, opencv-python==4.4.0.44, and onnxruntime==1.6.0, among others.
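Reconstructed from the versions listed above, a minimal requirements.txt would look like the following (the actual file in the repository may contain additional entries):

```text
kornia==0.4.1
tensorboard==2.3.0
torch==1.7.0
torchvision==0.8.1
tqdm==4.51.0
opencv-python==4.4.0.44
onnxruntime==1.6.0
```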

Data Preparation

Users should supply a source image, its background image, and a new background image. Example images are shown in the original article.
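A small sketch of how the three inputs might be loaded before inference; the function name and paths here are illustrative, not part of the repository's API:

```python
from PIL import Image


def load_inputs(src_path, bgr_path, new_bgr_path):
    """Load the source photo, its captured background, and the replacement
    background. Converting to RGB ensures three channels regardless of the
    original file format (e.g. grayscale or RGBA PNGs)."""
    src = Image.open(src_path).convert("RGB")
    bgr = Image.open(bgr_path).convert("RGB")
    new_bgr = Image.open(new_bgr_path).convert("RGB")
    return src, bgr, new_bgr
```

The captured background does not need to be pixel-aligned with the source, since the HomographicAlignment step below warps it into place.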

Core Background‑Replacement Code

```python
import argparse
import torch
import os

from torch.nn import functional as F
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms as T
from torchvision.transforms.functional import to_pil_image
from threading import Thread
from tqdm import tqdm
from PIL import Image
from typing import Callable, Optional, List, Tuple
import glob
from torch import nn, Tensor
from torchvision.models.resnet import ResNet, Bottleneck
import torchvision
import numpy as np
import cv2
import uuid


# --------------- Homographic alignment ---------------
class HomographicAlignment:
    """Apply homographic alignment on the background to match the source image."""

    def __init__(self):
        self.detector = cv2.ORB_create()
        self.matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE)

    def __call__(self, src, bgr):
        src = np.asarray(src)
        bgr = np.asarray(bgr)

        keypoints_src, descriptors_src = self.detector.detectAndCompute(src, None)
        keypoints_bgr, descriptors_bgr = self.detector.detectAndCompute(bgr, None)

        # Sort matches by distance and keep the best 15%. Recent OpenCV
        # versions return a tuple here, so use sorted() rather than .sort().
        matches = self.matcher.match(descriptors_bgr, descriptors_src, None)
        matches = sorted(matches, key=lambda x: x.distance)
        num_good_matches = int(len(matches) * 0.15)
        matches = matches[:num_good_matches]

        points_src = np.zeros((len(matches), 2), dtype=np.float32)
        points_bgr = np.zeros((len(matches), 2), dtype=np.float32)
        for i, match in enumerate(matches):
            points_src[i, :] = keypoints_src[match.trainIdx].pt
            points_bgr[i, :] = keypoints_bgr[match.queryIdx].pt

        H, _ = cv2.findHomography(points_bgr, points_src, cv2.RANSAC)

        h, w = src.shape[:2]
        bgr = cv2.warpPerspective(bgr, H, (w, h))
        msk = cv2.warpPerspective(np.ones((h, w)), H, (w, h))

        # For areas that fall outside the warped background, copy pixels from the source.
        bgr[msk != 1] = src[msk != 1]

        src = Image.fromarray(src)
        bgr = Image.fromarray(bgr)

        return src, bgr


# Refiner and other model classes follow (omitted for brevity)
```

The script defines a HomographicAlignment class that aligns the new background to the source image using ORB features, and a Refiner module that refines the matte with configurable modes (full, sampling, thresholding).
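The three refinement modes amount to different ways of deciding which coarse-resolution pixels are worth re-processing at high resolution. A minimal sketch of that selection logic, assuming an error map of shape (B, 1, H, W); the function name and defaults here are illustrative, not the repository's exact implementation:

```python
import torch


def select_refinement_regions(err, mode="sampling", num_pixels=80_000, threshold=0.1):
    """Return a boolean mask of the same shape as `err` marking which
    locations the Refiner should re-process at high resolution."""
    if mode == "full":
        # Refine everything, regardless of predicted error.
        return torch.ones_like(err, dtype=torch.bool)
    if mode == "thresholding":
        # Refine only where the predicted error exceeds a fixed threshold.
        return err > threshold
    # "sampling": refine a fixed budget of the highest-error locations per image.
    b, _, h, w = err.shape
    k = min(num_pixels, h * w)
    flat = err.view(b, -1)
    idx = flat.topk(k, dim=1).indices
    mask = torch.zeros_like(flat, dtype=torch.bool)
    mask.scatter_(1, idx, torch.ones_like(idx, dtype=torch.bool))
    return mask.view(b, 1, h, w)
```

The trade-off: full is the most accurate but slowest; sampling caps the compute budget; thresholding adapts the amount of work to how uncertain the coarse prediction is.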

Model Utilities

Utility functions such as load_matched_state_dict and _make_divisible, along with encoder/decoder classes for MobileNetV2 and ResNet, are provided to build a DeepLab-style segmentation network, including MobileNetV2Encoder, ResNetEncoder, Decoder, ASPP, and the high-level Base and MattingRefine classes.
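The point of load_matched_state_dict is tolerant checkpoint loading: copy only tensors whose key and shape both match, so a checkpoint with extra or missing entries (for example, one saved from a slightly different backbone) still loads. A hedged sketch under that assumption; the return value is added here for illustration and may differ from the repository's version:

```python
import torch


def load_matched_state_dict(model, state_dict):
    """Copy into `model` only the pretrained tensors whose key and shape
    both match the model's own state dict.

    Returns (number of matched entries, total entries in the model).
    """
    current = model.state_dict()
    matched = 0
    for key in current:
        if key in state_dict and current[key].shape == state_dict[key].shape:
            current[key] = state_dict[key]
            matched += 1
    # `current` still contains every key the model expects, so a strict
    # load succeeds even when the checkpoint was only a partial match.
    model.load_state_dict(current)
    return matched, len(current)
```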

Training‑Free Inference Script

The handle function parses command-line arguments, loads the selected model checkpoint, creates a ZipDataset that reads the source, background, and new background images, and runs the model to obtain the alpha matte (pha), foreground (fgr), error map, and refined output. Results are saved as PNG or JPG files, and a composited image with the new background is generated using Image.alpha_composite.
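The final compositing step is standard alpha blending of the predicted foreground over the new background. A minimal tensor version of that formula (the function name is illustrative; the script itself composites PIL images via Image.alpha_composite):

```python
import torch


def composite_over_new_background(pha, fgr, bgr_new):
    """Alpha-composite the predicted foreground over a new background.

    pha:     alpha matte, shape (B, 1, H, W), values in [0, 1]
    fgr:     predicted foreground, shape (B, 3, H, W)
    bgr_new: replacement background, shape (B, 3, H, W)
    """
    # out = alpha * F + (1 - alpha) * B; pha broadcasts across the 3 channels.
    return pha * fgr + (1 - pha) * bgr_new
```

Where pha is 1 the output is pure foreground, where it is 0 pure background, and fractional values along hair and motion-blurred edges blend the two, which is what makes matting look better than a hard segmentation mask.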

Demonstration

Running the script on sample images produces a visually appealing background‑replaced result, as shown in the article.

Tags: computer vision, deep learning, image matting, background replacement
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
