
Common Python Libraries for Computer Vision Projects

This article introduces and compares ten widely used Python libraries for computer vision: Pillow, OpenCV, Mahotas, Scikit-Image, TensorFlow Image (tf.image), PyTorch Vision (torchvision), SimpleCV, Imageio, Albumentations, and timm. It highlights their features and typical use cases, with short code examples.

Python Programming Learning Circle

In this article we compile a list of commonly used Python libraries for computer vision projects, which can help beginners and practitioners get started with image processing and advanced machine learning tasks.

1. PIL/Pillow

Pillow, the maintained fork of the original Python Imaging Library (PIL), is a user-friendly library that supports many image formats, which makes it a common choice for everyday image handling. It can open, manipulate, and save images, and it covers basic operations such as cropping, resizing, rotating, and color conversion, as well as drawing text and shapes. Pillow is also the default image backend used by torchvision.
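The operations listed above can be sketched as follows. This is a minimal illustration rather than a recommended pipeline: the image is generated in memory so that no file is needed, and the sizes, colors, and text are arbitrary values chosen for the example.

```python
import io

from PIL import Image, ImageDraw

# Build a small synthetic image so the example needs no file on disk.
img = Image.new("RGB", (200, 100), color=(30, 144, 255))

# crop() takes a (left, upper, right, lower) box.
cropped = img.crop((50, 25, 150, 75))

# resize() takes a (width, height) tuple.
resized = cropped.resize((50, 25))

# Rotate by 90 degrees; expand=True grows the canvas to fit the result.
rotated = resized.rotate(90, expand=True)

# Convert to 8-bit grayscale.
gray = rotated.convert("L")

# Draw a rectangle and some text (default bitmap font) on the original.
draw = ImageDraw.Draw(img)
draw.rectangle((10, 10, 190, 90), outline=(255, 255, 255))
draw.text((20, 40), "Pillow", fill=(255, 255, 255))

# Encode as PNG into an in-memory buffer; a file path works the same way.
buffer = io.BytesIO()
img.save(buffer, format="PNG")

print(cropped.size, resized.size, rotated.size, gray.mode)
```

Passing a file name to `save()` instead of a buffer lets Pillow infer the output format from the extension.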

2. OpenCV (Open Source Computer Vision Library)

OpenCV is one of the most popular image-processing libraries, originally developed by Intel and widely adopted in computer-vision applications. It offers a vast collection of algorithms and is highly optimized for real-time use. Note that OpenCV loads color images in BGR channel order, so when mixing it with libraries that expect RGB you may need to convert:

<code>cv2.cvtColor(image, cv2.COLOR_BGR2RGB)</code>

3. Mahotas

Mahotas provides a set of high‑performance C++‑backed functions for image processing and computer vision, supporting multithreading. It includes morphological operations, segmentation, and other basic tasks, offering a simpler API than OpenCV while maintaining comparable speed.

Example:

<code># import using ``mh`` abbreviation which is common:
import mahotas as mh

# Load one of the demo images
im = mh.demos.load('nuclear')

# Automatically compute a threshold
T_otsu = mh.thresholding.otsu(im)

# Label the thresholded image
seeds, nr_regions = mh.label(im > T_otsu)

# Call seeded watershed to expand the threshold
labeled = mh.cwatershed(im.max() - im, seeds)
</code>

Another simple example using mahotas.distance to compute a distance map:

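<code>import matplotlib.pyplot as plt
import numpy as np
import mahotas as mh

# Boolean image: True everywhere except two rectangular regions
f = np.ones((256, 256), bool)
f[200:, 240:] = False
f[128:144, 32:48] = False

# Squared Euclidean distance from each pixel to the nearest False pixel
dmap = mh.distance(f)
plt.imshow(dmap)
plt.show()
</code>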

4. Scikit‑Image

Built on top of NumPy and SciPy, Scikit-Image offers a broad collection of image-processing algorithms. It represents images as NumPy arrays, supports multi-dimensional images, and integrates well with the rest of the scientific Python stack.

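<code>import matplotlib.pyplot as plt
from skimage import data, filters

image = data.coins()          # built-in grayscale sample image
edges = filters.sobel(image)  # Sobel edge magnitude

# Recent scikit-image releases deprecate io.imshow, so display with matplotlib
plt.imshow(edges, cmap="gray")
plt.show()
</code>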

5. TensorFlow Image

TensorFlow Image (the tf.image module) provides operations for image decoding, encoding, cropping, resizing, and transformation, and it can use GPU acceleration on large datasets. In training pipelines it is often paired with the Keras utility tf.keras.utils.image_dataset_from_directory, which loads a directory of images as a batched dataset:

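<code>import tensorflow as tf

# Placeholder path (an assumption): a directory with one subfolder per class
data_dir = "path/to/images"

batch_size = 32
img_height = 180
img_width = 180

# Use 80% of the images for training; the seed keeps the split reproducible
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
</code>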
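The tf.image operations themselves can also be called directly on tensors. The following is a minimal sketch on a random tensor, assuming TensorFlow 2.x. The input size and the crop fraction are arbitrary example values.

```python
import tensorflow as tf

# Random float image of shape (height, width, channels); no file needed.
image = tf.random.uniform((200, 300, 3))

# Resize to a fixed (height, width).
resized = tf.image.resize(image, (180, 180))

# Mirror the image horizontally.
flipped = tf.image.flip_left_right(resized)

# Keep the central 50% of the image along each spatial dimension.
cropped = tf.image.central_crop(flipped, central_fraction=0.5)

# Collapse the three color channels into a single luminance channel.
gray = tf.image.rgb_to_grayscale(cropped)

print(tuple(resized.shape), tuple(cropped.shape), tuple(gray.shape))
```

Functions like these are typically applied per element with `Dataset.map` when building an input pipeline.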

6. PyTorch Vision

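Part of the PyTorch ecosystem, PyTorch Vision (torchvision) provides datasets, model architectures, image transforms, and image and video I/O utilities for vision-related machine-learning tasks. The example below reads metadata from a video file: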

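<code>import torchvision

video_path = "path to a test video"

# Open the video stream of the file
reader = torchvision.io.VideoReader(video_path, "video")

# Metadata is a dict keyed by stream type
reader_md = reader.get_metadata()
print(reader_md["video"]["fps"])

# Explicitly select the first video stream
reader.set_current_stream("video:0")
</code>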

7. SimpleCV

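SimpleCV builds on OpenCV, PIL, and NumPy to offer a simple API for loading, processing, and analyzing images, and it is aimed at beginners. The project was written for Python 2 and has seen little maintenance in recent years, so check compatibility with your environment before relying on it.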

<code>import SimpleCV
camera = SimpleCV.Camera()
image = camera.getImage()
image.show()
</code>

8. Imageio

Imageio provides a simple API for reading and writing a wide range of image and video formats.

<code>import imageio.v3 as iio
im = iio.imread('imageio:chelsea.png')  # read a standard image
im.shape  # (300, 451, 3)
iio.imwrite('chelsea.jpg', im)  # convert to jpg
</code>

9. Albumentations

Albumentations is a fast and flexible library for image augmentation, supporting masks and bounding boxes.

<code>import albumentations as A
import cv2

transform = A.Compose([
    A.RandomCrop(width=256, height=256),
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
])

image = cv2.imread("image.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
transformed = transform(image=image)
transformed_image = transformed["image"]
</code>

10. timm

timm is a PyTorch model library offering a large collection of pretrained computer‑vision models.

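<code>import timm
import torch

# Weights are randomly initialized by default;
# pass pretrained=True to download pretrained weights
model = timm.create_model('resnet34')
model.eval()

x = torch.randn(1, 3, 224, 224)
model(x).shape  # torch.Size([1, 1000])
</code>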

Summary

Whether you are just starting with basic image processing or exploring advanced machine‑learning models, these libraries provide essential tools for a wide range of computer‑vision tasks.
