Python Performance Optimization Techniques: Built‑in Functions, List Comprehensions, Generators, Caching, NumPy, Multiprocessing and More
This article introduces a range of Python performance‑optimization methods—including built‑in functions, list comprehensions, generator expressions, avoiding globals, functools.lru_cache, NumPy, pandas, multiprocessing, Cython, PyPy, and line_profiler—illustrated with clear code examples and a practical image‑processing case study.
Performance optimization is the process of improving a program's execution efficiency, which matters most when processing large datasets, serving real‑time applications, or running in resource‑constrained environments.
1. Use built‑in functions – Functions such as sum(), max(), and min() are implemented in C and run faster than equivalent hand‑written Python loops.
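One way to verify this claim on your own machine is a quick timeit comparison; this is a rough sketch, and absolute timings vary by hardware and Python version:

```python
import timeit

numbers = list(range(1_000))

def manual_sum(values):
    # A pure-Python loop: every iteration runs through the interpreter
    total = 0
    for v in values:
        total += v
    return total

# Both compute the same result, but sum() runs in C
assert manual_sum(numbers) == sum(numbers)

loop_time = timeit.timeit(lambda: manual_sum(numbers), number=1_000)
builtin_time = timeit.timeit(lambda: sum(numbers), number=1_000)
print(f"loop: {loop_time:.4f}s  sum(): {builtin_time:.4f}s")
```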
<code># Using built‑in sum() to calculate the total of a list
numbers = [1, 2, 3, 4, 5]
total = sum(numbers)
print(total) # Output: 15
</code>2. List comprehensions provide a concise and often faster way to build lists compared with traditional for‑loops.
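The "often faster" claim can also be checked with timeit; the sketch below compares the two forms (absolute numbers depend on your machine), with the comprehension avoiding the repeated `append` attribute lookup on every iteration:

```python
import timeit

def with_loop():
    squares = []
    for i in range(1_000):
        squares.append(i ** 2)
    return squares

def with_comprehension():
    # Same result; no per-iteration append lookup
    return [i ** 2 for i in range(1_000)]

assert with_loop() == with_comprehension()

print("loop:         ", timeit.timeit(with_loop, number=1_000))
print("comprehension:", timeit.timeit(with_comprehension, number=1_000))
```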
<code># Traditional loop
squares = []
for i in range(10):
    squares.append(i ** 2)
print(squares)  # Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# List comprehension
squares = [i ** 2 for i in range(10)]
print(squares)  # Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
</code>3. Generator expressions are lazy‑evaluated versions of list comprehensions, generating items on demand and saving memory for large datasets.
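A rough way to see the memory difference is sys.getsizeof; the exact sizes vary by Python version and platform, but the gap is striking:

```python
import sys

n = 100_000
squares_list = [i ** 2 for i in range(n)]  # all n results stored at once
squares_gen = (i ** 2 for i in range(n))   # a small object; nothing computed yet

print(sys.getsizeof(squares_list))  # hundreds of kilobytes
print(sys.getsizeof(squares_gen))   # a small, constant size

total = sum(squares_gen)  # consumes the generator one item at a time
print(total)
```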
<code># Generator expression: items are produced one at a time, only when requested
squares_gen = (i ** 2 for i in range(10))
print(list(squares_gen))  # list() consumes the generator. Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
</code>4. Avoid global variables – Accessing globals is slower than accessing locals; keep variables inside functions whenever possible.
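The cost difference is visible in the bytecode: globals are fetched by dictionary lookup (LOAD_GLOBAL) while locals use an indexed array slot (LOAD_FAST). A small sketch with the standard dis module, noting that exact opcode names can vary slightly across CPython versions:

```python
import dis

x = 10

def use_global():
    return x + 1  # x is fetched with LOAD_GLOBAL (a dictionary lookup)

def use_local():
    y = 10
    return y + 1  # y is fetched with LOAD_FAST (an indexed array slot)

print([ins.opname for ins in dis.get_instructions(use_global)])
print([ins.opname for ins in dis.get_instructions(use_local)])
```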
<code># Global variable example
x = 10

def global_var():
    return x + 1

print(global_var())  # Output: 11

# Local variable example
def local_var():
    y = 10
    return y + 1

print(local_var())  # Output: 11
</code>5. Use functools.lru_cache to memoize function results, which is especially useful for recursive functions.
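Beyond speeding up recursion, the cache can be inspected with cache_info() to confirm it is actually being hit; a minimal sketch, where the hypothetical slow_square stands in for any expensive function:

```python
import functools

@functools.lru_cache(maxsize=128)
def slow_square(n):
    # Stand-in for an expensive computation
    return n * n

slow_square(4)  # miss: computed and stored
slow_square(4)  # hit: returned straight from the cache
print(slow_square.cache_info())  # Output: CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)
```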
<code>import functools

@functools.lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(30))  # Output: 832040
</code>6. Leverage NumPy and pandas for efficient numerical computation and data manipulation.
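The NumPy example below covers array arithmetic; for tabular data, pandas offers the same vectorized style on named columns. A minimal sketch, with illustrative column names:

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [2, 3, 4]})
# Vectorized column arithmetic: no Python-level loop over rows
df["total"] = df["price"] * df["qty"]
print(df["total"].tolist())  # Output: [20.0, 60.0, 120.0]
```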
<code>import numpy as np
# Compute squares of an array using NumPy
array = np.array([1, 2, 3, 4, 5])
squared = array ** 2
print(squared) # Output: [ 1 4 9 16 25]
</code>7. Use the multiprocessing module to run code in parallel across multiple CPU cores.
<code>import multiprocessing

def worker(num):
    return num * num

if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(worker, [1, 2, 3, 4, 5])
    print(results)  # Output: [1, 4, 9, 16, 25]
</code>8. Compile performance‑critical sections with Cython to get C‑level speed while writing Python‑like code.
<code># example.pyx
cpdef int square(int x):  # cpdef (not cdef) makes the function callable from Python
    return x * x

# setup.py
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("example.pyx"))

# Build the extension in place before importing:
#   python setup.py build_ext --inplace
import example
print(example.square(5))  # Output: 25
</code>9. Run code on PyPy, a JIT‑enabled Python interpreter that can dramatically speed up many long‑running workloads.
<code># Install PyPy (Debian/Ubuntu package shown; see pypy.org for other platforms)
sudo apt-get install pypy3
# Execute a script with PyPy
pypy3 my_script.py
</code>10. Profile with line_profiler to locate bottlenecks and focus optimization efforts.
<code># Install line_profiler
pip install line_profiler

# The @profile decorator is injected by kernprof at run time.
# Run this script with:  kernprof -l -v my_script.py
@profile
def my_function():
    a = [1] * 1000000
    b = [2] * 1000000
    del a
    return b

my_function()
</code>Practical case: Optimizing image processing – Using the Pillow library together with multiprocessing to convert images to grayscale, resize them, and save the results in parallel.
<code>from PIL import Image
import os
import multiprocessing

def process_image(image_path):
    # Open image
    image = Image.open(image_path)
    # Convert to grayscale
    gray_image = image.convert('L')
    # Resize image
    resized_image = gray_image.resize((100, 100))
    # Save processed image
    output_path = f"processed_{os.path.basename(image_path)}"
    resized_image.save(output_path)
    print(f"Processed {image_path} and saved to {output_path}")

def main():
    image_paths = ["image1.jpg", "image2.jpg", "image3.jpg"]
    # Use multiprocessing to handle images concurrently
    with multiprocessing.Pool(processes=4) as pool:
        pool.map(process_image, image_paths)

if __name__ == "__main__":
    main()
</code>Conclusion – By using built‑in functions, list comprehensions, and generator expressions, avoiding globals, caching with functools.lru_cache, leaning on NumPy/pandas, parallelizing with multiprocessing, compiling hot spots with Cython, running on PyPy, and locating bottlenecks with profiling tools like line_profiler, developers can significantly improve the speed and efficiency of Python programs, as demonstrated in the image‑processing example.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.