Simple Techniques to Accelerate Python For‑Loops: From 1.3× to 970× Speed‑ups
This article collects practical Python techniques for speeding up for‑loops: list comprehensions, computing lengths outside the loop, sets for membership tests, skipping irrelevant iterations, inlining functions, generators, map, memoization, vectorization, filterfalse, and join. Each technique is benchmarked against a baseline version with Python's timeit module, and the reported speed‑ups range from a modest 1.3× to 970×.
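The article names timeit but does not show its harness, so the following is only a sketch of how such a comparison might be set up. The helpers best_time and speedup and the two sample functions are assumptions made for illustration, not the author's code.

```python
import timeit


def best_time(func, *args, repeat=5, number=100):
    """Return the best observed time per call, in seconds."""
    timer = timeit.Timer(lambda: func(*args))
    # The minimum over several repeats is the least noisy estimate.
    return min(timer.repeat(repeat=repeat, number=number)) / number


def speedup(baseline, improved, *args, **kwargs):
    """Return how many times faster `improved` ran than `baseline`."""
    return best_time(baseline, *args, **kwargs) / best_time(improved, *args, **kwargs)


# Two hypothetical functions to compare.
def squares_loop(numbers):
    output = []
    for n in numbers:
        output.append(n * n)
    return output


def squares_comprehension(numbers):
    return [n * n for n in numbers]


if __name__ == "__main__":
    data = list(range(1_000))
    ratio = speedup(squares_loop, squares_comprehension, data)
    print(f"speed-up: {ratio:.2f}x")
```

The ratio depends on the machine, the Python version, and the input size, so the printed figure will differ from run to run.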
Simple Methods
1. List Comprehension
# Baseline version (inefficient way)
# Calculating the power of numbers
# Without using list comprehension
def test_01_v0(numbers):
    output = []
    for n in numbers:
        output.append(n**2.5)
    return output

# Improved version (using list comprehension)
def test_01_v1(numbers):
    output = [n**2.5 for n in numbers]
    return output

Result: 2× speed‑up.
2. Compute Length Outside the Loop
# Baseline version (len() called in the loop header)
def test_02_v0(numbers):
    output_list = []
    for i in range(len(numbers)):
        output_list.append(i*2)
    return output_list

# Improved version (length computed once, before the loop)
def test_02_v1(numbers):
    my_list_length = len(numbers)
    output_list = []
    for i in range(my_list_length):
        output_list.append(i*2)
    return output_list

Result: 1.6× speed‑up.
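In the for-loop form, range(len(numbers)) evaluates len only once, before the first iteration. Hoisting matters more when the length is re-read on every pass, for example in a while condition. The sketch below illustrates that case. The function names are hypothetical and are not part of the article's benchmark.

```python
def double_indices_len_in_loop(numbers):
    output = []
    i = 0
    while i < len(numbers):  # len() is called again on every iteration
        output.append(i * 2)
        i += 1
    return output


def double_indices_len_hoisted(numbers):
    length = len(numbers)  # len() is called exactly once
    output = []
    i = 0
    while i < length:
        output.append(i * 2)
        i += 1
    return output
```

Both functions return the same list. Hoisting is only safe when the loop body does not change the length of numbers.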
3. Use Set for Membership Tests
# Baseline version (nested lookups using for loop)
def test_03_v0(list_1, list_2):
    common_items = []
    for item in list_1:
        if item in list_2:  # linear scan of list_2 for every item
            common_items.append(item)
    return common_items

# Improved version (sets to replace nested lookups)
# Note: this returns a set, so the order and duplicates of list_1 are not preserved
def test_03_v1(list_1, list_2):
    s_1 = set(list_1)
    s_2 = set(list_2)
    common_items = s_1.intersection(s_2)
    return common_items

Result: 498× speed‑up for nested loops.
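When the caller needs a list in the original order, as the baseline returns, a possible middle ground is to convert only the second list to a set and keep the loop over the first. This is a sketch under that assumption. The name common_items_ordered is hypothetical, and the items must be hashable.

```python
def common_items_ordered(list_1, list_2):
    lookup = set(list_2)  # built once; membership tests are O(1) on average
    # Keeps the order and the duplicates of list_1, like the baseline loop.
    return [item for item in list_1 if item in lookup]
```

For example, common_items_ordered([3, 1, 2, 3], [3, 2]) keeps both occurrences of 3 and returns [3, 2, 3].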
4. Skip Irrelevant Iterations
# Inefficient version
def function_do_something(numbers):
    for n in numbers:
        square = n*n
        if square % 2 == 0:
            return square
    return None

# Improved version
def function_do_something_v1(numbers):
    even_numbers = [i for i in numbers if i % 2 == 0]
    for n in even_numbers:
        square = n*n
        return square
    return None

Result: ~1.94× speed‑up.
5. Code Merging (Inlining Functions)
# Baseline version calling is_prime inside a loop
def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5)+1):
        if n % i == 0:
            return False
    return True

def test_05_v0(n):
    count = 0
    for i in range(2, n+1):
        if is_prime(i):
            count += 1
    return count

# Improved version with inlined logic
def test_05_v1(n):
    count = 0
    for i in range(2, n+1):
        if i <= 1:
            continue
        for j in range(2, int(i**0.5)+1):
            if i % j == 0:
                break
        else:
            # for-else: runs only when the inner loop finished without a break
            count += 1
    return count

Result: ~1.35× speed‑up.
6. Generators
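# Baseline (list‑based Fibonacci)
def test_08_v0(n):
    if n <= 1:
        return n
    f_list = [0, 1]
    for i in range(2, n+1):
        f_list.append(f_list[i-1] + f_list[i-2])
    return f_list[n]

# Improved (generator‑based)
# Calling test_08_v1(n) returns a generator; values are computed only when it is iterated
def test_08_v1(n):
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b

Result: ~22× speed‑up. Note that the two versions are not equivalent as written: the baseline returns the n‑th Fibonacci number, while the generator lazily yields the first n numbers, F(0) through F(n-1), and does no work until it is consumed.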
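One way to get the same value as the list-based baseline from a generator is to iterate it to the end and keep only the latest value. The sketch below assumes n is a non-negative integer. The names fibonacci_numbers and nth_fibonacci are hypothetical.

```python
def fibonacci_numbers(n):
    """Yield F(0) through F(n), one value at a time."""
    a, b = 0, 1
    for _ in range(n + 1):
        yield a
        a, b = b, a + b


def nth_fibonacci(n):
    value = 0
    # Only the most recent value is kept, so memory use stays constant.
    for value in fibonacci_numbers(n):
        pass
    return value
```

Timing nth_fibonacci against the list-based baseline gives a like-for-like comparison, because both then compute every number up to F(n).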
7. map() Function
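# Baseline using explicit loop
def some_function_X(x):
    return x**2

def test_09_v0(numbers):
    output = []
    for i in numbers:
        output.append(some_function_X(i))
    return output

# Improved using map
def test_09_v1(numbers):
    output = map(some_function_X, numbers)
    return output

Result: ~970× speed‑up. As written, most of this figure comes from laziness: in Python 3, map returns an iterator and calls some_function_X only when that iterator is consumed, so test_09_v1 returns before any element has been computed.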
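For a comparison in which both versions return the same list, the map object can be wrapped in list(). This is a sketch of that variant. The names squares_with_loop and squares_with_map are hypothetical, and no speed-up figure is claimed for it.

```python
def some_function_X(x):
    return x ** 2


def squares_with_loop(numbers):
    output = []
    for i in numbers:
        output.append(some_function_X(i))
    return output


def squares_with_map(numbers):
    # list() forces evaluation, so the work is actually done inside this call.
    return list(map(some_function_X, numbers))
```

When the results are consumed one by one later on, returning the bare map object remains a reasonable choice. It defers the work rather than removing it.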
8. Memoization with lru_cache
import functools

@functools.lru_cache()
def fibonacci_v2(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return fibonacci_v2(n-1) + fibonacci_v2(n-2)

def _test_10_v1(numbers):
    output = []
    for i in numbers:
        output.append(fibonacci_v2(i))
    return output

Result: ~58× speed‑up.
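The section shows only the cached version. For context, the sketch below puts it next to an uncached recursive function. The name fibonacci_uncached is an assumption about what a baseline could look like, not the article's actual baseline, and fibonacci_cached is likewise a hypothetical name.

```python
import functools


def fibonacci_uncached(n):
    # Recomputes the same subproblems many times (exponential number of calls).
    if n < 2:
        return n
    return fibonacci_uncached(n - 1) + fibonacci_uncached(n - 2)


@functools.lru_cache(maxsize=None)
def fibonacci_cached(n):
    # Each distinct n is computed once; repeated calls are served from the cache.
    if n < 2:
        return n
    return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)


if __name__ == "__main__":
    assert fibonacci_uncached(20) == fibonacci_cached(20)
    # cache_info() reports hits, misses, maxsize and the current cache size.
    print(fibonacci_cached.cache_info())
```

The cache persists between calls, so a benchmark that calls the cached function repeatedly mostly measures cache lookups after the first run. Calling fibonacci_cached.cache_clear() between runs resets it.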
9. Vectorization with NumPy
import numpy as np

# Baseline using an explicit loop
def test_11_v0(n):
    output = 0
    for i in range(n):
        output += i
    return output

# Improved using NumPy vectorization
def test_11_v1(n):
    output = np.sum(np.arange(n))
    return output

Result: ~28× speed‑up.
10. Avoid Creating Intermediate Lists (filterfalse)
# Baseline using filter + list conversion
def test_12_v0(numbers):
    filtered_data = []
    for i in numbers:
        filtered_data.extend(list(filter(lambda x: x % 5 == 0, range(1, i**2))))
    return filtered_data

# Improved using itertools.filterfalse
from itertools import filterfalse

def test_12_v1(numbers):
    filtered_data = []
    for i in numbers:
        # extend() consumes the iterator directly, so no temporary list is built
        filtered_data.extend(filterfalse(lambda x: x % 5 != 0, range(1, i**2)))
    return filtered_data

Result: ~131× speed‑up and lower memory usage.
11. Efficient String Concatenation
# Baseline using +=
def test_13_v0(l_strings):
    output = ""
    for a_str in l_strings:
        output += a_str
    return output

# Improved using list + join
def test_13_v1(l_strings):
    output_list = []
    for a_str in l_strings:
        output_list.append(a_str)
    return "".join(output_list)

Result: ~1.5× speed‑up. join builds the result in a single O(n) pass, whereas repeated += can degrade to O(n²) because each concatenation may copy the string built so far.
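Because str.join accepts any iterable of strings, the helper list is not strictly needed when the pieces are already available. This is a minimal sketch with a hypothetical function name.

```python
def concat_with_join(l_strings):
    # join sizes the result once and copies each piece a single time.
    return "".join(l_strings)
```

The append-then-join pattern is still useful when each piece has to be computed or filtered inside the loop before it is added.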
Conclusion
The article covers eleven practical techniques for speeding up Python for‑loops, with reported gains ranging from a modest 1.3× to 970×. Each of them involves a trade‑off between readability and raw speed.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.