
Understanding Python Garbage Collection and Memory Leaks

This article explains Python's garbage collection mechanisms, demonstrates how reference counting works, shows how to detect and fix memory leaks with the gc module, and walks through a memory leak traced to urllib2, using code examples and diagrams.

Baidu Intelligent Testing

A team leader assigns Xiao Zhang the task of extracting data from a batch of web pages. The script runs, but the process later disappears without reporting any error, so Xiao Zhang asks senior developer Lao Wang for help.

Lao Wang checks the system logs and discovers the process was killed, likely due to a memory leak, and asks what language the script was written in.

When Xiao Zhang says it was written in Python, Lao Wang explains that Python relies primarily on reference counting for garbage collection. He then introduces the concept of GC (Garbage Collection) and the common algorithms: reference counting, mark‑sweep, copying, and generational collection.

The standard library function sys.getrefcount() reports an object's reference count. An example shows that passing a variable to getrefcount() itself creates a temporary reference, so the reported count is one higher than the number of references the program otherwise holds.
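A minimal sketch of that behaviour, assuming CPython. The variable names are illustrative, and the absolute numbers can differ between interpreter versions, so the sketch compares counts rather than relying on a specific value.

```python
import sys

data = []                             # the name `data` holds one reference
base = sys.getrefcount(data)          # already includes the temporary argument reference

alias = data                          # bind a second name to the same list
after_alias = sys.getrefcount(data)   # one more reference than before

del alias                             # removing the name releases that reference
after_del = sys.getrefcount(data)     # back to the starting count

print(base, after_alias, after_del)
```

The point to take away is the difference between the three readings: each new name adds one reference, and each deleted name removes one.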

The article then describes the gc module, which provides utilities for detecting memory leaks. For example, the gc.garbage list holds objects that the collector found unreachable but could not reclaim.
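As a rough orientation, the following sketch queries a few of the gc module's standard inspection functions. It only reads collector state and triggers one manual collection; the printed values depend on the interpreter and on what the process has done so far.

```python
import gc

enabled = gc.isenabled()          # whether automatic cyclic collection is on
thresholds = gc.get_threshold()   # allocation thresholds for the generations
counts = gc.get_count()           # current per-generation counters

unreachable = gc.collect()        # force a full collection; returns objects found unreachable

print("enabled:", enabled)
print("thresholds:", thresholds)
print("counts before collect:", counts)
print("unreachable objects found:", unreachable)
print("uncollectable objects:", gc.garbage)
```

In a healthy process gc.garbage is normally empty; entries appearing there are a hint that something could not be freed.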

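To illustrate that GC is not omnipotent, a code snippet calls gc.set_debug(gc.DEBUG_LEAK) to enable leak debugging, runs gc.collect(), and then inspects gc.garbage. The list contains two objects, lz and ow, that reference each other. Because both define __del__(), the collector cannot tell which finalizer is safe to run first, so in Python 2 (and in Python 3 before 3.4) it refuses to free either object and leaves them in gc.garbage.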
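The original snippet is not reproduced here, so the following is only a sketch of the same idea. The Node class and make_cycle function are hypothetical; only the names lz and ow come from the article. On current Python 3, cycles are reclaimed even when the objects define __del__(), so this sketch uses gc.DEBUG_SAVEALL (one of the flags that gc.DEBUG_LEAK combines, without the per-object debug printing) to keep unreachable objects in gc.garbage for inspection.

```python
import gc

class Node:
    """Hypothetical class; the article's own classes are not shown."""
    def __init__(self, name):
        self.name = name
        self.partner = None

def make_cycle():
    lz = Node("lz")
    ow = Node("ow")
    lz.partner = ow
    ow.partner = lz
    # On return the local names vanish, but each object still references the other.

gc.collect()                       # clear unrelated garbage first
gc.set_debug(gc.DEBUG_SAVEALL)     # save unreachable objects in gc.garbage instead of freeing them
make_cycle()
gc.collect()

# Pick out the objects from our cycle among everything that was saved.
leaked_names = sorted(obj.name for obj in gc.garbage if isinstance(obj, Node))
print(leaked_names)

# Restore normal behaviour and release the saved objects.
gc.set_debug(0)
del gc.garbage[:]
gc.collect()
```

Inspecting the entries of gc.garbage this way shows which objects participate in a cycle, which is usually the first step toward deciding where to break it.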

Resolving the leak simply requires breaking the circular reference.
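One way to break such a cycle is to clear one of the two references before letting go of the objects. The sketch below assumes CPython and uses a hypothetical Node class; a weak reference serves as a probe to check whether the object is still alive, and the cyclic collector is switched off temporarily so that only reference counting is at work.

```python
import gc
import weakref

class Node:
    """Hypothetical class used only for this demonstration."""
    def __init__(self, name):
        self.name = name
        self.partner = None

def survives_del(break_cycle):
    """Return True if the lz object is still alive after its names are deleted."""
    lz, ow = Node("lz"), Node("ow")
    lz.partner = ow
    ow.partner = lz               # lz -> ow -> lz forms a reference cycle
    probe = weakref.ref(lz)       # observes lz without keeping it alive
    if break_cycle:
        ow.partner = None         # cut the back-reference before letting go
    del lz, ow
    return probe() is not None

was_enabled = gc.isenabled()
gc.disable()                      # leave only reference counting active
leaky = survives_del(break_cycle=False)
fixed = survives_del(break_cycle=True)
gc.collect()                      # reclaim the cycle left behind by the leaky run
if was_enabled:
    gc.enable()

print("alive with cycle intact:", leaky)
print("alive after breaking cycle:", fixed)
```

Holding the back-reference through weakref.ref or weakref.proxy instead of a normal attribute is another common option, since weak references do not raise the reference count.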

Later, Lao Wang reviews the urllib2 source and identifies a circular reference involving the HttpResponse object (r) as the cause of the leak. The fix is the same as before: break the cycle.
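The urllib2 source is not quoted in this article, so the following is not its code. It is only an illustration, under the assumption that the cycle arises from an object storing a reference back to itself, here by assigning one of its own bound methods as an attribute. FakeResponse and its recv attribute are hypothetical names.

```python
import gc
import weakref

class FakeResponse:
    """Hypothetical stand-in for a response object; not urllib2 code."""
    def read(self):
        return b""

was_enabled = gc.isenabled()
gc.disable()                       # observe reference counting on its own

# Case 1: the object stores its own bound method, creating a self-cycle.
r = FakeResponse()
r.recv = r.read                    # the bound method references r, and r references the bound method
probe = weakref.ref(r)
del r
alive_with_cycle = probe() is not None
gc.collect()                       # the cyclic collector is needed to reclaim it
alive_after_collect = probe() is not None

# Case 2: the attribute is removed first, so no cycle remains.
r = FakeResponse()
r.recv = r.read
probe = weakref.ref(r)
del r.recv                         # break the cycle explicitly
del r
alive_after_break = probe() is not None

if was_enabled:
    gc.enable()

print(alive_with_cycle, alive_after_collect, alive_after_break)
```

If a response object is left in such a cycle on every request, memory can grow steadily in a long-running scraper until the cycle is broken or collected.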

The article concludes with a light‑hearted exchange between Xiao Zhang and Lao Wang, emphasizing the importance of understanding Python's memory management.

Tags: Python, Garbage Collection, memory-leak, Reference Counting, gc module