Fundamentals 23 min read

Understanding CPython's Memory Management (Python 2.7)

This article explains how CPython implements its own memory‑management scheme—including the layered allocation model, pool and arena structures, block size classes, and reference‑count‑based reclamation—by dissecting the source code of Python 2.7’s obmalloc module.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Understanding CPython's Memory Management (Python 2.7)

Memory management is the cornerstone of any C/C++‑based system program, and the native Python interpreter is written in C with its own allocation scheme. This article explores the implementation details of that scheme in Python 2.7, showing how the runtime organizes memory, allocates objects, and reclaims garbage.

Overview : The memory hierarchy can be visualised as a cake. The bottom layer (‑2) is the physical storage (RAM and secondary storage). Layer ‑1 is the operating‑system kernel that manages physical memory. Layer 0 requests memory from the OS using the standard malloc library. Above layer 0 sit the Python memory‑management layers.

At layer 1 the PyMem API abstracts away platform differences in malloc / free . For example, PyMem_MALLOC(0) is transformed into malloc(1) to avoid allocating zero bytes. The relevant macros are:

<code>#define PyMem_MALLOC(n)        ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
                : malloc((n) ? (n) : 1))
#define PyMem_REALLOC(p, n)    ((size_t)(n) > (size_t)PY_SSIZE_T_MAX  ? NULL \
                : realloc((p), (n) ? (n) : 1))
#define PyMem_FREE        free
</code>

Pool : A pool is a 4 KB block (the size of a system memory page). When an object needs memory, a block is cut from a pool. Pool size classes are fixed: 8, 16, 32, … 256 bytes. Requests larger than 256 bytes bypass the pool and are satisfied directly by malloc . The size‑class table looks like:

<code>Request in bytes     Size of allocated block      Size class idx
----------------------------------------------------------------
        1-8                     8                       0
        9-16                   16                       1
       17-24                   24                       2
       25-32                   32                       3
       33-40                   40                       4
       41-48                   48                       5
       49-56                   56                       6
       57-64                   64                       7
       65-72                   72                       8
        ...                   ...                     ...
      241-248                 248                      30
      249-256                 256                      31
</code>

A pool can be in three states:

used : some blocks have been allocated but free blocks remain.

full : all blocks are allocated.

empty : no blocks have been allocated yet.

The pool header stores bookkeeping information:

<code>struct pool_header {
    union { block *_padding; uint count; } ref;   /* number of allocated blocks */
    block *freeblock;                            /* head of pool's free list */
    struct pool_header *nextpool;               /* next pool of this size class */
    struct pool_header *prevpool;               /* previous pool */
    uint arenaindex;                            /* index into arenas array */
    uint szidx;                                 /* block size‑class index */
    uint nextoffset;                            /* bytes to virgin block */
    uint maxnextoffset;                         /* largest valid nextoffset */
};
</code>

Pool initialization (simplified) is performed as follows:

<code>#define ALIGNMENT_SHIFT         3
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
#define ALIGNMENT               8
#define ALIGNMENT_MASK          (ALIGNMENT - 1)
#define ROUNDUP(x)              (((x) + ALIGNMENT_MASK) &amp; ~ALIGNMENT_MASK)
#define POOL_OVERHEAD           ROUNDUP(sizeof(struct pool_header))

pool-&gt;szidx = size;               /* size is the size‑class index */
size = INDEX2SIZE(size);          /* convert index to actual block size */
bp = (block *)pool + POOL_OVERHEAD;   /* start of first usable block */
pool-&gt;nextoffset = POOL_OVERHEAD + (size &lt;&lt; 1);
pool-&gt;maxnextoffset = POOL_SIZE - size;
pool-&gt;freeblock = bp + size;
*(block **)(pool-&gt;freeblock) = NULL;
UNLOCK();
return (void *)bp;
</code>

Used‑pools array : Python keeps a special array usedpools that indexes pools by their size‑class. The array is populated with pointers to the pool headers and is used to locate a suitable pool for a given allocation request.

<code>#define SMALL_REQUEST_THRESHOLD 256
#define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD / ALIGNMENT)  /* 32 */
#define PTA(x)  ((poolp )((uchar *)&amp;(usedpools[2*(x)]) - 2*sizeof(block *)))
#define PT(x)   PTA(x), PTA(x)
static poolp usedpools[2 * ((NB_SMALL_SIZE_CLASSES + 7) / 8) * 8] = {
    PT(0), PT(1), PT(2), PT(3), PT(4), PT(5), PT(6), PT(7)
    /* ... up to PT(31) ... */
};
</code>

Arena : When no pool of the required size class is available, Python allocates a 256 KB chunk called an arena and slices it into 4 KB pools. The arena structure is:

<code>struct arena_object {
    uptr address;               /* base address of the 256 KB memory */
    block *pool_address;        /* start of the next virgin pool */
    uint nfreepools;             /* remaining free pools */
    uint ntotalpools;           /* total pools (normally 64) */
    struct pool_header *freepools; /* linked list of completely free pools */
    struct arena_object *nextarena;
    struct arena_object *prevarena;
};
static struct arena_object *arenas = NULL;          /* array of arena objects */
static struct arena_object *unused_arena_objects = NULL; /* free arena slots */
static struct arena_object *usable_arenas = NULL;   /* arenas with free pools */
</code>

Arena allocation grows the arenas array (initially 16 slots) and links newly created arenas into unused_arena_objects . When a pool is needed, an arena supplies a 4 KB slice, which is then turned into a pool and inserted into usedpools .

<code>if (unused_arena_objects == NULL) {
    uint i, numarenas;
    size_t nbytes;
    numarenas = maxarenas ? maxarenas &lt;&lt; 1 : INITIAL_ARENA_OBJECTS; /* 16, then double */
    nbytes = numarenas * sizeof(*arenas);
    arenaobj = (struct arena_object *)realloc(arenas, nbytes);
    if (arenaobj == NULL) return NULL;
    arenas = arenaobj;
    for (i = maxarenas; i < numarenas; ++i) {
        arenas[i].address = 0;
        arenas[i].nextarena = i &lt; numarenas-1 ? &amp;arenas[i+1] : NULL;
    }
    unused_arena_objects = &amp;arenas[maxarenas];
    maxarenas = numarenas;
}
</code>

When a pool becomes completely free, it is moved from usedpools to the arena’s freepools list. If an arena’s nfreepools reaches ntotalpools , the entire arena is returned to the operating system:

<code>if (nf == ao-&gt;ntotalpools) {
    /* unlink from usable_arenas */
    if (ao-&gt;prevarena == NULL) usable_arenas = ao-&gt;nextarena;
    else ao-&gt;prevarena-&gt;nextarena = ao-&gt;nextarena;
    if (ao-&gt;nextarena) ao-&gt;nextarena-&gt;prevarena = ao-&gt;prevarena;
    /* recycle arena slot */
    ao-&gt;nextarena = unused_arena_objects;
    unused_arena_objects = ao;
    free((void *)ao-&gt;address);
    ao-&gt;address = 0;
    --narenas_currently_allocated;
    UNLOCK();
    return;
}
</code>

Memory reclamation : Every Python object carries a reference count ( refcnt ). When it drops to zero, PyObject_Free returns the memory to its pool. Depending on the pool’s current state (used, full, empty) the function updates the pool’s free‑block list, possibly moves the pool back into usedpools , or transfers it to the arena’s freepools list.

<code>assert(pool-&gt;ref.count &gt; 0);
*(block **)p = lastfree = pool-&gt;freeblock;
pool-&gt;freeblock = (block *)p;
if (lastfree) {
    if (--pool-&gt;ref.count != 0) { UNLOCK(); return; }
    /* pool becomes empty – move to arena's freepools */
    next = pool-&gt;nextpool; prev = pool-&gt;prevpool;
    next-&gt;prevpool = prev; prev-&gt;nextpool = next;
    ao = &amp;arenas[pool-&gt;arenaindex];
    pool-&gt;nextpool = ao-&gt;freepools; ao-&gt;freepools = pool;
    ++ao-&gt;nfreepools;
    UNLOCK();
    return;
}
/* other cases handle transition from full→used or used→full */
</code>

The overall principle of CPython’s memory manager is simple: never hold onto memory that can be released, and return it to the system as early as possible. In Python 3 the maximum block size grew to 512 bytes (size‑class index up to 63), but the overall architecture of pools, arenas, and reference‑count‑driven reclamation remains unchanged.

Memory ManagementPythonGarbage CollectionCPythonObmalloc
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.