Deep Dive into the Internal Working of malloc and the ptmalloc Memory Allocator
This article explains how the glibc malloc implementation (ptmalloc) manages memory by using arenas, chunks, and various bin structures such as fastbins, smallbins, largebins, and unsorted bins, and describes the step‑by‑step allocation process performed by public_mALLOc and its helper functions.
The operating system exposes system calls such as mmap and brk for obtaining memory, but they work at page granularity and every call pays the cost of a kernel transition, so calling them directly for each small allocation is inefficient. User‑level allocators such as glibc's ptmalloc are therefore built on top of these calls.
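To make the granularity point concrete, here is a minimal sketch that serves a request straight from the kernel with mmap. The helper names demo_os_alloc and demo_os_free are hypothetical and not part of glibc, and the sketch assumes a Linux‑like system where MAP_ANONYMOUS is available.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: satisfy a request of `want` bytes directly from the
   kernel. Returns the mapping and stores its real length in *mapped_len. */
static void *demo_os_alloc(size_t want, size_t *mapped_len) {
    assert(want > 0 && mapped_len != NULL);
    long page_l = sysconf(_SC_PAGESIZE);
    if (page_l <= 0)
        return NULL;
    size_t page = (size_t)page_l;
    /* Round up to whole pages: the kernel cannot map less than one page. */
    size_t len = (want + page - 1) / page * page;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *mapped_len = len;
    return p;
}

/* Hypothetical helper: return the mapping to the kernel. */
static void demo_os_free(void *p, size_t mapped_len) {
    munmap(p, mapped_len);
}
```

Calling demo_os_alloc(16, &len) hands back at least one full page for a 16‑byte request and costs a system call each time, which is the waste a user‑level allocator avoids by carving many small chunks out of one large region.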
ptmalloc organizes memory into arenas, independent pools that reduce lock contention in multithreaded programs. The global main arena is defined as static struct malloc_state main_arena, and each arena contains a mutex, linked‑list pointers, and data structures for managing free chunks.
The basic allocation unit is a malloc_chunk, which consists of a header (size fields, plus forward/backward links that are used only while the chunk is free) and a user data body. When malloc is called, a suitably sized chunk is taken from the arena and the address of its body is returned.
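The relationship between a chunk and the pointer that malloc returns can be sketched as follows. This is a simplified model loosely based on glibc's struct malloc_chunk, not the real definition: the struct demo_chunk and the helpers demo_chunk2mem and demo_mem2chunk are hypothetical names, and the layout assumes that size_t and pointers have the same width.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified chunk header modeled on glibc's malloc_chunk. */
struct demo_chunk {
    size_t prev_size;       /* size of the previous chunk, valid only if it is free */
    size_t size;            /* size of this chunk; the low bits carry flags */
    struct demo_chunk *fd;  /* next chunk in a bin, meaningful only while free */
    struct demo_chunk *bk;  /* previous chunk in a bin, meaningful only while free */
};

/* The user body starts right after the two size fields, so the link
   fields overlap the first bytes of user data once the chunk is in use. */
#define DEMO_HEADER_SIZE (2 * sizeof(size_t))

/* Convert a chunk pointer to the address handed to the caller. */
static void *demo_chunk2mem(struct demo_chunk *c) {
    assert(c != NULL);
    return (char *)c + DEMO_HEADER_SIZE;
}

/* Convert a user pointer back to its chunk header, as free must do. */
static struct demo_chunk *demo_mem2chunk(void *mem) {
    assert(mem != NULL);
    return (struct demo_chunk *)((char *)mem - DEMO_HEADER_SIZE);
}
```

Because the links are only needed while a chunk sits in a bin, the per‑allocation overhead in this model is just the size fields in front of the body.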
Free chunks are organized into four kinds of bins:
fastbins: bins of very small, fixed‑size chunks that serve the most common small allocations (up to MAX_FAST_SIZE).
smallbins: manage chunks from 32 bytes up to, but not including, MIN_LARGE_SIZE (1024 bytes), with a fixed size step between bins.
largebins: handle larger chunks, with each bin covering a range of sizes rather than a single size.
unsorted bin: a cache of recently freed chunks that may be split or moved into the appropriate bin on the next allocation.
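The size boundaries between these bins can be illustrated with a small classifier. The constants below are assumptions for a typical 64‑bit build with default settings (32‑byte minimum chunk, 128‑byte fastbin limit, 16‑byte small bin step); real values depend on the glibc version, architecture, and tunables, and the demo_ names are hypothetical. The unsorted bin is left out because chunks land there by recency, not by size.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed constants for a typical 64-bit build; not taken from any
   particular glibc release. */
#define DEMO_MIN_CHUNK_SIZE 32u
#define DEMO_MAX_FAST       128u   /* assumed default fastbin limit */
#define DEMO_MIN_LARGE_SIZE 1024u
#define DEMO_SMALLBIN_WIDTH 16u

enum demo_bin_kind { DEMO_FASTBIN, DEMO_SMALLBIN, DEMO_LARGEBIN };

/* Classify an already normalized chunk size by the kind of bin that
   would be consulted for it. */
static enum demo_bin_kind demo_classify(size_t chunk_size) {
    assert(chunk_size >= DEMO_MIN_CHUNK_SIZE);
    if (chunk_size <= DEMO_MAX_FAST)
        return DEMO_FASTBIN;
    if (chunk_size < DEMO_MIN_LARGE_SIZE)
        return DEMO_SMALLBIN;
    return DEMO_LARGEBIN;
}

/* Each small bin holds exactly one chunk size, spaced one width apart,
   so the index is a simple division. */
static size_t demo_smallbin_index(size_t chunk_size) {
    assert(chunk_size < DEMO_MIN_LARGE_SIZE);
    return chunk_size / DEMO_SMALLBIN_WIDTH;
}
```

Under these assumed constants, a 128‑byte chunk is still a fastbin candidate, a 144‑byte chunk belongs to small bin 9, and a 1024‑byte chunk is the first size handled by the large bins.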
Each bin is represented inside struct malloc_state (e.g., mfastbinptr fastbins[NFASTBINS]; and mchunkptr bins[NBINS*2];). The special top chunk holds the remaining memory of the arena and is used when no suitable free chunk exists in the bins.
The allocation routine public_mALLOc selects an arena, locks it, and delegates the actual work to _int_malloc. _int_malloc normalizes the request size, then attempts allocation in the following order: fastbins, smallbins, the unsorted bin, largebins, and the top chunk. If none of these can satisfy the request, it falls back to the operating system via sYSMALLOc, which either extends the heap with brk or maps new memory with mmap.
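The size normalization step can be sketched as follows. It mirrors the general idea of glibc's request2size macro (add room for one size field, round up to the alignment, enforce a minimum chunk size), but the constants are assumptions for a typical 64‑bit build and demo_request2size is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 64-bit values; real ones depend on the platform and version. */
#define DEMO_SIZE_SZ        8u    /* bytes reserved for the chunk's size field */
#define DEMO_ALIGN_MASK     15u   /* 16-byte alignment minus one */
#define DEMO_MIN_CHUNK_SIZE 32u   /* smallest chunk that can hold free-list links */

/* Turn a user request in bytes into the internal chunk size. */
static size_t demo_request2size(size_t req) {
    /* Guard against wrap-around in the addition below. */
    assert(req <= SIZE_MAX - DEMO_SIZE_SZ - DEMO_ALIGN_MASK);
    size_t padded = req + DEMO_SIZE_SZ + DEMO_ALIGN_MASK;
    if (padded < DEMO_MIN_CHUNK_SIZE)
        return DEMO_MIN_CHUNK_SIZE;
    /* Clear the low bits to round down to a multiple of the alignment. */
    return padded & ~(size_t)DEMO_ALIGN_MASK;
}
```

With these assumed values, requests of 0 through 24 bytes all become 32‑byte chunks, a 25‑byte request becomes a 48‑byte chunk, and a 1000‑byte request becomes a 1008‑byte chunk. It is this normalized size, not the raw request, that is compared against the bin boundaries.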
During these attempts, chunks may be split to satisfy smaller requests or coalesced to reduce fragmentation. The process stops as soon as a suitable chunk is found, ensuring efficient memory use while keeping allocation overhead low.
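The splitting decision can be sketched in a few lines. This is a simplified model rather than the glibc code: it assumes a 32‑byte minimum chunk size, works on sizes only, and uses the hypothetical names struct demo_split and demo_split_chunk. The rule it illustrates is that a free chunk is split only when the leftover piece is itself large enough to be a valid chunk; otherwise the whole chunk is handed out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MIN_CHUNK_SIZE 32u   /* assumed minimum chunk size (64-bit) */

/* Result of carving `needed` bytes out of a free chunk. */
struct demo_split {
    size_t allocated;   /* size of the chunk handed to the caller */
    size_t remainder;   /* size of the leftover free chunk, 0 if none */
};

/* Decide how a free chunk of `chunk_size` bytes serves a normalized
   request of `needed` bytes. */
static struct demo_split demo_split_chunk(size_t chunk_size, size_t needed) {
    assert(chunk_size >= needed);
    struct demo_split result;
    size_t rest = chunk_size - needed;
    if (rest >= DEMO_MIN_CHUNK_SIZE) {
        /* The tail is big enough to live on as its own free chunk. */
        result.allocated = needed;
        result.remainder = rest;
    } else {
        /* The tail would be too small to track, so give away the whole chunk. */
        result.allocated = chunk_size;
        result.remainder = 0;
    }
    return result;
}
```

For example, serving a 48‑byte request from a 128‑byte chunk leaves an 80‑byte remainder, while serving it from a 64‑byte chunk returns all 64 bytes because a 16‑byte tail could not hold a chunk header and links.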
Overall, the article provides a comprehensive overview of glibc’s ptmalloc design, its data structures, and the step‑by‑step logic that underlies the familiar malloc function.
Refining Core Development Skills
Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.