Understanding Go's Automatic Memory Management and Garbage Collection (Based on TCMalloc)
Based on Google’s TCMalloc, Go’s automatic memory system organizes memory into pages and spans, uses per‑processor mcache and shared mcentral for size‑class allocation, and employs a concurrent tri‑color mark‑sweep garbage collector triggered by heap usage or large allocations, illustrating the full allocation‑to‑reclamation pipeline.
Introduction
Modern high‑level programming languages manage memory either manually (e.g., C, C++) or automatically (e.g., Java, Go). Automatic memory management hides allocation and reclamation from developers, but understanding the underlying design and execution logic is still valuable. This article uses Go’s memory management as a case study and explains the design and principles of Go’s automatic memory system.
1. TCMalloc Overview
TCMalloc (Thread‑Caching Malloc) is Google’s memory allocator. Its key concepts are:
(1) Page
Memory is managed in pages. In TCMalloc a page’s size may differ from the OS page size but is a multiple of it.
(2) Span
A Span is a contiguous group of pages. It is the basic unit for managing memory blocks.
(3) ThreadCache
Each thread has its own cache containing free‑list chains of memory blocks of the same size. Access is lock‑free.
(4) CentralCache
A shared cache for all threads; it also holds free‑list chains and requires locking.
(5) PageHeap
The heap abstraction that stores spans. When CentralCache runs out of memory it obtains spans from PageHeap.
(6) Object Allocation
Small objects are allocated from ThreadCache; if insufficient, CentralCache supplies memory, which in turn may request spans from PageHeap. Large objects are allocated directly from PageHeap.
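The three-tier lookup described above can be sketched in a few lines of Go. This is a toy model rather than TCMalloc's actual code: the type names, the batch size of 4, and the span size of 8 blocks are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

const (
	batch      = 4 // blocks moved from the central cache to a thread cache at once (assumed)
	spanBlocks = 8 // blocks carved out of each new span (assumed)
)

// block stands in for one fixed-size chunk of memory.
type block struct{ id int }

// pageHeap is the last tier: it hands out whole spans of blocks.
type pageHeap struct {
	mu     sync.Mutex
	nextID int
}

// allocSpan carves a new span into n blocks.
func (h *pageHeap) allocSpan(n int) []*block {
	h.mu.Lock()
	defer h.mu.Unlock()
	span := make([]*block, n)
	for i := range span {
		span[i] = &block{id: h.nextID}
		h.nextID++
	}
	return span
}

// centralCache is shared by all threads, so every access takes the lock.
type centralCache struct {
	mu   sync.Mutex
	free []*block
	heap *pageHeap
}

// fetch returns n blocks, refilling from the page heap when it runs short.
func (c *centralCache) fetch(n int) []*block {
	c.mu.Lock()
	defer c.mu.Unlock()
	for len(c.free) < n {
		c.free = append(c.free, c.heap.allocSpan(spanBlocks)...)
	}
	out := append([]*block(nil), c.free[:n]...) // copy so the caller owns its slice
	c.free = c.free[n:]
	return out
}

// threadCache is owned by a single thread, so it needs no lock.
type threadCache struct {
	free    []*block
	central *centralCache
}

// alloc serves from the local free list and falls back to the central cache.
func (t *threadCache) alloc() *block {
	if len(t.free) == 0 {
		t.free = t.central.fetch(batch)
	}
	b := t.free[len(t.free)-1]
	t.free = t.free[:len(t.free)-1]
	return b
}

func main() {
	heap := &pageHeap{}
	tc := &threadCache{central: &centralCache{heap: heap}}
	for i := 0; i < 6; i++ {
		fmt.Println("allocated block", tc.alloc().id)
	}
	fmt.Println("blocks carved from the page heap:", heap.nextID)
}
```

Only the slow paths take a lock: the common case is a pop from the thread-local list, which is what makes the per-thread tier worthwhile.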
2. Go Memory Management
Go’s allocator is heavily inspired by TCMalloc but has its own details. The architecture consists of the following components:
(1) Page
Same concept as in TCMalloc; a light‑blue rectangle in the diagram represents a page.
(2) Span
Implemented as mspan in Go. A span is a group of contiguous pages; a purple rectangle in the diagram denotes a span.
(3) mcache
Analogous to ThreadCache, but each P (processor) has one mcache, allowing lock‑free access for up to GOMAXPROCS workers.
(4) mcentral
Similar to CentralCache: it is shared among all Ps and protected by a lock. There is one mcentral per span class, and each holds the spans of that class.
(5) mheap
Corresponds to PageHeap. It organizes OS‑provided pages into spans, stores them in two trees, and manages address mapping and pointer‑bitmap information.
(6) Size Classes
Go defines 66 size classes for small objects, plus a special class 0 for objects larger than 32 KB. Each class specifies the object size, span size, number of objects per span, and waste percentages. An excerpt from runtime/sizeclasses.go:
// class bytes/obj bytes/span objects tail waste max waste
// 1 8 8192 1024 0 87.50%
// 2 16 8192 512 0 43.75%
// ...
// 66 32768 32768 1 0 12.50%
Lookup arrays in the same file (class_to_size, the size_to_class tables size_to_class8 and size_to_class128, and class_to_allocnpages) map between object size, size class, and the number of pages per span.
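To make the table concrete, here is a minimal sketch of a size-class lookup. The classToSize slice is an abbreviated stand-in for the runtime's table: only the 8- and 16-byte entries come from the excerpt above, and the remaining values are assumptions that differ between Go versions. The runtime uses precomputed lookup arrays rather than the linear scan shown here, and the maxWaste helper ignores tail waste, which is 0 for the rows shown.

```go
package main

import "fmt"

// classToSize is an abbreviated, illustrative table. Index 0 is reserved for
// large objects; entries past class 2 are assumed values for this sketch.
var classToSize = []uint32{0, 8, 16, 32, 48, 64, 80, 96, 112, 128}

// sizeClassFor returns the smallest class whose object size fits n bytes,
// or 0 when n is larger than anything in this shortened table.
func sizeClassFor(n uint32) int {
	for class := 1; class < len(classToSize); class++ {
		if n <= classToSize[class] {
			return class
		}
	}
	return 0
}

// maxWaste is the worst-case fraction lost to rounding up: the smallest
// object that lands in a class is one byte larger than the previous class.
func maxWaste(class int) float64 {
	size := classToSize[class]
	prev := classToSize[class-1]
	return float64(size-prev-1) / float64(size)
}

func main() {
	class := sizeClassFor(20)
	fmt.Printf("20 bytes -> class %d (%d-byte slots)\n", class, classToSize[class])
	fmt.Printf("max waste, class 1: %.2f%%\n", maxWaste(1)*100)
	fmt.Printf("max waste, class 2: %.2f%%\n", maxWaste(2)*100)
}
```

The two waste figures reproduce the 87.50% and 43.75% shown in the excerpt: a 1-byte object in an 8-byte slot wastes 7 of 8 bytes, and a 9-byte object in a 16-byte slot wastes 7 of 16.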
(7) Allocation Process
For a small object (e.g., 20 bytes) the allocator:
Computes the required size.
Finds the size class (class 3 in the example).
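Derives the span class using makeSpanClass:
func makeSpanClass(sizeclass uint8, noscan bool) spanClass {
	// The size class occupies the upper bits; the low bit records noscan.
	return spanClass(sizeclass<<1) | spanClass(bool2int(noscan))
}
The resulting span class is 6 (3<<1, with noscan false because the object contains pointers; a pointer‑free object would get 7), and the object is allocated from the corresponding span.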
Large objects (>32 KB) are allocated directly from mheap after calculating required pages and span class.
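The span-class arithmetic can be checked with a standalone copy of makeSpanClass. In the sketch below, bool2int and the two accessor methods are reimplemented locally, and the 8 KB page size and the pagesFor helper are assumptions made for illustration.

```go
package main

import "fmt"

type spanClass uint8

// bool2int is a local stand-in for the runtime helper of the same name.
func bool2int(b bool) int {
	if b {
		return 1
	}
	return 0
}

// makeSpanClass packs the size class into the upper bits and noscan into bit 0.
func makeSpanClass(sizeclass uint8, noscan bool) spanClass {
	return spanClass(sizeclass<<1) | spanClass(bool2int(noscan))
}

// sizeclass and noscan unpack the two fields again.
func (sc spanClass) sizeclass() int8 { return int8(sc >> 1) }
func (sc spanClass) noscan() bool    { return sc&1 != 0 }

const pageSize = 8192 // assumed 8 KB allocator page

// pagesFor rounds a large allocation up to whole pages (hypothetical helper).
func pagesFor(size uintptr) uintptr {
	return (size + pageSize - 1) / pageSize
}

func main() {
	fmt.Println(makeSpanClass(3, false)) // 6: size class 3, contains pointers
	fmt.Println(makeSpanClass(3, true))  // 7: size class 3, pointer-free
	fmt.Println(pagesFor(40000))         // 5: a 40000-byte object needs five 8 KB pages
}
```

Keeping scan and noscan objects in separate spans lets the collector skip pointer-free spans entirely during marking.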
3. Garbage Collection
Go uses a concurrent, tri‑color, mark‑sweep collector.
(1) Mark‑Sweep
The classic algorithm marks reachable objects (mark phase) and then reclaims unreachable ones (sweep phase). In its basic form it is a stop‑the‑world (STW) process: the program is paused for the whole collection.
(2) Tri‑color Marking
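Objects start white, become gray when discovered, and turn black once all of their children have been scanned; objects still white when marking ends are unreachable. Combined with a write barrier that preserves this invariant while the program changes pointers, the scheme allows marking to run concurrently with the mutator.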
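The color transitions are easier to follow on a toy object graph. The sketch below is single-threaded and has no write barrier, so it illustrates only the coloring rules and not Go's concurrent collector; the object type and the demoHeap graph are made up for the example.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not yet seen; garbage if still white after marking
	gray               // discovered, children not yet scanned
	black              // scanned, known reachable
)

type object struct {
	name string
	refs []*object
	col  color
}

// mark colors everything reachable from roots black using a gray work list.
func mark(roots []*object) {
	var work []*object
	for _, r := range roots {
		if r.col == white {
			r.col = gray
			work = append(work, r)
		}
	}
	for len(work) > 0 {
		obj := work[len(work)-1]
		work = work[:len(work)-1]
		for _, child := range obj.refs {
			if child.col == white {
				child.col = gray
				work = append(work, child)
			}
		}
		obj.col = black // all children are now at least gray
	}
}

// sweep keeps black objects (resetting them to white for the next cycle)
// and reports the names of the white ones it frees.
func sweep(heap []*object) (live []*object, freed []string) {
	for _, obj := range heap {
		if obj.col == black {
			obj.col = white
			live = append(live, obj)
		} else {
			freed = append(freed, obj.name)
		}
	}
	return live, freed
}

// demoHeap builds a -> b -> c -> a (a reachable cycle) plus e -> d (unreachable).
func demoHeap() (roots, heap []*object) {
	a, b, c := &object{name: "a"}, &object{name: "b"}, &object{name: "c"}
	d, e := &object{name: "d"}, &object{name: "e"}
	a.refs = []*object{b}
	b.refs = []*object{c}
	c.refs = []*object{a}
	e.refs = []*object{d}
	return []*object{a}, []*object{a, b, c, d, e}
}

func main() {
	roots, heap := demoHeap()
	mark(roots)
	live, freed := sweep(heap)
	fmt.Println("live objects:", len(live))
	fmt.Println("freed:", freed)
}
```

Note that d is freed even though e still points to it: reachability is measured from the roots, not by counting references.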
(3) GC Trigger
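A GC cycle starts when memstats.heap_live >= memstats.gc_trigger. The check runs on the allocation path, for example after a large object is allocated or when an mcache has to fetch a new span. In the runtime version discussed here, the trigger logic is in gcShouldStart:
func gcShouldStart(forceTrigger bool) bool {
	return gcphase == _GCoff &&
		(forceTrigger || memstats.heap_live >= memstats.gc_trigger) &&
		memstats.enablegc && panicking == 0 && gcpercent >= 0
}
The GOGC environment variable controls the trigger threshold: it sets gcpercent, the percentage by which the heap may grow over the live heap before the next cycle starts (the default is 100, and GOGC=off disables collection).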
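As a rough illustration of what the GOGC percentage means, the sketch below computes a target heap size from a live-heap figure. The heapGoal function is a hypothetical helper and a deliberately simplified model: the runtime's real trigger is computed by its pacer and usually sits somewhat below this target. debug.SetGCPercent is the standard library call for changing the same setting at run time.

```go
package main

import (
	"fmt"
	"math"
	"runtime/debug"
)

// heapGoal models the heap size at which the next collection is due:
// the live heap plus gogc percent of growth. A negative gogc means the
// collector is off, modeled here as an unreachable goal.
func heapGoal(liveBytes uint64, gogc int) uint64 {
	if gogc < 0 {
		return math.MaxUint64
	}
	return liveBytes + liveBytes*uint64(gogc)/100
}

func main() {
	const live = 4 << 20 // assume 4 MiB survived the last cycle
	fmt.Println("goal at GOGC=100:", heapGoal(live, 100)) // 8 MiB
	fmt.Println("goal at GOGC=50: ", heapGoal(live, 50))  // 6 MiB

	// SetGCPercent returns the previous setting, so it can be restored.
	old := debug.SetGCPercent(50)
	fmt.Println("previous GC percent:", old)
	debug.SetGCPercent(old)
}
```

Lowering the percentage trades CPU time for memory: collections run more often, but the heap stays closer to the live set.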
(4) GC Process
The entry point gcStart sets up background mark workers, stops the world, prepares marking, and then starts the world again. Mark workers run as per‑P goroutines (gcBgMarkWorker) that repeatedly scan roots and gray objects using gcDrain, outlined here:
func gcDrain(gcw *gcWork, flags gcDrainFlags) {
	// Drain root marking jobs
	// Drain heap marking jobs
	// Scan gray objects and push newly discovered ones
}
After marking, gcSweep reclaims memory. Sweep can be synchronous or concurrent. The core sweep routine, sweepone, sweeps one unswept span per call:
func sweepone() uintptr {
	// Outline only: find the next unswept span and call s.sweep(false).
	// The result is the number of pages returned to the heap.
}
Both mark and sweep phases operate on spans, making the span‑based design central to Go’s memory management.
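The mark and sweep cycle cannot be called directly from user code, but its effect can be observed through the public runtime API. The sketch below counts completed cycles with runtime.ReadMemStats and forces one with runtime.GC; the cyclesDuring helper, the sink variable, and the allocation sizes are assumptions made for the example.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps allocations reachable so the compiler cannot optimize them away.
var sink [][]byte

// gcCycles reports how many GC cycles have completed so far.
func gcCycles() uint32 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC
}

// cyclesDuring runs work, forces one final collection, and returns how many
// cycles completed in total (hypothetical helper for this sketch).
func cyclesDuring(work func()) uint32 {
	before := gcCycles()
	work()
	runtime.GC() // blocks until a full mark and sweep has finished
	return gcCycles() - before
}

func main() {
	n := cyclesDuring(func() {
		// 64 KB allocations are above the 32 KB limit, so each one is a
		// large object served directly by the heap.
		for i := 0; i < 1000; i++ {
			sink = append(sink, make([]byte, 64<<10))
		}
		sink = nil // drop the references so the memory becomes garbage
	})
	fmt.Println("GC cycles completed:", n)
}
```

The exact count depends on the GOGC setting and the Go version, but it is at least 1 because of the explicit runtime.GC call.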
Author: 冷易, Tencent backend engineer, experienced in Go and high‑performance backend systems.