In‑Depth Study of Go’s Garbage Collection Algorithm and Its Evolution
This article provides a comprehensive analysis of Go’s non‑generational concurrent mark‑and‑sweep garbage collector, tracing its evolution from early stop‑the‑world implementations to the mixed write‑barrier design in Go 1.8, and explains how to interpret GC traces, tune GC parameters, and reduce latency caused by mark‑assist and STW pauses.
Introduction
The Go runtime uses a concurrent, precise, non‑generational mark‑and‑sweep (CMS) garbage collector that has matured over twelve versions. The collector runs concurrently with mutator goroutines, employs a three‑color marking algorithm, and does not perform heap compaction.
Mark‑Sweep Algorithm
Mark‑sweep is a tracing GC that first marks all reachable objects from GC roots and then sweeps away the unmarked (dead) objects. The marking phase traverses the object graph, and the sweep phase reclaims memory after marking completes.
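To make the two phases concrete, here is a minimal sketch of a stop-the-world mark-sweep over a toy object graph. The `Heap` and `Object` types and the `Collect` method are hypothetical names invented for illustration; the real runtime works on spans and bitmaps, not on a slice of structs.

```go
package main

import "fmt"

// Object is a toy heap object: outgoing pointers plus a mark bit.
type Object struct {
	Name   string
	Refs   []*Object
	marked bool
}

// Heap is a hypothetical stand-in for the runtime heap: a flat list of allocations.
type Heap struct {
	objects []*Object
}

// New allocates an object and registers it with the heap.
func (h *Heap) New(name string) *Object {
	o := &Object{Name: name}
	h.objects = append(h.objects, o)
	return o
}

// mark flags every object reachable from root, using an explicit stack
// so that deep graphs do not overflow the call stack.
func mark(root *Object) {
	stack := []*Object{root}
	for len(stack) > 0 {
		o := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if o == nil || o.marked {
			continue
		}
		o.marked = true
		stack = append(stack, o.Refs...)
	}
}

// Collect runs one mark-sweep cycle and returns the names of reclaimed objects.
func (h *Heap) Collect(roots ...*Object) []string {
	for _, r := range roots {
		mark(r)
	}
	var freed []string
	live := h.objects[:0]
	for _, o := range h.objects {
		if o.marked {
			o.marked = false // reset the mark bit for the next cycle
			live = append(live, o)
		} else {
			freed = append(freed, o.Name)
		}
	}
	h.objects = live
	return freed
}

func main() {
	h := &Heap{}
	a, b, c, d := h.New("a"), h.New("b"), h.New("c"), h.New("d")
	a.Refs = []*Object{b}
	b.Refs = []*Object{a} // a cycle is still collected correctly by tracing
	c.Refs = []*Object{d} // c and d are unreachable from the root a
	fmt.Println(h.Collect(a)) // prints [c d]
}
```

Because the sketch marks and sweeps in one uninterrupted call, it corresponds to the fully stop-the-world design rather than to the concurrent collector.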
Go GC Evolution
Pre‑1.3: Full stop‑the‑world (STW) mark‑and‑sweep.
1.3: Mark phase STW, sweep runs concurrently.
1.5: Introduced three‑color marking; both mark and sweep are concurrent, with brief STW windows for setup and termination.
1.8: Added a mixed (hybrid) write barrier that removes the need to rescan stacks during mark termination, which shortens the final STW pause.
GC Process in Go 1.5
The GC cycle consists of:
Sweep Termination – collect roots, finish previous sweep, enable write barrier and assist GC.
Mark – scan roots and reachable objects.
Mark Termination – final STW to finish marking and disable the write barrier.
Sweep – reclaim memory marked as dead.
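The Mark step is usually described with three colors: white (not yet seen), grey (seen but not yet scanned) and black (fully scanned). The sketch below models that worklist on a toy graph and shows why a write barrier is needed once the program keeps running during marking. All type and function names are hypothetical, and the "concurrent" mutator is simulated by interleaving calls in a single goroutine.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not yet seen; reclaimed if still white at the end
	grey               // seen, but its pointers have not been scanned
	black              // seen and fully scanned
)

type object struct {
	name  string
	refs  []*object
	color color
}

// marker holds the grey worklist for one toy GC cycle.
type marker struct {
	grey []*object
}

// shade turns a white object grey and queues it for scanning.
func (m *marker) shade(o *object) {
	if o != nil && o.color == white {
		o.color = grey
		m.grey = append(m.grey, o)
	}
}

// step scans one grey object and blackens it.
// It returns false when no grey objects are left.
func (m *marker) step() bool {
	if len(m.grey) == 0 {
		return false
	}
	o := m.grey[len(m.grey)-1]
	m.grey = m.grey[:len(m.grey)-1]
	for _, r := range o.refs {
		m.shade(r)
	}
	o.color = black
	return true
}

// writePointer models an insertion barrier: shade the new target, then store.
func (m *marker) writePointer(slot **object, ptr *object) {
	m.shade(ptr)
	*slot = ptr
}

func main() {
	c := &object{name: "c"}
	a := &object{name: "a", refs: []*object{c}}
	root := &object{name: "root", refs: []*object{a, nil}}

	m := &marker{}
	m.shade(root)
	m.step() // root is now black, a is grey, c is still white

	// Simulated mutator: move the only pointer to c from a into the black root.
	m.writePointer(&root.refs[1], c)
	a.refs[0] = nil

	for m.step() {
	}
	fmt.Println(c.color == black) // prints true
}
```

If the store into `root.refs[1]` bypassed `writePointer`, `c` would stay white and be swept even though it is reachable, which is exactly the failure the write barrier prevents.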
Write‑Barrier Code (Go 1.5)
    writePointer(slot, ptr):
        shade(ptr)
        *slot = ptr
Mixed Write‑Barrier (Go 1.8)
Go 1.8 combines Dijkstra’s write barrier and Yuasa’s delete barrier:
    writePointer(slot, ptr):
        shade(*slot)
        if current stack is grey:
            shade(ptr)
        *slot = ptr
GC Phases Detailed
Mark Setup – opens the write barrier, pausing each goroutine for a few microseconds.
Marking – runs concurrently with the program; the runtime dedicates roughly 25 % of the available P’s to background marking, and goroutines that allocate quickly are drafted into Mark Assist to help when allocation pressure is high.
Mark Termination – final STW to close the write barrier and clean up.
Concurrent Sweep – reclaims dead objects during normal allocation; the sweep work is distributed across P’s.
GC Trace Example
Setting GODEBUG=gctrace=1 prints lines such as:
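    gc 1405 @6.068s 11%: 0.058+1.2+0.083 ms clock, 0.70+2.5/1.5/0+0.99 ms cpu, 7->11->6 MB, 10 MB goal, 12 P
The fields show, in order: the GC cycle number, the time since program start, the percentage of CPU time spent in GC so far, the wall‑clock times of the three phases (STW sweep termination, concurrent mark, STW mark termination), the CPU times for the same phases, the heap size at the start of the cycle, the heap size at the end, the live heap size, the heap goal, and the number of logical processors (P).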
Tuning Recommendations
Keep the heap as small as possible.
Maintain a stable GC target (GC percentage).
Stay within the per‑collection memory goal.
Minimize both STW pause duration and Mark‑Assist time.
Reducing allocation pressure (e.g., eliminating unnecessary allocations) directly lowers GC‑induced latency and improves overall throughput.
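One common way to eliminate unnecessary allocations is to size slices up front. The sketch below compares a slice that grows through `append` with one that is preallocated, counting heap allocations with `testing.AllocsPerRun`. The function names are hypothetical and the exact counts depend on the Go version, so no specific numbers are claimed.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so the results escape to the heap.
var sink []int

// squaresNaive grows the slice on demand, so append reallocates several times.
func squaresNaive(n int) []int {
	var out []int
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

// squaresPrealloc reserves the full capacity once, before filling the slice.
func squaresPrealloc(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

func main() {
	naive := testing.AllocsPerRun(100, func() { sink = squaresNaive(1000) })
	pre := testing.AllocsPerRun(100, func() { sink = squaresPrealloc(1000) })
	fmt.Printf("naive: %.0f allocs/op, preallocated: %.0f allocs/op\n", naive, pre)
}
```

Fewer allocations mean less garbage for each cycle to trace and sweep, and less chance that an allocating goroutine is pulled into Mark Assist.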
Conclusion
Understanding Go’s GC algorithm, its evolution, and the impact of its phases enables developers to tune GC parameters, interpret trace output, and design applications that minimize memory‑pressure‑induced pauses, thereby achieving high performance without manual memory management.
High Availability Architecture
Official account for High Availability Architecture.