Understanding JVM Garbage Collection: OopMap, Safepoints, Memory Barriers, and Low‑Latency Collectors
This article explains the internal mechanisms of JVM garbage collection: OopMaps, safepoints, safe regions, three-color marking, remembered sets, and write barriers. It then covers the design of the low-latency collectors Shenandoah and ZGC and closes with practical advice on choosing a collector.
Introduction
The previous article introduced the execution flow of the traditional JVM garbage collectors; this piece looks at the mechanisms underneath them and at how newer collectors address their pause-time problems.
Key Terminology
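OopMap
During reachability analysis the JVM traverses the object graph starting from the GC Roots (for example constants, static fields, and reference-typed local variables on thread stacks). Finding the stack roots by examining every stack slot and register would be slow, so HotSpot records, for specific positions in the code, which stack slots and registers hold references. This record is the OopMap; once threads are stopped, the collector reads it to enumerate roots directly. Keeping a map for every instruction would cost too much space, so maps are kept only for the positions designated as safepoints.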
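The idea can be sketched as a table keyed by code position. This is a toy model, not HotSpot's actual data structure: the class `OopMapSketch`, its methods, and the slot numbers are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Toy OopMap table (hypothetical names; not HotSpot's real layout). */
final class OopMapSketch {
    // Key: code offset of a safepoint. Value: bit i set means stack slot i holds a reference.
    private final Map<Integer, BitSet> mapsBySafepoint = new HashMap<>();

    /** Records which stack slots hold references at the given safepoint offset. */
    void record(int codeOffset, int... referenceSlots) {
        BitSet slots = new BitSet();
        for (int slot : referenceSlots) {
            slots.set(slot);
        }
        mapsBySafepoint.put(codeOffset, slots);
    }

    /** Returns the slots the collector treats as roots for a thread stopped at codeOffset. */
    int[] rootSlotsAt(int codeOffset) {
        BitSet slots = mapsBySafepoint.get(codeOffset);
        if (slots == null) {
            throw new IllegalStateException("no OopMap: offset " + codeOffset + " is not a safepoint");
        }
        return slots.stream().toArray();
    }

    public static void main(String[] args) {
        OopMapSketch maps = new OopMapSketch();
        maps.record(12, 0, 3); // at offset 12, slots 0 and 3 hold references
        maps.record(40, 3);    // at offset 40, only slot 3 does
        System.out.println(Arrays.toString(maps.rootSlotsAt(12)));
    }
}
```

The lookup fails for offsets without a map, which mirrors why a thread has to be brought to a safepoint before its stack can be scanned.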
Safepoint
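A safepoint is a position in the code where a thread's references are fully described by an OopMap, so the JVM can stop the thread there and enumerate roots. The density of safepoints is a trade-off: with too few, threads take a long time to reach one and the pause is delayed; with too many, the polls and maps add run-time and memory overhead. Two interruption strategies exist: preemptive interruption, where the VM suspends every thread and resumes those not at a safepoint until they reach one (rarely used), and cooperative interruption, where threads poll a flag and stop themselves at the next safepoint. For diagnostics, -XX:+SafepointTimeout together with -XX:SafepointTimeoutDelay=2000 reports threads that need more than 2000 ms to reach a safepoint. -XX:+PrintSafepointStatistics prints per-safepoint statistics on JDK 8; on JDK 9 and later, unified logging (-Xlog:safepoint) serves the same purpose, and the older flag has since been removed.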
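Cooperative interruption can be sketched in plain Java. This is a simplified model under stated assumptions: the class `SafepointPollSketch` and its methods are hypothetical, the "VM operation" is an ordinary `Runnable`, and real JVMs implement the poll with far cheaper mechanisms than a monitor.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Toy cooperative safepoint: workers poll a flag and park until the operation is done. */
final class SafepointPollSketch {
    private final Object lock = new Object();
    private volatile boolean requested; // the flag that running threads poll
    private int parked;                 // threads stopped at a poll; guarded by lock

    /** Called by workers at "safepoints" (here: once per loop iteration). */
    void poll() throws InterruptedException {
        if (!requested) {
            return; // fast path: a single volatile read
        }
        synchronized (lock) {
            parked++;
            lock.notifyAll(); // tell the requester that this thread has stopped
            try {
                while (requested) {
                    lock.wait();
                }
            } finally {
                parked--;
            }
        }
    }

    /** Waits until `workers` threads are parked, runs the operation, then releases them. */
    void runAtSafepoint(int workers, Runnable operation) throws InterruptedException {
        synchronized (lock) {
            requested = true;
            try {
                while (parked < workers) {
                    lock.wait();
                }
                operation.run();
            } finally {
                requested = false;
                lock.notifyAll();
            }
        }
    }

    /** Returns true if the shared counter did not change while the operation ran. */
    static boolean demo() {
        SafepointPollSketch safepoint = new SafepointPollSketch();
        AtomicLong counter = new AtomicLong();
        AtomicBoolean done = new AtomicBoolean();
        Thread[] workers = new Thread[2];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (!done.get()) {
                        counter.incrementAndGet();
                        safepoint.poll();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        long[] seen = new long[2];
        try {
            safepoint.runAtSafepoint(workers.length, () -> {
                seen[0] = counter.get();
                Thread.yield(); // give workers a chance to run, if any were still running
                seen[1] = counter.get();
            });
            done.set(true);
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return seen[0] == seen[1];
    }

    public static void main(String[] args) {
        System.out.println("counter stable during operation: " + demo());
    }
}
```

The requester only proceeds once every worker has reported in, which is why a single thread that is slow to reach its next poll delays the whole pause.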
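Safe Region
Threads that are sleeping or blocked cannot run to a safepoint and respond to the request, so the JVM also defines safe regions: code segments in which the thread's references are guaranteed not to change, so a collection may start at any time while the thread is inside. A thread marks itself on entering a safe region, and the collector does not wait for it. Before leaving the region, the thread checks whether the collector has finished the work that requires threads to be stopped; if not, it waits until it is signalled.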
Three‑Color Marking and Floating Garbage
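After the initial marking phase, the collector may mark concurrently with application threads. Objects are colored white (not yet visited), gray (visited, but not all of its references scanned), or black (visited, with all references scanned). Concurrent updates cause two kinds of error. An object that dies during marking may stay marked as live; this "floating garbage" is harmless and is reclaimed in the next cycle. The serious case is a live object that is never marked and is wrongly reclaimed. That happens only when both of the following occur: the application stores a reference from a black object to a white object, and it deletes every path from gray objects to that white object. Breaking either condition is enough, which gives two remediation strategies:
Incremental update: record each new reference stored from a black object to a white object, and rescan from the recorded black objects after the concurrent phase. CMS uses this approach.
Snapshot at the beginning (SATB): record each deleted reference from a gray object to a white object, and rescan from the recorded objects after the concurrent phase, so that marking covers the object graph as it was when marking started. G1 and Shenandoah use this approach.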
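The lost-object scenario and both remedies can be replayed in a small single-threaded simulation. This is a sketch under assumptions: the class `TriColorSketch`, the three-object graph, and the interleaving are made up for illustration, and the "concurrent" mutator is simply run between two marking steps.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy tri-color marker with a pluggable barrier (hypothetical names throughout). */
final class TriColorSketch {
    enum Color { WHITE, GRAY, BLACK }
    enum Barrier { NONE, INCREMENTAL_UPDATE, SATB }

    static final class Node {
        Color color = Color.WHITE;
        final List<Node> refs = new ArrayList<>();
    }

    private final Deque<Node> grayStack = new ArrayDeque<>();
    private final Barrier barrier;

    TriColorSketch(Barrier barrier) {
        this.barrier = barrier;
    }

    /** Turns a white object gray and queues it for scanning. */
    void shade(Node node) {
        if (node.color == Color.WHITE) {
            node.color = Color.GRAY;
            grayStack.push(node);
        }
    }

    /** Scans one gray object; returns false when no gray objects remain. */
    boolean markStep() {
        Node node = grayStack.poll();
        if (node == null) {
            return false;
        }
        for (Node ref : node.refs) {
            shade(ref);
        }
        node.color = Color.BLACK;
        return true;
    }

    /** Mutator store. Incremental update re-grays a black holder so it is rescanned. */
    void addRef(Node holder, Node target) {
        if (barrier == Barrier.INCREMENTAL_UPDATE && holder.color == Color.BLACK) {
            holder.color = Color.GRAY;
            grayStack.push(holder);
        }
        holder.refs.add(target);
    }

    /** Mutator delete. SATB shades the old referent so the snapshot stays reachable. */
    void removeRef(Node holder, Node target) {
        if (barrier == Barrier.SATB) {
            shade(target);
        }
        holder.refs.remove(target);
    }

    /** Replays A -> B -> C with the reference to C moved from B to A during marking. */
    static boolean liveObjectSurvives(Barrier barrier) {
        TriColorSketch gc = new TriColorSketch(barrier);
        Node a = new Node();
        Node b = new Node();
        Node c = new Node();
        a.refs.add(b);
        b.refs.add(c);

        gc.shade(a);    // A is the root
        gc.markStep();  // now A is black, B is gray, C is white

        gc.addRef(a, c);    // black -> white reference created
        gc.removeRef(b, c); // only gray -> white path deleted

        while (gc.markStep()) {
            // drain the remaining gray objects
        }
        return c.color == Color.BLACK; // C is still reachable through A
    }

    public static void main(String[] args) {
        for (Barrier barrier : Barrier.values()) {
            System.out.println(barrier + ": C marked = " + liveObjectSurvives(barrier));
        }
    }
}
```

Without a barrier, C stays white although A still references it. Incremental update fixes this on the store side, SATB on the delete side; SATB may additionally keep an object alive that has in fact died, which is one source of floating garbage.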
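Remembered Set
To avoid scanning the entire old generation for cross-generation references during a young collection, collectors maintain a remembered set that records references from the old generation into the young generation (more generally, from the part of the heap that is not being collected into the part that is). Implementations include:
Card Table: HotSpot divides the heap into cards of 512 bytes by default and keeps one entry per card; a card is marked dirty when a reference field inside it is written, and only dirty cards are scanned for cross-generation references.
RSet: G1 keeps a remembered set per region that records where in other regions there are references pointing into that region. This lets G1 collect an arbitrary subset of regions without scanning the rest of the heap, at the cost of additional memory.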
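The card arithmetic is a shift and an array store. The sketch below assumes a heap modelled as a range of `long` addresses; the class `CardTableSketch` is hypothetical, and the `CLEAN`/`DIRTY` byte values are chosen for readability and are not the values HotSpot uses.

```java
/** Toy card table over a simulated address range (hypothetical names and values). */
final class CardTableSketch {
    static final int CARD_SHIFT = 9; // 2^9 = 512-byte cards
    static final byte CLEAN = 0;     // assumption: illustrative values only
    static final byte DIRTY = 1;

    private final long heapBase;
    private final byte[] cards;

    CardTableSketch(long heapBase, long heapBytes) {
        this.heapBase = heapBase;
        long cardCount = (heapBytes + (1L << CARD_SHIFT) - 1) >>> CARD_SHIFT;
        this.cards = new byte[(int) cardCount];
    }

    /** Maps an address inside the heap to the index of its card. */
    int cardIndex(long address) {
        return (int) ((address - heapBase) >>> CARD_SHIFT);
    }

    /** What a post-write barrier does after a reference field at fieldAddress is stored. */
    void markDirty(long fieldAddress) {
        cards[cardIndex(fieldAddress)] = DIRTY;
    }

    boolean isDirty(long address) {
        return cards[cardIndex(address)] == DIRTY;
    }

    int cardCount() {
        return cards.length;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(0x1000L, 4096L);
        table.markDirty(0x1000L + 1030); // falls into card 2 (bytes 1024..1535)
        System.out.println("cards: " + table.cardCount());
        System.out.println("card of write: " + table.cardIndex(0x1000L + 1030));
        System.out.println("card 0 dirty: " + table.isDirty(0x1000L));
    }
}
```

A young collection would then scan only the objects that overlap dirty cards, instead of the whole old generation.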
Write Barrier
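The JVM uses a write barrier, a short code sequence emitted around every reference store, to keep the remembered set up to date. In G1 the barrier does not update the RSet itself: it records the dirtied card in a queue, and concurrent refinement threads process the queue later. The queue length is divided into color-coded zones (white, green, yellow, red) that control how much refinement work runs. In the white zone, below the green threshold, no refinement thread runs and the cards are left for the next pause; past green, refinement threads are started progressively; past yellow, all of them run; past red, application threads have to process their own cards, which slows the application down.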
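The zone logic amounts to comparing the backlog against three thresholds. The sketch below is an illustration only: the class `RefinementZoneSketch`, the threshold values, and the wording of the actions are assumptions, and the exact policy differs between JDK releases.

```java
/** Toy classification of a dirty-card backlog into refinement zones (hypothetical names). */
final class RefinementZoneSketch {
    enum Zone { WHITE, GREEN, YELLOW, RED }

    /** pending and the thresholds are counted in queued dirty-card buffers. */
    static Zone zoneFor(int pending, int green, int yellow, int red) {
        if (green > yellow || yellow > red) {
            throw new IllegalArgumentException("thresholds must satisfy green <= yellow <= red");
        }
        if (pending < green) {
            return Zone.WHITE;
        }
        if (pending < yellow) {
            return Zone.GREEN;
        }
        if (pending < red) {
            return Zone.YELLOW;
        }
        return Zone.RED;
    }

    /** Describes who processes the queue in each zone. */
    static String action(Zone zone) {
        switch (zone) {
            case WHITE:
                return "leave cards for the next pause";
            case GREEN:
                return "start refinement threads progressively";
            case YELLOW:
                return "run all refinement threads";
            default:
                return "application threads process their own cards";
        }
    }

    public static void main(String[] args) {
        int green = 4, yellow = 12, red = 20; // made-up thresholds
        for (int pending : new int[] {0, 5, 15, 25}) {
            Zone zone = zoneFor(pending, green, yellow, red);
            System.out.println(pending + " buffers -> " + zone + ": " + action(zone));
        }
    }
}
```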
Low‑Latency Garbage Collectors
Shenandoah
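Shenandoah builds on the same region-based heap layout as G1 and reduces pause times by performing most of its work, including evacuation, concurrently with the application. Its phases are initial marking, concurrent marking, final marking, concurrent cleanup, concurrent evacuation, initial reference update, concurrent reference update, final reference update, and a final cleanup. Instead of stopping the world for the evacuation phase, it uses forwarding pointers (Brooks pointers): each object carries a pointer that refers to the object itself until the object is copied, and to the new copy afterwards. The price is throughput, because object accesses go through barriers and concurrent copies are resolved with CAS.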
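The forwarding-pointer idea can be sketched with an `AtomicReference`. This is a simplified model, not Shenandoah's implementation: the class `BrooksPointerSketch`, the `Obj` type, and its single `payload` field are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Toy Brooks-style forwarding pointer (hypothetical names; not Shenandoah's code). */
final class BrooksPointerSketch {
    static final class Obj {
        // Points to the object itself until it is evacuated, then to the copy.
        final AtomicReference<Obj> forwardee = new AtomicReference<>();
        final int payload;

        Obj(int payload) {
            this.payload = payload;
            this.forwardee.set(this);
        }
    }

    /** Barrier on access: always follow the forwarding pointer. */
    static Obj resolve(Obj obj) {
        return obj.forwardee.get();
    }

    /** Copies the object and installs the copy with CAS; a losing thread adopts the winner's copy. */
    static Obj evacuate(Obj from) {
        Obj current = from.forwardee.get();
        if (current != from) {
            return current; // already evacuated by another thread
        }
        Obj copy = new Obj(from.payload);
        if (from.forwardee.compareAndSet(from, copy)) {
            return copy;
        }
        return from.forwardee.get(); // lost the race; our copy is discarded
    }

    public static void main(String[] args) {
        Obj original = new Obj(7);
        Obj moved = evacuate(original);
        System.out.println("moved to new copy: " + (moved != original));
        System.out.println("old reference resolves to copy: " + (resolve(original) == moved));
    }
}
```

The CAS guarantees that all threads agree on a single copy even if the collector and an application thread try to evacuate the same object at the same time.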
ZGC
ZGC, inspired by the C4 collector, is a region-based collector that was non-generational until JDK 21 added a generational mode. Regions are classified as small (2 MiB), medium (32 MiB), and large (a multiple of 2 MiB). Its cycle consists of initial marking (root enumeration using OopMaps), concurrent marking (which marks pointers, not objects), concurrent preparation for relocation, concurrent relocation with a forwarding table, and concurrent reference remapping. ZGC stores a few state bits in the high part of 64-bit pointers (colored pointers). In the original layout this left 42 bits for the address, limiting the heap to 4 TiB; later releases raised the limit to 16 TiB.
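The 4 TiB figure follows from the bit layout: 42 address bits give 2^42 bytes. The sketch below models the original layout with the four state bits placed directly above the address; the class `ColoredPointerSketch` and its helper methods are hypothetical, and pointers are plain `long` values rather than real addresses.

```java
/** Toy colored-pointer arithmetic modelled on the original 42-bit layout (hypothetical names). */
final class ColoredPointerSketch {
    static final int ADDRESS_BITS = 42;
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;

    // State bits sit directly above the address bits.
    static final long MARKED0 = 1L << ADDRESS_BITS;
    static final long MARKED1 = 1L << (ADDRESS_BITS + 1);
    static final long REMAPPED = 1L << (ADDRESS_BITS + 2);
    static final long FINALIZABLE = 1L << (ADDRESS_BITS + 3);

    /** Largest heap that the address bits can cover, in bytes. */
    static long maxHeapBytes() {
        return 1L << ADDRESS_BITS;
    }

    /** Strips the state bits and returns the plain address. */
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    /** Returns the pointer with exactly the given color set. */
    static long withColor(long pointer, long color) {
        return address(pointer) | color;
    }

    static boolean hasColor(long pointer, long color) {
        return (pointer & color) != 0;
    }

    public static void main(String[] args) {
        long tib = 1L << 40;
        System.out.println("max heap in TiB: " + (maxHeapBytes() / tib));
        long pointer = withColor(0x1234L, REMAPPED);
        System.out.println("address: 0x" + Long.toHexString(address(pointer)));
        System.out.println("remapped: " + hasColor(pointer, REMAPPED));
    }
}
```

Because the state travels with the pointer, a load barrier can decide from the pointer value alone whether the reference still needs marking or remapping.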
Special Garbage Collector
Epsilon
Epsilon is a no‑op collector that never reclaims memory; it is useful for short‑lived workloads such as Kubernetes jobs where the process finishes before memory is exhausted.
Recommendations
For different workload characteristics, the following JVM GC combinations are suggested:
Serial + Serial Old – suitable for small memory, single‑CPU pods.
Parallel Scavenge + Parallel Old – good for throughput‑oriented, latency‑insensitive services.
ParNew + CMS – low-latency, but may suffer long pauses on full GC for large heaps; only an option on older JDKs, since CMS was deprecated in JDK 9 and removed in JDK 14.
G1 – the default collector since JDK 9 and the preferred general-purpose choice from JDK 11 on, thanks to its parallel full GC (added in JDK 10) and better pause characteristics.
Shenandoah – for applications that prioritize response time over throughput.
ZGC – for applications with extreme latency requirements; production-ready since JDK 15, with a generational mode added in JDK 21.
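Whichever combination is chosen, it is worth confirming which collector the process actually runs with, because the JVM picks a default from the machine's resources when no flag is given. A minimal check using the standard `java.lang.management` API follows; the class name `ActiveCollectors` is made up, and the reported names depend on the JVM and the selected collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Lists the garbage collector beans of the running JVM. */
final class ActiveCollectors {
    /** Returns the names the JVM reports for its active collectors. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time give a first impression of GC activity.
            System.out.println(bean.getName()
                    + ": collections=" + bean.getCollectionCount()
                    + ", timeMs=" + bean.getCollectionTime());
        }
    }
}
```

Running it with, for example, -XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseG1GC, -XX:+UseShenandoahGC, or -XX:+UseZGC shows how the reported names change, provided the JDK build includes that collector.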