Why Does Thread.sleep(0) Appear in RocketMQ? Uncovering the Safepoint Trick
This article examines the puzzling 'prevent gc' comment and Thread.sleep(0) call in RocketMQ’s source, explains how it leverages JVM safepoint mechanics to trigger garbage collection, discusses counted vs uncounted loops, and demonstrates practical code modifications to improve performance.
This article starts with a strange comment, "prevent gc", found in a RocketMQ source file and asks why a seemingly useless Thread.sleep(0) call is present.
The comment means "prevent GC thread from performing garbage collection". The core logic is a single line:
Thread.sleep(0);
The author speculates that this line is intended to create a safepoint, giving the GC thread a chance to run and avoid long stop‑the‑world pauses.
The code originates from org.apache.rocketmq.store.logfile.DefaultMappedFile#warmMappedFile. The author suggests changing the loop index type from int to long and removing the if logic.
The theoretical basis is Java safepoints. The JVM can pause all threads for GC only once every running thread has reached a safepoint. HotSpot treats loops whose index is an int (or a narrower integer type) as "counted loops" and may omit safepoint checks inside them. Using a long index makes the loop an "uncounted loop" that includes a safepoint check.
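The difference between the two loop shapes can be sketched as follows. This is a minimal illustration, not code from RocketMQ; the class and method names are hypothetical, and whether the JIT actually drops the safepoint check from the first loop depends on the JVM version and flags.

```java
public class LoopShapes {
    // int index, constant stride, loop-invariant bound: the shape HotSpot's JIT
    // can classify as a counted loop, where the safepoint check may be omitted.
    public static long sumWithIntIndex(int limit) {
        long sum = 0;
        for (int i = 0; i < limit; i++) {
            sum += i;
        }
        return sum;
    }

    // Same work with a long index: treated as an uncounted loop,
    // so a safepoint check is kept inside the loop.
    public static long sumWithLongIndex(int limit) {
        long sum = 0;
        for (long i = 0; i < limit; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Both variants compute the same result; only the loop shape differs.
        System.out.println(sumWithIntIndex(1_000));
        System.out.println(sumWithLongIndex(1_000));
    }
}
```

The two methods are functionally identical, which is the point: the index type changes only how the compiled loop interacts with safepoints, not what it computes.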
Exploration
Searching the RocketMQ commit history and GitHub issues yields no clear explanation for the comment. An issue (https://github.com/apache/rocketmq/issues/4902) reproduces the discussion, and a StackOverflow question (https://stackoverflow.com/questions/53284031/why-thread-sleep0-can-prevent-gc-in-rocketmq) provides an answer stating that Thread.sleep(0) gives the GC thread a chance to run, potentially increasing GC frequency but preventing long pauses.
The answer clarifies that the code aims to "trigger" GC rather than avoid it, effectively spreading GC work to avoid long pauses.
According to the JVM book, safepoints are required before GC can stop the world. Counted loops may delay reaching a safepoint until the loop finishes, causing the GC thread to wait.
HotSpot avoids placing safepoints in loops with small integer indices (counted loops); using a larger type (long) creates an uncounted loop with safepoints.
Examples from other articles (e.g., HBase safepoint case) illustrate the problem.
Practice
A test program demonstrates the issue:
<code>import java.util.concurrent.atomic.AtomicInteger;

public class MainTest {
    public static AtomicInteger num = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        // Each worker runs a long loop with an int index (a counted loop).
        Runnable runnable = () -> {
            for (int i = 0; i < 1000000000; i++) {
                num.getAndAdd(1);
            }
            System.out.println(Thread.currentThread().getName() + " execution finished!");
        };
        Thread t1 = new Thread(runnable);
        Thread t2 = new Thread(runnable);
        t1.start();
        t2.start();
        // Intended to wake after 1 s and print a partial count.
        Thread.sleep(1000);
        System.out.println("num = " + num);
    }
}
</code>
The main thread sleeps for 1 s but actually waits for the two long loops to finish, because they are counted loops without safepoint checks. The sequence of events is:
Two long, uninterrupted loops start.
Main thread sleeps 1 s.
After 1 s the JVM tries to stop at a safepoint, but must wait for the loops to finish.
The main thread’s Thread.sleep returns from native code, sees a safepoint in progress, and blocks until it completes.
Changing the loop index to long makes the program behave normally.
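A sketch of the modified test is shown here. It differs from the original only in the index type; the counter(...) helper and the class name are additions made for illustration so that the iteration count can be varied.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class MainTestLongIndex {
    public static AtomicInteger num = new AtomicInteger(0);

    // Hypothetical helper: builds the worker with a configurable iteration count.
    public static Runnable counter(long iterations) {
        return () -> {
            // long index: the loop is uncounted, so it keeps a safepoint check.
            for (long i = 0; i < iterations; i++) {
                num.getAndAdd(1);
            }
            System.out.println(Thread.currentThread().getName() + " execution finished!");
        };
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable runnable = counter(1_000_000_000L);
        Thread t1 = new Thread(runnable);
        Thread t2 = new Thread(runnable);
        t1.start();
        t2.start();
        Thread.sleep(1000);
        // With the long index, this line is expected to print a partial count
        // after roughly 1 s instead of waiting for both loops to finish.
        System.out.println("num = " + num);
    }
}
```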
Applying the same idea to the original RocketMQ code (by inserting a Thread.sleep(0) or using a long index) forces a safepoint inside the loop.
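The Thread.sleep(0) variant can be sketched as below. This is a simplified illustration under stated assumptions, not the actual warmMappedFile implementation: the class name, the YIELD_EVERY interval, and the 4 KB PAGE_SIZE constant are all hypothetical, and the buffer is assumed to be far smaller than Integer.MAX_VALUE bytes so the int offset cannot overflow.

```java
import java.nio.ByteBuffer;

public class WarmUpSketch {
    static final int PAGE_SIZE = 4 * 1024; // assumed page size (4 KB)
    static final int YIELD_EVERY = 1000;   // hypothetical interval, in pages

    // Writes one zero byte per page and returns the number of pages touched.
    public static int warm(ByteBuffer buffer) {
        int touched = 0;
        for (int i = 0; i < buffer.capacity(); i += PAGE_SIZE) {
            buffer.put(i, (byte) 0);
            touched++;
            if (touched % YIELD_EVERY == 0) {
                try {
                    // Native call inside an int-indexed loop: on return the
                    // thread checks whether a safepoint has been requested.
                    Thread.sleep(0);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop warming early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return touched;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024);
        System.out.println("pages touched = " + warm(buffer));
    }
}
```

Calling Thread.sleep(0) only every YIELD_EVERY pages keeps the cost of the native call low while still bounding how long the loop can run without reaching a safepoint.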
Simple timing tests show little difference on the author’s machine, but the technique is considered a high‑level optimization.
Additional Note
The article also mentions a file‑pre‑warming method that writes zeros to a ByteBuffer in 4 KB chunks, referencing a Tianchi competition where participants used file pre‑warming.
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.