Understanding Performance Slowdown and GC Behavior During Sudden Traffic Spikes in Java Backend Services
The article explains why a Java‑based backend experiences higher CPU usage, reduced thread count, and frequent Young GC when traffic spikes, detailing the impact of Tomcat thread limits, JVM garbage‑collection pauses, and memory reclamation on overall service performance.
Author: Zhang Songran Bio: Zhang Songran, Architect in JD.com Merchant R&D Department, with 10 years of software development and design experience, specializing in building high‑performance, highly‑available large‑scale distributed systems.
Question
Recently we received performance alerts showing that a method had slowed down during a sudden traffic surge. MySQL queries were still fast, yet JVM metrics showed more frequent Young GC and high CPU usage, while the thread count unexpectedly dropped. Why would the thread count decrease, given that Young GC starts additional GC threads?
Answer
Why is performance slower?
Even though the MySQL database the service depends on is not a bottleneck, the sudden traffic surge forces the system to create many threads. These threads sit in the Running or Runnable state and compete for a limited number of CPU cores, so the operating system has to time‑slice and context‑switch between them, which raises CPU utilization. Multithreading can increase overall throughput, but the response time of each individual request becomes longer.
Typical deployments use Nginx in front of Tomcat. Tomcat’s default maximum thread count (maxThreads) is 200, and Tomcat 7 and earlier default to the blocking BIO connector. Once every worker thread is busy, additional requests wait in a queue until a thread becomes free, which further increases the latency that callers perceive.
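The queueing effect can be made concrete with a deliberately simplified model. The sketch below assumes that a whole burst of requests arrives at the same instant and that every request needs the same service time; the class name, the method name, and the values 200 threads and 50 ms are illustrative assumptions, not measurements from the service discussed here.

```java
public class QueueLatencyModel {

    /**
     * Latency in milliseconds seen by the request at position {@code index}
     * (0-based) in a burst that arrives all at once, when at most
     * {@code maxThreads} requests are served concurrently and each one
     * takes {@code serviceMs} to process. Simplified model, not Tomcat code.
     */
    public static long latencyMs(int index, int maxThreads, long serviceMs) {
        if (index < 0 || maxThreads <= 0 || serviceMs < 0) {
            throw new IllegalArgumentException("invalid model parameters");
        }
        // Number of full "waves" of requests that must finish before this one starts.
        long wavesAhead = index / maxThreads;
        return (wavesAhead + 1) * serviceMs;
    }

    public static void main(String[] args) {
        int maxThreads = 200;   // assumed worker-thread limit
        long serviceMs = 50;    // assumed per-request processing time
        for (int index : new int[] {0, 199, 200, 999}) {
            System.out.println("request #" + index + " latency = "
                    + latencyMs(index, maxThreads, serviceMs) + " ms");
        }
    }
}
```

In this model the first 200 requests each take 50 ms, while the 1000th request waits behind four full waves and takes 250 ms, even though no single request became slower to process.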
Why did the thread count decrease?
Our JVM configuration shows that Young GC uses the PS Scavenge collector and Full GC uses PS MarkSweep.
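One way to confirm which collectors a running JVM uses is to query the standard java.lang.management API. The following is a minimal sketch; the class name GcCollectorNames is a placeholder chosen for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GcCollectorNames {

    /** Returns the names of the garbage collectors registered in this JVM. */
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time are totals since JVM start.
            System.out.println(gc.getName()
                    + " pools=" + Arrays.toString(gc.getMemoryPoolNames())
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

The names printed depend on the JVM version and flags. On a HotSpot JVM started with -XX:+UseParallelGC the collectors are typically reported as PS Scavenge and PS MarkSweep, matching the configuration described above.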
Parallel GC means that multiple GC threads do the collection work in parallel while user threads are paused. Concurrent GC means that user threads and GC threads are active during the same period, though not necessarily in parallel: user threads keep executing while GC work runs on another CPU or is interleaved with them.
PS Scavenge is a parallel collector, so every Young GC is a stop‑the‑world pause. User threads are suspended, not destroyed, but while they are suspended the number of active threads that monitoring reports drops temporarily. Once GC finishes, the user threads resume and the thread count rises again.
Why are there so many Young GCs?
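Young GC collects the Young Generation. It is triggered when Eden has no room for a new allocation; live objects in Eden and the in‑use Survivor space are copied to the other Survivor space (or promoted to the Old Generation), and everything left behind is reclaimed. Thread stacks are not part of the heap, but every request‑handling thread allocates short‑lived objects in Eden. With many concurrent threads the allocation rate rises sharply, Eden fills faster, and Young GCs become frequent.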
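The link between allocation rate and collection frequency can be observed with a small experiment. The sketch below allocates many short‑lived buffers and compares the JVM's total collection count before and after; the class name, buffer size, and iteration count are arbitrary choices for illustration, and how many collections actually occur depends on the heap size and the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class YoungGcPressure {

    /** Sums collection counts over all collectors; -1 ("undefined") is skipped. */
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Allocates {@code iterations} buffers that become garbage immediately. */
    public static long allocateGarbage(int iterations, int bytesEach) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] buffer = new byte[bytesEach]; // unreachable after this iteration
            buffer[0] = (byte) i;                // touch the buffer so it is really used
            checksum += buffer[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalGcCount();
        allocateGarbage(200_000, 16 * 1024);     // roughly 3 GB of short-lived data
        long after = totalGcCount();
        System.out.println("collections during allocation: " + (after - before));
    }
}
```

Running the same loop from several threads at once approximates what a traffic spike does to Eden: the data per request stays the same, but the garbage produced per second multiplies.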
Most JVMs use a cooperative suspension mechanism: when GC needs to pause the application, it does not stop threads directly but sets a flag. Running threads poll this flag when they reach a safepoint and suspend themselves if it is set.
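The polling idea can be imitated in ordinary Java code. The sketch below is only an analogy at application level, not how the JVM implements safepoints internally: a worker thread checks a flag on every loop iteration and parks itself when the flag is set, while a second thread plays the role of the collector. All class, field, and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SafepointPollSketch {
    private final Object lock = new Object();
    private final AtomicLong progress = new AtomicLong();
    private volatile boolean pauseRequested;   // the "safepoint flag"
    private volatile boolean running = true;
    private boolean parked;                    // guarded by lock

    /** Application work: do one unit of work, then poll the flag. */
    public void workerLoop() {
        while (running) {
            progress.incrementAndGet();
            if (pauseRequested) {              // the poll
                parkUntilResumed();
            }
        }
    }

    private void parkUntilResumed() {
        synchronized (lock) {
            parked = true;
            lock.notifyAll();                  // tell the pauser we have stopped
            try {
                while (pauseRequested) {
                    lock.wait();
                }
            } catch (InterruptedException e) {
                running = false;
                Thread.currentThread().interrupt();
            } finally {
                parked = false;
            }
        }
    }

    /** Requests a pause, waits until the worker parks, and returns the work done while paused. */
    public long pauseAndObserve(long pauseMs) throws InterruptedException {
        synchronized (lock) {
            pauseRequested = true;
            try {
                while (!parked) {
                    lock.wait();
                }
                long before = progress.get();
                Thread.sleep(pauseMs);         // stand-in for stop-the-world GC work
                return progress.get() - before;
            } finally {
                pauseRequested = false;
                lock.notifyAll();              // let the worker resume
            }
        }
    }

    /** Runs one pause cycle and returns how much the worker progressed during it. */
    public static long runDemo() {
        SafepointPollSketch sketch = new SafepointPollSketch();
        Thread worker = new Thread(sketch::workerLoop, "app-thread");
        worker.start();
        try {
            long delta = sketch.pauseAndObserve(20);
            sketch.running = false;
            worker.join();
            return delta;
        } catch (InterruptedException e) {
            sketch.running = false;
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("work done while paused: " + runDemo());
    }
}
```

The worker makes no progress while it is parked, which mirrors what callers experience during a stop‑the‑world pause: the request thread exists, but it does nothing until the pause ends.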
Why does heap memory decrease?
We observed a reduction in heap memory usage.
PS MarkSweep uses a mark‑sweep‑compact algorithm on the Old Generation, while PS Scavenge uses a copying algorithm on the Young Generation. In both cases GC reclaims objects that are no longer reachable through any reference, so heap occupancy drops after each collection.
Reflection
In gateway‑type systems, frequent Young and Full GCs cause stop‑the‑world pauses that can slow or stall caller requests and, in extreme cases, lead to 502 errors. Therefore, when designing such systems, it is advisable to keep both the frequency and the duration of Young and Full GCs as low as possible.
JD Retail Technology
Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.