Full GC Root Cause Analysis and Resolution in Java Applications

This article documents a step‑by‑step investigation of a high TP99 caused by frequent Full GC in a Java service, describing the diagnostic mindset, tools used, GC trigger conditions, object promotion mechanisms, the impact of AdaptiveSizePolicy and Metaspace, and the concrete configuration and code changes that eliminated the issue.

JD Tech
This article records the investigation of frequent Full GC events that caused an abnormally high TP99. It aims to give newcomers a clear troubleshooting mindset and practical steps.

Because most production servers lack SSH access, the investigation relied on platform‑provided tools such as JDOS container monitoring, JDOS process query, SGM container monitoring, and SGM method call query:

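JDOS container smart monitoring: view container CPU, memory, disk, and I/O metrics
JDOS process query: look up the Java process ID and run common Java memory and process inspection commands
SGM container monitoring: view the history of JVM memory changes
SGM method call query: view the upstream and downstream dependencies and the time breakdown of a single call to a key interface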

The problem originated from occasional interface timeouts observed around 10 am, which were traced back to frequent Full GC events.

The Full GC trigger conditions considered were:

Explicit System.gc() call (not present).

Old generation space shortage.

Metaspace (method area) shortage.

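The average size promoted to the old generation by previous Minor GCs exceeds the old generation's remaining free space.

Large objects are allocated directly in the old generation and it lacks enough free space to hold them.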

Monitoring showed the old generation repeatedly reaching 90 % occupancy at the same timestamps as Full GC spikes.
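
Where the platform dashboards are unavailable, the same correlation between old-generation occupancy and collection counts can be approximated from inside the JVM with the standard java.lang.management beans. The following is a minimal sketch, not the tooling used in the investigation. The class name OldGenOccupancySketch and the name-matching rule in looksLikeOldGen are assumptions; pool names vary by collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

final class OldGenOccupancySketch {

    // Percentage of a pool in use; returns -1 when the pool reports no maximum.
    static double occupancyPercent(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0;
        }
        return 100.0 * usedBytes / maxBytes;
    }

    // Assumption: old-generation pools are named like "PS Old Gen",
    // "G1 Old Gen", "CMS Old Gen" or "Tenured Gen".
    static boolean looksLikeOldGen(String poolName) {
        return poolName.contains("Old Gen") || poolName.contains("Tenured");
    }

    public static void main(String[] args) {
        // Report current occupancy of every heap pool that looks like the old generation.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage != null && pool.getType() == MemoryType.HEAP
                    && looksLikeOldGen(pool.getName())) {
                System.out.printf("%s: %.1f%% used%n", pool.getName(),
                        occupancyPercent(usage.getUsed(), usage.getMax()));
            }
        }
        // Report cumulative collection counts and times per collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Sampling these values periodically and logging them alongside request latency makes it easier to check whether TP99 spikes line up with old-generation collections.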

The object promotion scenarios examined were:

Age‑based promotion after surviving 15 Minor GCs.

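Large objects exceeding -XX:PretenureSizeThreshold.

Dynamic age judgment: when objects up to a given age together occupy more than 50 % of a Survivor space (the default -XX:TargetSurvivorRatio), objects at or above that age are promoted.

Space guarantee mechanism: when the Survivor space cannot hold the objects that survive a Minor GC, the overflow is moved directly to the old generation.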
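
The dynamic age rule is easier to follow with numbers. The sketch below models how a tenuring threshold could be derived from a per-age size histogram: sizes are accumulated from age 1 upward, and the first age at which the running total exceeds the desired Survivor occupancy becomes the threshold. It is an illustrative model under stated assumptions, not the JVM's source; the class name, method name, and sample figures are hypothetical.

```java
final class TenuringThresholdSketch {

    /**
     * bytesByAge[i] is the total size of Survivor objects of age i (index 0 unused).
     * Objects whose age is at or above the returned threshold would be promoted
     * at the next Minor GC.
     */
    static int computeThreshold(long[] bytesByAge, long survivorCapacity,
                                int targetSurvivorRatio, int maxTenuringThreshold) {
        // Desired occupancy, e.g. 50 % of one Survivor space.
        long desired = survivorCapacity * targetSurvivorRatio / 100;
        long total = 0;
        int age = 1;
        while (age < maxTenuringThreshold) {
            if (age < bytesByAge.length) {
                total += bytesByAge[age];
            }
            if (total > desired) {
                break; // cumulative size up to this age no longer fits
            }
            age++;
        }
        return age;
    }

    public static void main(String[] args) {
        // Survivor capacity 100 units, ratio 50 %: ages 1..3 add up to 60 > 50.
        long[] crowded = {0, 20, 20, 20, 5};
        System.out.println(computeThreshold(crowded, 100, 50, 15)); // prints 3

        // Only 20 units in total: nothing exceeds the limit, so the maximum applies.
        long[] sparse = {0, 10, 10};
        System.out.println(computeThreshold(sparse, 100, 50, 15)); // prints 15
    }
}
```

In the first case, objects are promoted after surviving only three collections instead of fifteen, which illustrates how a small or crowded Survivor space accelerates old-generation growth.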

Analyzing heap dumps with MAT (Eclipse Memory Analyzer) helped rule out large objects and long‑lived static maps.

The investigation pinpointed two key causes:

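-XX:+UseAdaptiveSizePolicy was enabled. Because this policy resizes Eden and the Survivor spaces dynamically, it can shrink the Survivor spaces; objects were then promoted to the old generation prematurely, which accelerated old‑gen growth and triggered Full GC.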

Metaspace growth (up to ~300 MB) also forced Full GC.
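
Metaspace growth of this kind usually shows up as a steadily rising loaded-class count. The following minimal sketch reads both figures through the standard management beans. The class name MetaspaceWatchSketch is hypothetical, and the pool name "Metaspace" is an assumption that holds on HotSpot JDK 8 and later; the helper returns -1 if no such pool is found.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class MetaspaceWatchSketch {

    // Bytes used by the pool named "Metaspace", or -1 if the JVM exposes no such pool.
    static long metaspaceUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getUsed();
            }
        }
        return -1L;
    }

    // One-line summary suitable for periodic logging.
    static String snapshot() {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        return String.format(
                "metaspaceUsed=%d bytes, loadedNow=%d, totalLoaded=%d, unloaded=%d",
                metaspaceUsedBytes(),
                classes.getLoadedClassCount(),
                classes.getTotalLoadedClassCount(),
                classes.getUnloadedClassCount());
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

If totalLoaded keeps climbing while the application is in a steady state, some code path is probably generating or loading classes on every request.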

Two changes were applied. First, AdaptiveSizePolicy was disabled:

-XX:-UseAdaptiveSizePolicy

Second, JVM flags were added to trace class loading and unloading and to print GC details:

-XX:+TraceClassUnloading -XX:+TraceClassLoading -XX:+PrintGCDetails

After disabling AdaptiveSizePolicy and fixing a frequent class‑loading pattern (the com.googlecode.aviator.Expression rule‑engine class), Full GC frequency dropped from hourly to negligible.
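
The article does not spell out how the class‑loading pattern was fixed. One common remedy when a rule engine generates a new class for every compilation is to compile each distinct expression once and reuse the result. The sketch below shows that idea with a generic cache; CompiledRuleCacheSketch and the stand-in compiler function are hypothetical and do not use the Aviator API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class CompiledRuleCacheSketch<T> {

    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> compiler;
    private final AtomicInteger compileCount = new AtomicInteger();

    // compiler stands in for whatever call turns rule text into a compiled expression.
    CompiledRuleCacheSketch(Function<String, T> compiler) {
        this.compiler = compiler;
    }

    // Compiles the expression on first use only; later calls reuse the cached result.
    T get(String expressionText) {
        return cache.computeIfAbsent(expressionText, text -> {
            compileCount.incrementAndGet();
            return compiler.apply(text);
        });
    }

    int compileCount() {
        return compileCount.get();
    }

    public static void main(String[] args) {
        CompiledRuleCacheSketch<String> rules =
                new CompiledRuleCacheSketch<>(text -> "compiled:" + text);
        for (int i = 0; i < 1000; i++) {
            rules.get("price > 100"); // same text on every request
        }
        rules.get("stock < 5");
        System.out.println(rules.compileCount()); // prints 2
    }
}
```

This cache is unbounded, so it suits a fixed set of rules. If expression text is built dynamically per request, a size-limited cache or parameterized expressions would be needed to avoid trading Metaspace growth for heap growth.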

Final recommendations emphasize narrowing down direct causes, asking “why” at each clue, and using the presented Full GC troubleshooting flowchart as a reference.
