Understanding and Troubleshooting Out‑of‑Memory Issues in Java Applications
This article explains the causes of Java out‑of‑memory problems, introduces essential diagnostic commands and memory‑analysis tools, and walks through a real‑world case study that shows how to locate, analyze, and fix a memory leak caused by unreleased OSS client connections.
An out‑of‑memory (OOM) error occurs when a Java application exhausts its allocated heap, often due to high concurrency, large data loads, unreleased resources, infinite loops, third‑party library bugs, or undersized JVM start‑up parameters (for example, too small an -Xmx heap limit).
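The "unreleased resources" case can be reproduced in a few lines. This is an illustrative sketch, not code from the incident: a static collection keeps growing because nothing ever evicts its entries, so retained heap climbs until the JVM throws `java.lang.OutOfMemoryError: Java heap space`.

```java
import java.util.ArrayList;
import java.util.List;

// A deliberately leaky cache: entries are added on every request but never
// removed, so the list's retained heap only grows. Left running, this pattern
// ends in java.lang.OutOfMemoryError: Java heap space.
public class LeakyCache {
    private static final List<byte[]> CACHE = new ArrayList<>();

    // Each call pins roughly another 1 MB of heap; nothing ever releases it.
    public static void handleRequest() {
        CACHE.add(new byte[1024 * 1024]);
    }

    public static int entries() {
        return CACHE.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            handleRequest();
        }
        System.out.println("cached entries: " + entries());
    }
}
```

Because the list is reachable from a static field, the garbage collector can never reclaim its contents, which is exactly the shape of leak a heap-dump analyzer surfaces as a large retained set.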
Common diagnostic commands:
jps: displays the process IDs of all running Java processes.
jstat: shows heap usage, class‑loading statistics, and garbage‑collection activity.
jstack: prints Java thread stack traces, useful for detecting deadlocks.
jmap: dumps a heap snapshot for offline analysis.
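The figures that jstat reports externally can also be read from inside the JVM through the standard java.lang.management API. A minimal sketch that prints heap usage plus per-collector GC counts and times, the same signals used below to confirm GC pressure:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Reads heap occupancy and GC activity via the java.lang.management API,
// mirroring the kind of numbers jstat -gc reports from outside the process.
public class GcSnapshot {
    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        System.out.printf("heap used: %d MB of %d MB max%n",
                heap.getUsed() >> 20, heap.getMax() >> 20);

        // One MXBean per collector (e.g. young-generation vs old-generation).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

A steadily climbing old-generation collection count from the second loop is the in-process equivalent of the frequent Full GC events jstat reveals.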
Memory‑analysis tools:
MemoryAnalyzer (Eclipse MAT): a powerful standalone or Eclipse‑integrated heap analyzer.
JProfiler: commercial profiler that integrates with IDEs such as IntelliJ IDEA.
jconsole: a GUI tool for monitoring JVM metrics, including remote VMs.
Case study – diagnosing a production OOM incident:
1. At 19:30, a batch server (PID 18713) showed CPU usage near 90 % and memory usage around 60 %.
2. Initial suspicion fell on code loops or excessive request threads, but traffic was low and no code changes had been deployed.
3. Running top -H -p 18713 (the per‑thread view) identified four high‑CPU threads; their IDs, converted to hex, were 0x4922‑0x4925.
4. jstack -l 18713 > a.txt showed that the threads with matching nid values were JVM garbage‑collection threads, not business‑logic threads.
5. jstat -gc 18713 5000 showed frequent Full GC events, confirming GC pressure.
6. A heap dump was generated with jmap -dump:format=b,file=dump.hprof 18713 and opened in MemoryAnalyzer.
7. Analysis highlighted that >90 % of retained heap was occupied by HTTP connection objects originating from the OSS client library.
8. Code review confirmed that the ossClient instances were not closed after use, causing a memory leak that also increased GC activity and CPU load.
9. After fixing the code to properly close the OSS client, the batch job was restarted and the OOM issue disappeared.
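The fix in step 9 amounts to guaranteeing that every client instance is released after use (with the Aliyun OSS SDK, typically by calling ossClient.shutdown() in a finally block). The pattern can be sketched with try‑with‑resources; the FakeOssClient class here is a hypothetical stand‑in, since the real SDK is not assumed to be on the classpath:

```java
// Demonstrates the resource-release pattern behind the fix: try-with-resources
// guarantees close() runs even when the work inside the block throws, so the
// client can never leak pooled connections the way the unclosed ossClient did.
public class ClientLifecycle {
    // Hypothetical stand-in for an OSS/HTTP client that pins connections
    // until it is explicitly closed.
    static class FakeOssClient implements AutoCloseable {
        boolean closed = false;
        void putObject(String key) { /* upload that holds a pooled connection */ }
        @Override public void close() { closed = true; }
    }

    public static FakeOssClient uploadSafely(String key) {
        FakeOssClient client = new FakeOssClient();
        // close() is invoked automatically when this block exits,
        // on the normal path and on the exception path alike.
        try (FakeOssClient c = client) {
            c.putObject(key);
        }
        return client;
    }

    public static void main(String[] args) {
        System.out.println("closed after use: " + uploadSafely("report.csv").closed);
    }
}
```

Wrapping client lifetimes this way removes the whole class of leak found in the dump: no code path can return or throw past the release.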
Summary: Quickly locating an OOM requires a systematic approach: monitor alerts, identify the suspect process, use jps, jstat, jstack, and jmap, then analyze the heap dump with a tool like MAT. Always ensure resources such as OSS clients are closed to prevent leaks.
360 Quality & Efficiency
360 Quality & Efficiency focuses on seamlessly integrating quality and efficiency in R&D, sharing 360’s internal best practices with industry peers to foster collaboration among Chinese enterprises and drive greater efficiency value.