Analyzing a JVM Memory Leak Caused by a Custom LRU Cache Implementation
This article walks through a production JVM memory‑leak incident, detailing how a static LRU cache built on LinkedHashMap caused unreclaimed objects, the concurrency pitfalls of its design, and practical steps to diagnose and fix such leaks in Java backend systems.
JVM‑level problems are a common pain point for developers because the JVM runtime behaves like a black box, making it hard to pinpoint issues when they arise.
In this article, a recent production memory‑leak incident is examined, starting with an alarm triggered by high old‑generation heap usage on a live server.
Initial observations show the old‑generation memory growing steadily after mid‑July without being reclaimed, indicating objects that cannot be garbage‑collected.
The troubleshooting steps include obtaining a heap dump, using MAT (or manual analysis due to dump size), and identifying suspicious objects such as Point and GeoDispLocal that have millions of instances.
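The dump-collection step can also be done in-process. As a sketch (the `HeapDumpHelper` class and its method name are illustrative, not from the article), the JDK's `HotSpotDiagnosticMXBean` writes the same `.hprof` format that MAT reads:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeapDumpHelper {

    // Captures a heap dump of the current JVM to the given .hprof path.
    public static void dumpHeap(String path) throws Exception {
        HotSpotDiagnosticMXBean mxBean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        // live = true: dump only reachable objects (forces a GC first),
        // which is exactly what you want when hunting unreclaimed objects
        mxBean.dumpHeap(path, true);
    }

    public static void main(String[] args) throws Exception {
        Path out = Files.createTempDirectory("dump").resolve("heap.hprof");
        dumpHeap(out.toString());
        System.out.println("Wrote " + Files.size(out) + " bytes to " + out);
    }
}
```

On a live server, `jmap -dump:live,format=b,file=heap.hprof <pid>` is the more common route; the programmatic version is handy when shell access to the box is limited.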
These objects are stored in a custom static CacheMap, which is suspected to be the leak source. The article presents the implementation of CacheMap and its underlying LRUMap, which extends LinkedHashMap with a fixed capacity and read-write locks. The caches are declared as static fields:

```java
private static final CacheMap<String, List<Point>> NEAR_DISTRICT_CACHE =
        new CacheMap<>(3600 * 1000, 1000);   // 1-hour expiry, at most 1000 entries
private static final CacheMap<String, GeoDispLocal> LOCAL_POINT_CACHE =
        new CacheMap<>(3600 * 1000, 6000);   // 1-hour expiry, at most 6000 entries
```

The CacheMap class holds the expiration time and an internal LRUMap:
```java
public class CacheMap<K, V> {

    private final long expireMs;      // time-to-live for cached entries
    private LRUMap<K, V> valueMap;    // size-bounded backing map

    // other members omitted
}
```

The LRUMap extends LinkedHashMap and overrides removeEldestEntry to enforce a maximum size, while using a ReadWriteLock to protect get and put operations:
```java
public class LRUMap<K, V> extends LinkedHashMap<K, V> {

    private final int maxCapacity;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public LRUMap(int maxCapacity) {
        // accessOrder = true: every get() moves the accessed entry to the tail
        super(maxCapacity, 0.99f, true);
        this.maxCapacity = maxCapacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once the map exceeds maxCapacity
        return size() > maxCapacity;
    }

    @Override
    public V get(Object key) {
        lock.readLock().lock();
        try {
            return super.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    @Override
    public V put(K key, V value) {
        lock.writeLock().lock();
        try {
            return super.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // remove, clear omitted
}
```

Although the design aims to cap the map's size and handle concurrency, it contains a subtle flaw: because the map is constructed with accessOrder = true, LinkedHashMap.get() is not a read-only operation. Internally it relinks the accessed node to the tail of the doubly linked list that maintains LRU order. Since all readers share the read lock, concurrent get() calls can interleave these pointer updates, corrupting the linked list so that entries can no longer be properly unlinked and evicted, which results in a memory leak.
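The claim that get() behaves like a write can be checked in a few lines: with accessOrder = true, a plain read visibly reorders the map. (The class and variable names below are illustrative, not from the original code.)

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AccessOrderDemo {
    public static void main(String[] args) {
        // Same constructor shape as the article's LRUMap: accessOrder = true
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        map.get("a");  // looks like a pure read, but relinks "a" to the list tail

        // Iteration order has changed from [a, b, c] to [b, c, a]
        System.out.println(map.keySet());  // prints [b, c, a]
    }
}
```

Under a shared read lock, two threads can perform that relinking simultaneously on overlapping nodes, which is exactly how the before/after pointers end up inconsistent.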
The analysis explains how concurrent get operations can interleave, causing node pointers to become inconsistent and preventing proper removal of entries, which ultimately leaks memory.
To fix the issue, the article suggests either replacing the read‑write lock with a mutual‑exclusion lock, so that get() also runs exclusively, or moving the data to an external distributed cache, which avoids the flawed custom LRU implementation altogether.
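A minimal sketch of the first suggested fix, assuming the same LinkedHashMap-based design: guard both get and put with a single ReentrantLock, since get() mutates the access-order list. (The class name SafeLRUMap is my own; the article does not name its corrected version.)

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

public class SafeLRUMap<K, V> extends LinkedHashMap<K, V> {

    private final int maxCapacity;
    private final ReentrantLock lock = new ReentrantLock();

    public SafeLRUMap(int maxCapacity) {
        super(maxCapacity, 0.99f, true);  // accessOrder = true, as before
        this.maxCapacity = maxCapacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxCapacity;
    }

    @Override
    public V get(Object key) {
        lock.lock();
        try {
            return super.get(key);  // relinks the node, now under exclusion
        } finally {
            lock.unlock();
        }
    }

    @Override
    public V put(K key, V value) {
        lock.lock();
        try {
            return super.put(key, value);
        } finally {
            lock.unlock();
        }
    }

    // remove, clear, iteration, etc. would need the same locking (omitted)

    public static void main(String[] args) {
        SafeLRUMap<String, Integer> cache = new SafeLRUMap<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // "a" becomes most recently used
        cache.put("c", 3);  // capacity exceeded: least recently used "b" is evicted
        System.out.println(cache.keySet());  // prints [a, c]
    }
}
```

Wrapping the access-ordered map with Collections.synchronizedMap achieves the same effect; either way the read path gives up concurrency in exchange for correctness, which is why the article also floats an external cache as an alternative.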
Overall, the piece demonstrates a systematic approach to diagnosing JVM memory leaks and highlights the pitfalls of custom LRU caches in multithreaded environments.