Cache Consistency, Concurrency, Penetration, Avalanche, and Bottomless Pit Issues and Mitigation Strategies
This article explains common cache problems (consistency, concurrency, penetration, bouncing, avalanche, and the bottomless-pit phenomenon) and presents practical mitigation techniques such as active updates, locking, empty-object caching, request filtering, consistent hashing, and multi-level caching for building reliable, high-performance systems.
Cache Consistency Issues
When data freshness is critical, the cache must stay consistent with the database and with replica nodes, preventing any divergence. This relies on expiration and update policies, typically updating or evicting cache entries immediately when the underlying data changes.
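The "evict on change" policy can be sketched as follows. This is a minimal illustration, not a production design: the `Store` class is hypothetical, and its `db` and `cache` attributes are plain dictionaries standing in for a real database and cache.

```python
class Store:
    """Hypothetical cache-aside wrapper; dicts stand in for the DB and cache."""

    def __init__(self):
        self.db = {}     # assumed source of truth
        self.cache = {}  # assumed cache in front of it

    def read(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # On a miss, load from the database and populate the cache.
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Update the source of truth first, then evict the cached copy
        # so the next read reloads the fresh value.
        self.db[key] = value
        self.cache.pop(key, None)
```

Evicting rather than overwriting the cached value keeps the write path simple; the cost is one extra cache miss on the next read.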
Cache Concurrency Issues
After a cache entry expires, multiple concurrent requests may simultaneously query the backend database, causing a massive load and potentially an "avalanche". Additionally, while a cache key is being updated, many requests may still try to read it, leading to consistency problems. A common solution is to use a lock: acquire a lock before updating or fetching from the database, release it after completion, and let other requests wait briefly before reading the refreshed cache.
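A minimal in-process sketch of the locking approach is shown below. The `LockedCache` class and its `loader` callback are assumptions made for illustration. A distributed deployment would need a distributed lock instead of `threading.Lock`, and expiry is omitted for brevity.

```python
import threading

_MISSING = object()  # sentinel: key is not in the cache


class LockedCache:
    """Hypothetical cache that lets only one thread rebuild a given key."""

    def __init__(self, loader):
        self._loader = loader          # assumed function that queries the DB
        self._data = {}
        self._locks = {}
        self._guard = threading.Lock() # protects the per-key lock table

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        value = self._data.get(key, _MISSING)
        if value is not _MISSING:
            return value
        # Only one thread per key gets past this point at a time.
        with self._lock_for(key):
            # Re-check: another thread may have filled the entry
            # while this one was waiting for the lock.
            value = self._data.get(key, _MISSING)
            if value is _MISSING:
                value = self._loader(key)
                self._data[key] = value
            return value
```

With this structure, concurrent requests for the same missing key trigger a single backend query; the waiting threads read the refreshed entry once the lock is released.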
Cache Penetration Issues
Cache penetration (sometimes called "breakthrough") is often mistaken for a cache failure or expiration that sends massive traffic to the database. In fact, it occurs when a key is queried repeatedly but the underlying data does not exist: every lookup misses the cache and falls through to the database, which returns nothing. Under high concurrency these queries are wasted work for the database.
Common mitigation methods include:
1. Cache Empty Objects
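Cache empty results (e.g., an empty collection or a placeholder object) to prevent repeated database hits for non-existent data. Give these entries their own, usually shorter, TTL so that data created later becomes visible and placeholders do not occupy the cache indefinitely.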
2. Dedicated Filtering
Intercept requests for keys known to have empty results and handle them uniformly, avoiding database access; this approach is more complex but suitable for low‑hit, infrequently updated data.
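The first method, caching empty objects, might look like the sketch below. The class name, the `loader` callback, and the two TTL defaults are illustrative assumptions; the `clock` parameter exists only so the expiry logic can be exercised without sleeping.

```python
import time

_EMPTY = object()  # placeholder stored when the database has no such row


class NullCachingCache:
    """Hypothetical cache that remembers misses for a short time."""

    def __init__(self, loader, ttl=300.0, empty_ttl=30.0, clock=time.monotonic):
        self._loader = loader        # assumed DB lookup; returns None if absent
        self._ttl = ttl              # lifetime of real values, in seconds
        self._empty_ttl = empty_ttl  # shorter lifetime for placeholders
        self._clock = clock
        self._data = {}              # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            value = entry[0]
            return None if value is _EMPTY else value
        # Miss or expired entry: ask the database once.
        value = self._loader(key)
        if value is None:
            self._data[key] = (_EMPTY, now + self._empty_ttl)
        else:
            self._data[key] = (value, now + self._ttl)
        return value
```

The short placeholder TTL is a trade-off: it bounds how long newly created data can stay invisible, at the cost of occasionally re-querying the database for keys that are still absent.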
Cache Bouncing Issues
Also known as "cache jitter", this is a milder fault than an avalanche, but it can still degrade system performance for a period. It is usually caused by cache node failures. The commonly recommended mitigation is consistent hashing, which limits how many keys are remapped when a node leaves or joins the cluster.
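To illustrate why consistent hashing helps, here is a small hash-ring sketch. The `HashRing` class, the choice of SHA-256, and the default of 100 virtual nodes per server are assumptions for this example rather than recommendations from the original article.

```python
import bisect
import hashlib


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (point, node_name)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        # Each node owns several points so load spreads more evenly.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [p for p in self._ring if p[1] != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First point at or after the key's hash, wrapping around the ring.
        i = bisect.bisect_right(self._ring, (self._hash(key),))
        return self._ring[i % len(self._ring)][1]
```

When a node is removed, only the keys that were mapped to it move to other nodes; every other key keeps its owner, so most of the cache stays warm after a failure.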
Cache Avalanche Phenomenon
A cache avalanche occurs when a large number of requests bypass a failing or expired cache and flood the backend database, potentially causing a system‑wide crash. It can be triggered by the previously described concurrency, penetration, or bouncing problems, as well as by malicious attacks or synchronized cache expirations. Preventive measures include staggering TTLs, rate limiting, degradation, circuit breaking, and employing multi‑level caches.
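Of the preventive measures listed, staggering TTLs is the simplest to show in code. The helper below is a sketch; the function name and the default 10% spread are arbitrary choices for illustration.

```python
import random


def jittered_ttl(base_ttl, spread=0.1, rng=random):
    """Return base_ttl shifted by a random fraction in [-spread, +spread].

    Entries written at the same moment then expire at slightly different
    times instead of all at once.
    """
    return base_ttl * (1 + rng.uniform(-spread, spread))
```

For example, `jittered_ttl(600)` yields a value between roughly 540 and 660 seconds, so a batch of keys loaded together will not all expire in the same instant.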
From an engineering‑process perspective, thorough stress testing that mimics real‑world traffic helps expose and prevent such issues early.
Cache Bottomless Pit Phenomenon
This issue was reported by Facebook engineers when scaling memcached to thousands of nodes: adding nodes did not improve performance, because the number of connections and network round trips grew with the size of the cluster. More machines bought no additional throughput, hence the name "bottomless pit".
Modern stacks (databases, caches, NoSQL, search middleware) use sharding to meet high performance, concurrency, availability, and scalability requirements. Sharding can be client‑side (hash or range) or server‑side, but each additional node increases network overhead.
Key mitigation strategies include:
1. Data Distribution Method
Choose hash‑based or range‑based distribution according to workload characteristics to reduce network I/O.
2. I/O Optimization
Leverage connection pools, NIO, and similar techniques to lower connection costs and improve concurrent handling.
3. Data Access Pattern
Fetching larger data sets in a single request can be more efficient than many small requests due to reduced I/O overhead.
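The third strategy can be sketched as grouping keys by the node that owns them and issuing one batched read per node. Everything here is hypothetical: `Node` is an in-memory stand-in that counts round trips, and `node_index` uses a simple CRC32 modulo as the assumed client-side hash distribution.

```python
import zlib
from collections import defaultdict


class Node:
    """Hypothetical cache node that counts network round trips."""

    def __init__(self):
        self.data = {}
        self.round_trips = 0

    def get_many(self, keys):
        # One round trip regardless of how many keys are requested.
        self.round_trips += 1
        return {k: self.data[k] for k in keys if k in self.data}


def node_index(key, node_count):
    # Assumed client-side hash distribution.
    return zlib.crc32(key.encode("utf-8")) % node_count


def batched_get(nodes, keys):
    """Fetch keys using at most one request per node."""
    by_node = defaultdict(list)
    for key in keys:
        by_node[node_index(key, len(nodes))].append(key)
    result = {}
    for index, group in by_node.items():
        result.update(nodes[index].get_many(group))
    return result
```

The sketch also shows where the bottomless-pit effect comes from: the number of round trips for one batch is bounded by the number of nodes, so spreading the same keys across more nodes means more requests per batch.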
In most companies, the bottomless‑pit scenario is rare.
Source: https://blog.csdn.net/dinglang_2009/article/details/53464196