Cache Strategies: Consistency, Penetration, Avalanche and Common Patterns
This article explains why high‑frequency database reads become a performance bottleneck, introduces common caching patterns such as Cache‑Aside, Read‑Through, Write‑Through and Write‑Behind, discusses consistency challenges, and provides practical solutions for cache penetration and avalanche scenarios.
In many applications, frequent disk reads from a database become a performance bottleneck, especially under high request volumes, potentially causing system stalls or crashes.
To alleviate pressure on the database, a caching layer is typically placed between the business system and MySQL, reducing direct disk I/O.
However, real‑world cache usage involves several challenges. The sections below outline the classic caching scenarios and the problems associated with each.
1. Cache‑Database Consistency Issues
Common caching mechanisms include Cache‑Aside, Read‑Through, Write‑Through, and Write‑Behind. The Cache‑Aside pattern works as follows:
Cache Hit: Data is retrieved directly from the cache.
Cache Miss: The system reads from the database, then populates the cache.
Cache Update: After a write to the database, the corresponding cache entry is invalidated.
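The three Cache‑Aside steps above can be sketched as follows. This is a minimal in‑process illustration: the dictionaries stand in for Redis and MySQL, and the key names and TTL are hypothetical values chosen for the example.

```python
import time

# In-memory stand-ins for Redis and MySQL (hypothetical data for illustration).
cache = {}                       # key -> (value, expires_at)
db = {"user:1": "Alice"}         # pretend database table
TTL = 60                         # cache entry lifetime in seconds

def read(key):
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                          # cache hit
    value = db.get(key)                          # cache miss: read the DB
    if value is not None:
        cache[key] = (value, time.time() + TTL)  # populate the cache
    return value

def write(key, value):
    db[key] = value              # update the database first...
    cache.pop(key, None)         # ...then invalidate the cache entry
```

Note that the write path invalidates rather than updates the cache; the next read repopulates it, which avoids writing values that may never be read again.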
Even with this pattern, race conditions can produce stale (dirty) data: for example, a read that misses the cache may fetch an old value from the database and write it into the cache after a concurrent write has already updated the database and invalidated the entry. A USENIX paper linked in the original text describes this in detail.
Read‑Through always reads from the cache; on a miss, the cache itself fetches from the database, stores the result, and returns it. This simplifies application code but requires the cache provider to support a loader plugin that knows how to fetch from the database.
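The distinguishing trait of Read‑Through is that the loader lives inside the cache, so callers never touch the database directly. A minimal sketch, assuming a hypothetical `loader` callable and example key:

```python
import time

class ReadThroughCache:
    """Callers only talk to the cache; misses invoke a loader internally."""
    def __init__(self, loader, ttl=60):
        self.loader = loader     # e.g. a database query function
        self.ttl = ttl
        self.store = {}          # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                      # hit: return cached value
        value = self.loader(key)                 # miss: the cache itself loads
        self.store[key] = (value, time.time() + self.ttl)
        return value

# Hypothetical database and key, for illustration only.
db = {"product:7": 19.99}
products = ReadThroughCache(loader=db.get)
```

Compare with Cache‑Aside, where the application code performs the database read on a miss.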
Write‑Through updates the cache first (if hit) and then the database, ensuring both stay in sync.
Write‑Behind writes data to the cache and asynchronously propagates changes to the database, reducing write pressure but risking data loss if the cache node fails.
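The Write‑Behind trade‑off described above can be sketched with a queue and a background flusher. This is a single‑process toy, not a production design: the queue stands in for the asynchronous propagation channel, and the data‑loss risk is visible in that anything still queued when the process dies never reaches the database.

```python
import queue
import threading

cache = {}                        # writes land here first
db = {}                           # stand-in for MySQL
write_queue = queue.Queue()       # pending database updates

def write_behind(key, value):
    cache[key] = value            # the write completes against the cache
    write_queue.put((key, value)) # the DB update is queued, not performed

def flusher():
    # Background worker that drains the queue into the database. If the
    # cache node dies before the queue drains, those writes are lost.
    while True:
        key, value = write_queue.get()
        db[key] = value
        write_queue.task_done()

threading.Thread(target=flusher, daemon=True).start()
```

Real systems also batch and coalesce queued writes to the same key, which is where most of the write‑pressure reduction comes from.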
2. Cache Penetration
Cache penetration occurs when requests target keys that exist in neither the cache nor the database: every such request misses the cache and falls through to the database, and under high concurrency this flood can overload or crash it. Common mitigations include:
Null‑Value Caching: Store keys with empty results in the cache to avoid repeated database hits.
Bloom Filter: Pre‑filter non‑existent keys using a Bloom filter before querying Redis or the database.
References to a Bloom filter tutorial are provided.
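The Bloom‑filter mitigation can be sketched as below. This is a minimal from‑scratch filter (hash count and bit‑array size are arbitrary example parameters); in practice one would use a library or a Redis module rather than hand‑rolling this.

```python
import hashlib

class BloomFilter:
    """Probabilistic set: 'no' answers are definite, 'yes' answers may be
    false positives, so absent keys never reach the database."""
    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes = size, hashes
        self.bits = 0                      # bit array packed into one int

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all(self.bits >> pos & 1 for pos in self._positions(key))

# Hypothetical data: pre-load the filter with every key known to the DB.
db = {"user:1": "Alice"}
known_keys = BloomFilter()
for k in db:
    known_keys.add(k)

def guarded_lookup(key):
    if not known_keys.might_contain(key):
        return None              # definitely absent: skip cache and DB entirely
    return db.get(key)           # otherwise fall through to the normal path
```

Null‑value caching is complementary: it handles the false positives that slip past the filter by caching the empty result (with a short TTL) so repeated lookups of the same missing key stop hitting the database.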
3. Cache Avalanche
When many cache entries expire simultaneously or a cache server restarts, a sudden surge of database queries can occur. Prevention strategies include:
Using distributed locks to ensure only one request repopulates the cache after expiration.
Pre‑warming the cache before peak traffic.
Staggering TTLs to avoid synchronized expirations.
Combining master‑slave + Sentinel or Redis Cluster for high availability.
Employing local Ehcache as a fallback and Hystrix for rate‑limiting and degradation to protect MySQL.
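Two of the strategies above, staggered TTLs and lock‑guarded repopulation, can be sketched together. This is a single‑process illustration: `threading.Lock` stands in for a distributed lock (in production this would be something like Redis `SET key value NX EX ttl`), and the key, base TTL, and jitter spread are hypothetical example values.

```python
import random
import threading
import time

cache = {}                       # key -> (value, expires_at)
db = {"config": "v1"}            # stand-in for MySQL
rebuild_lock = threading.Lock()  # stand-in for a distributed lock

def ttl_with_jitter(base=300, spread=60):
    # Randomize each entry's lifetime so entries written together
    # do not all expire at the same moment.
    return base + random.uniform(0, spread)

def get_with_rebuild(key):
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]
    # Only one caller rebuilds the entry; the rest wait on the lock
    # instead of stampeding the database.
    with rebuild_lock:
        entry = cache.get(key)            # re-check after acquiring the lock
        if entry and entry[1] > time.time():
            return entry[0]
        value = db.get(key)
        cache[key] = (value, time.time() + ttl_with_jitter())
        return value
```

The re‑check after acquiring the lock matters: a waiting caller usually finds the entry already rebuilt and returns without touching the database at all.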
The article concludes with a call to share the post and contact information for further technical discussions.