Cache Design and Optimization Strategies for Distributed Systems
This article surveys caching in high-concurrency distributed systems: the benefits and costs of introducing a cache, followed by optimization techniques for cache updates, penetration, the batch-operation "bottomless pit" (cache hole), avalanches, and hot-key rebuilds, with practical guidance on strategies such as LRU/LFU/FIFO eviction, timeout eviction, proactive updates, Bloom filters, and concurrency-aware batch operations.
Cache Benefits and Costs
Using a cache brings two main benefits: accelerated read/write performance because caches like Redis or Memcached operate entirely in memory, and reduced backend load as cached results lower CPU, I/O, and thread usage. However, caches also introduce costs such as data inconsistency due to expiration windows, increased code maintenance, and higher operational overhead for high‑availability setups.
Cache Update Strategies
Cache entries usually have a TTL and expire to keep data consistent and free space. Three common update policies are discussed:
1. LRU / LFU / FIFO
These eviction algorithms are used when the cache is full: LRU removes the least recently used item, LFU removes the least frequently accessed, and FIFO removes the oldest. They have low development cost but provide weak consistency because the algorithm decides which data to evict.
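LRU is simple to sketch with Java's LinkedHashMap in access order. The class name MiniLruCache and the capacity check are mine, not from the article; this is a toy illustration, not a production cache:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU cache: a LinkedHashMap in access order evicts the
// least-recently-used entry once size exceeds capacity.
public class MiniLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public MiniLruCache(int capacity) {
        super(16, 0.75f, true); // true = order entries by access, not insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the eldest (least recently used) entry
    }
}
```

Touching a key with get() moves it to the tail of the order, so it survives the next eviction. Note that Redis's allkeys-lru policy behaves analogously but only approximates LRU by sampling a few keys per eviction.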
2. Timeout Eviction
Setting an explicit expiration time (e.g., Redis EXPIRE) lets the cache automatically discard stale data. Consistency depends on the TTL; it is generally acceptable for scenarios that can tolerate brief staleness, such as promotional content.
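Timeout eviction can be sketched as a map that stores an absolute expiry time per entry and drops expired entries lazily on read. The class name TtlCache and the lazy-only expiry are my simplifications; Redis also expires keys lazily on access, supplemented by a periodic active sweep:

```java
import java.util.HashMap;
import java.util.Map;

// Toy timeout eviction: each entry carries an absolute expiry time and
// is discarded lazily on read, loosely mirroring Redis EXPIRE semantics.
public class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;
        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> store = new HashMap<>();

    public void put(K key, V value, long ttlMillis) {
        store.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
    }

    public V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null) return null;
        if (System.currentTimeMillis() >= e.expiresAtMillis) {
            store.remove(key); // expired: drop it; caller falls back to storage
            return null;
        }
        return e.value;
    }
}
```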
3. Proactive Update
When the underlying data changes, the cache is updated immediately, offering the highest consistency. This approach couples business updates with cache updates and often uses message queues to decouple them; it may be combined with timeout eviction for fault tolerance.
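A minimal cache-aside sketch of proactive updating, with my own names (ProactiveUpdate, write, read). The article recommends decoupling via a message queue; here the cache refresh is simulated as a direct call on the write path:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy proactive update: the write path persists to backing storage first,
// then refreshes the cache, so readers see fresh data immediately.
// Real systems often push the refresh through a message queue instead of
// calling the cache inline, to decouple business code from cache code.
public class ProactiveUpdate {
    static final Map<String, String> storage = new ConcurrentHashMap<>();
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    static void write(String key, String value) {
        storage.put(key, value); // 1. persist the change
        cache.put(key, value);   // 2. proactively refresh the cache
    }

    static String read(String key) {
        String v = cache.get(key);
        if (v != null) return v;          // cache hit
        v = storage.get(key);             // miss: fall back to storage
        if (v != null) cache.put(key, v); // repopulate (timeout-eviction fallback path)
        return v;
    }
}
```

The read-side fallback is what makes combining this with timeout eviction safe: if a proactive update is lost, the entry simply expires and is repopulated from storage.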
Best practice: low‑consistency workloads can use LRU/FIFO (option 1) possibly combined with timeout eviction; high‑consistency workloads should combine proactive updates with timeout eviction.
Cache Penetration Optimization
Cache penetration occurs when requests query non‑existent data, causing both cache and storage misses and potentially overwhelming the backend. Two mitigation techniques are presented:
1. Cache Empty Objects
Cache a placeholder for missing keys to block repeated useless requests. To avoid memory waste, limit the scope of valid keys and set a short TTL for empty objects.
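The empty-object technique can be sketched with a sentinel value cached on a storage miss. The class name EmptyObjectCache, the sentinel string, and the storageHits counter are mine; a real implementation would also give the sentinel a short TTL, omitted here for brevity:

```java
import java.util.HashMap;
import java.util.Map;

// Toy empty-object caching: a storage miss is cached as a NULL sentinel,
// so repeated lookups for the same missing key stop reaching the backend.
public class EmptyObjectCache {
    private static final String NULL_SENTINEL = "__NULL__";

    final Map<String, String> cache = new HashMap<>();
    final Map<String, String> storage = new HashMap<>();
    int storageHits = 0; // counts how often a lookup reaches the backend

    String get(String key) {
        String v = cache.get(key);
        if (v != null) return NULL_SENTINEL.equals(v) ? null : v;
        storageHits++;                                  // backend lookup
        v = storage.get(key);
        cache.put(key, v == null ? NULL_SENTINEL : v);  // cache even the miss
        return v;
    }
}
```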
2. Bloom Filter
A Bloom filter is a space‑efficient probabilistic data structure that quickly determines whether a key is likely absent, preventing unnecessary backend lookups. It works well for large static key sets; for real‑time data, the filter must be actively refreshed, incurring additional maintenance cost.
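A minimal Bloom filter sketch, with my own sizing and double-hashing scheme (the class ToyBloomFilter and its parameters are illustrative, not the article's): a "false" answer is definitive and lets the request skip the backend, while a "true" answer may be a false positive, so storage is still consulted.

```java
import java.util.BitSet;

// Toy Bloom filter: k bit positions per key over an m-bit array.
// mightContain == false  -> key is definitely absent (skip the backend).
// mightContain == true   -> key is probably present (false positives possible).
public class ToyBloomFilter {
    private final BitSet bits;
    private final int m; // number of bits
    private final int k; // number of hash functions

    public ToyBloomFilter(int m, int k) {
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.reverse(h1) ^ 0x5bd1e995; // derived second hash
        return Math.floorMod(h1 + i * h2, m);      // double hashing: h1 + i*h2 mod m
    }

    public void add(String key) {
        for (int i = 0; i < k; i++) bits.set(position(key, i));
    }

    public boolean mightContain(String key) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(position(key, i))) return false; // any unset bit => absent
        }
        return true;
    }
}
```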
Cache "Bottomless Pit" (Hole) Optimization
The "bottomless pit" problem arises when batch operations (e.g., MGET) against a clustered cache must contact many nodes: one logical batch fans out into one network trip per node, so adding nodes increases network overhead and latency instead of reducing them. Four approaches are compared:
(1) Serial MGET
Execute N individual GET commands, resulting in N network round‑trips.
(2) Serial I/O
Group keys by node using the hash function, then issue one GET per node, reducing network trips to the number of nodes.
(3) Parallel I/O
Perform the per-node gets concurrently (e.g., with Java parallel streams or a thread pool), keeping the number of network trips at the node count while bounding wall-clock latency by the slowest node rather than the sum of all trips.
(4) Hash‑Tag
Use Redis hash-tags (e.g., {group}key ) to force related keys into the same slot, and therefore onto the same node, allowing a single MGET to retrieve all of them.
A comparison chart (original image) summarizes the trade‑offs.
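The key step shared by serial I/O and parallel I/O is grouping keys by their target node. A sketch with my own class name NodeBatcher, using plain hash-modulo node selection for illustration (real Redis Cluster routes keys by CRC16 over 16384 slots):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy node-grouping for batched gets: keys are bucketed by the node their
// hash maps to, so one batched request per node replaces one round-trip
// per key. Node selection here is plain hash % nodeCount for illustration.
public class NodeBatcher {
    static Map<Integer, List<String>> groupByNode(List<String> keys, int nodeCount) {
        Map<Integer, List<String>> byNode = new HashMap<>();
        for (String key : keys) {
            int node = Math.floorMod(key.hashCode(), nodeCount);
            byNode.computeIfAbsent(node, n -> new ArrayList<>()).add(key);
        }
        return byNode;
    }
}
```

Issuing one pipelined request per group bounds round-trips by the node count (serial I/O); fetching the groups concurrently additionally bounds latency by the slowest node (parallel I/O).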
Cache Avalanche Optimization
An avalanche happens when a cache outage forces a massive surge of requests to the storage layer. Mitigation includes ensuring high cache availability (master‑slave, Sentinel), using circuit‑breaker/limiters (e.g., Netflix Hystrix), and isolating projects to contain failures.
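The circuit-breaker idea can be sketched in a few lines. The class ToyCircuitBreaker and its consecutive-failure threshold are my simplifications; production libraries such as Hystrix add a half-open probe state after a cooldown, omitted here:

```java
// Toy circuit breaker for avalanche protection: after too many consecutive
// backend failures the breaker "opens" and further requests are rejected
// immediately (callers serve a fallback) instead of piling onto the
// struggling storage layer. Thresholds are illustrative.
public class ToyCircuitBreaker {
    private final int failureThreshold;
    private int consecutiveFailures = 0;
    private boolean open = false;

    public ToyCircuitBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    public boolean allowRequest() {
        return !open; // when open, fail fast or serve a degraded response
    }

    public void recordSuccess() {
        consecutiveFailures = 0;
        open = false;
    }

    public void recordFailure() {
        if (++consecutiveFailures >= failureThreshold) open = true;
    }
}
```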
Hot‑Key Rebuild Optimization
When a hot key expires, many threads may simultaneously rebuild it, overloading the backend. Solutions include:
Mutex lock – only one thread rebuilds the value while others wait or serve stale data (e.g., Guava's refreshAfterWrite).
Never‑expire – set no cache-level TTL on the hot key; instead refresh it with scheduled background jobs or write-through updates when the underlying data changes.
Backend rate limiting – limit the number of rebuild requests that reach the storage.
These techniques assume the hot key can be identified in advance.
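The mutex approach can be sketched with per-key locks and double-checked lookup (the class HotKeyRebuilder and its in-process locks are mine; across processes the lock is typically a Redis SET key value NX EX instead):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Toy mutex rebuild: on a hot-key miss, threads re-check the cache inside
// a per-key lock, so only one of them runs the expensive rebuild; the
// rest find the value already repopulated.
public class HotKeyRebuilder {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Map<String, Object> locks = new ConcurrentHashMap<>();
    static final AtomicInteger rebuilds = new AtomicInteger();

    static String get(String key, Supplier<String> rebuild) {
        String v = cache.get(key);
        if (v != null) return v;
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            v = cache.get(key); // double-check after acquiring the lock
            if (v == null) {
                rebuilds.incrementAndGet();
                v = rebuild.get(); // expensive backend query
                cache.put(key, v);
            }
        }
        return v;
    }
}
```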
Conclusion
Caching trades some consistency and operational complexity for faster reads and a lighter backend. Match the strategy to the workload: eviction and timeout policies where brief staleness is acceptable, proactive updates where consistency matters, empty objects or Bloom filters against penetration, node-aware batching against the bottomless pit, high availability and circuit breakers against avalanches, and mutexes or never-expire keys for hot-key rebuilds.
Code Ape Tech Column
Former Ant Group P8 engineer and pure technologist, sharing full-stack Java, interview, and career advice. Site: java-family.cn