Cache Design and Optimization Strategies in High‑Concurrency Distributed Systems
This article examines the role of caching in high‑concurrency distributed systems, outlining its performance benefits and associated costs, and detailing cache‑update strategies (LRU/LFU/FIFO eviction, timeout expiration, active refresh) together with protections against penetration, avalanche, hot‑key rebuild storms, and the “no‑bottom‑hole” problem of batch operations.
In high‑concurrency distributed systems, caching is indispensable for accelerating read/write operations and preventing a flood of requests from overwhelming the underlying storage layer.
Cache Benefits and Costs
Performance boost: In‑memory caches such as Redis or Memcached can handle tens of thousands of QPS, far exceeding traditional databases like MySQL.
Reduced backend load: Caching expensive computations or query results lowers CPU, I/O, and thread consumption.
Data inconsistency: Cached data may become stale within the expiration window.
Maintenance overhead: Developers must manage both cache and storage logic, increasing code complexity.
Operational cost: High‑availability setups (master‑slave, clustering) add operational complexity.
When the benefits outweigh the costs, adopting a cache is justified.
Cache Update Strategies
Cache entries usually have a TTL; once expired they are reloaded from the data source. Common update policies include:
LRU / LFU / FIFO: Eviction algorithms applied when the cache is full. LRU removes the least recently accessed item, LFU the least frequently accessed, and FIFO evicts in insertion order. These methods are low‑cost to implement but provide limited consistency guarantees.
Timeout eviction: Manually set an expiration time (e.g., the Redis EXPIRE command). Consistency depends on the TTL; real‑time consistency is not guaranteed.
Active refresh: When the data source changes, the cache is proactively updated. This offers the highest consistency but couples business updates with cache updates, often requiring message‑queue‑based decoupling.
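As an illustration of the eviction side, LRU can be sketched in a few lines of Java using LinkedHashMap's access‑order mode (a minimal in‑process example, not a distributed cache; the class name is illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache: LinkedHashMap in access-order mode keeps the least
// recently accessed entry at the head, and removeEldestEntry evicts it
// once the configured capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true -> LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the LRU entry when over capacity
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");         // "a" becomes most recently used
        cache.put("c", "3");    // evicts "b", the least recently used
        System.out.println(cache.keySet()); // [a, c]
    }
}
```

LFU and FIFO differ only in the bookkeeping (access counts vs. insertion order); production caches such as Redis use approximated variants of these policies.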
Best practices:
Low‑consistency workloads: combine LRU/LFU/FIFO with timeout eviction.
High‑consistency workloads: combine timeout eviction with active refresh.
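The active‑refresh path is often decoupled through a queue. Below is a minimal in‑process sketch under that assumption: a BlockingQueue stands in for a real message queue (Kafka, RocketMQ, etc.), and all names (db, cache, changedKeys) are illustrative:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of active refresh decoupled through a queue: the write path
// updates the source of truth and publishes the changed key; a background
// consumer reloads the cache entry, so business code never touches the cache.
public class ActiveRefresh {
    static final ConcurrentHashMap<String, String> db = new ConcurrentHashMap<>();
    static final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    static final BlockingQueue<String> changedKeys = new LinkedBlockingQueue<>();

    static void update(String key, String value) {
        db.put(key, value);       // 1. write the data source
        changedKeys.offer(key);   // 2. publish a refresh event
    }

    static void refreshLoop() {
        try {
            while (true) {
                String key = changedKeys.take();  // 3. consume the event
                cache.put(key, db.get(key));      // 4. reload from the source
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        Thread consumer = new Thread(ActiveRefresh::refreshLoop);
        consumer.setDaemon(true);
        consumer.start();

        update("user:1", "Alice");
        Thread.sleep(100); // give the consumer time to refresh
        System.out.println(cache.get("user:1"));
    }
}
```

The queue is what removes the coupling: the business transaction completes once the event is published, and cache refresh happens asynchronously with its own retry and failure handling.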
Penetration Optimization
Cache penetration occurs when queries request non‑existent data, causing both cache and storage misses. Mitigation techniques:
Cache empty objects: Store a placeholder for missing keys and give it a short TTL. This reduces repeated miss traffic but can waste memory if many distinct missing keys are cached.
Bloom filter: A space‑efficient probabilistic data structure that quickly determines whether a key is definitely absent. It can be combined with local memory caches to filter out invalid requests before they reach Redis or Memcached.
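A minimal Bloom filter sketch (illustrative class, k positions derived from String.hashCode via double hashing): the key property is that "might contain" can yield false positives, but "definitely absent" is always reliable, which is what lets it short‑circuit lookups for non‑existent keys.

```java
import java.util.BitSet;

// Minimal Bloom filter: k bit positions per key. add() sets them;
// mightContain() returns false (definitely absent) if any is unset,
// true (possibly present, false positives possible) if all are set.
public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    public SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // i-th position via double hashing: h1 + i * h2 (h2 forced odd).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1;
        return Math.abs((h1 + i * h2) % size);
    }

    public void add(String key) {
        for (int i = 0; i < hashCount; i++) bits.set(position(key, i));
    }

    public boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++)
            if (!bits.get(position(key, i))) return false; // definitely absent
        return true; // possibly present
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1 << 16, 3);
        filter.add("user:1");
        System.out.println(filter.mightContain("user:1")); // true
        System.out.println(filter.mightContain("ghost"));  // false unless a rare false positive
    }
}
```

On a cache miss, a request whose key fails mightContain can be rejected immediately without touching the cache or the database; only keys that pass proceed to the normal lookup path.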
No‑Bottom‑Hole Optimization
The “no‑bottom‑hole” problem arises when distributed hash‑based key placement causes batch operations (e.g., MGET) to contact many nodes, increasing latency and connection overhead. Solutions include:
Avoid batch operations when possible.
Isolate clusters per project/team to limit node count.
Optimize I/O: command efficiency, connection pooling, NIO, and merging requests.
Use hash‑tags to force related keys onto the same node, reducing network hops.
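The hash‑tag idea can be sketched as follows. Redis Cluster hashes only the substring inside the first {...} of a key, so keys sharing a tag land on the same slot (and node), and a batch MGET for them needs only one hop. The slot function below is a stand‑in for illustration (Redis actually uses CRC16 mod 16384):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of hash-tag routing: extract the {...} tag as the effective hash
// key, then group a batch of keys by slot so each node receives a single
// pipelined request instead of one round trip per key.
public class HashTagGrouping {
    // Effective hash key: the content of the first non-empty {...}, if any.
    static String hashTag(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) return key.substring(open + 1, close);
        }
        return key;
    }

    // Placeholder slot function standing in for CRC16(key) % 16384.
    static int slot(String key) {
        return (hashTag(key).hashCode() & 0x7fffffff) % 16384;
    }

    // One group per slot -> one request per node for the whole batch.
    static Map<Integer, List<String>> groupBySlot(List<String> keys) {
        Map<Integer, List<String>> groups = new HashMap<>();
        for (String k : keys)
            groups.computeIfAbsent(slot(k), s -> new ArrayList<>()).add(k);
        return groups;
    }

    public static void main(String[] args) {
        // All three keys share the tag {user:1000} -> same slot, one hop.
        List<String> keys = Arrays.asList(
            "{user:1000}.profile", "{user:1000}.orders", "{user:1000}.cart");
        System.out.println(groupBySlot(keys).size()); // 1
    }
}
```

The trade‑off: hash tags concentrate related keys on one node, which cuts network hops for batch reads but can itself create a hot node if a tag's key set grows large.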
Avalanche Optimization
A cache avalanche happens when a large portion of cached data expires simultaneously, flooding the storage layer. Prevention measures:
Ensure high cache availability (e.g., master‑slave, Redis Sentinel).
Apply rate limiting and circuit‑breaker patterns (e.g., Netflix Hystrix) to protect downstream services.
Isolate resources per project to contain failures.
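The rate‑limiting idea can be sketched with a semaphore in front of the storage layer (a minimal JVM‑local example; names are illustrative, and production systems would use a library such as Hystrix, Sentinel, or Resilience4j):

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Sketch of request limiting in front of storage: a Semaphore caps the
// number of concurrent backend queries; callers that cannot acquire a
// permit fail fast with a fallback value instead of piling onto the
// database when many cache entries expire at once.
public class StorageGate {
    private final Semaphore permits;

    public StorageGate(int maxConcurrentLoads) {
        this.permits = new Semaphore(maxConcurrentLoads);
    }

    public <T> T load(Supplier<T> dbQuery, T fallback) {
        if (!permits.tryAcquire()) {
            return fallback;      // fail fast: protect the storage layer
        }
        try {
            return dbQuery.get(); // a bounded number of threads reach the DB
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        StorageGate gate = new StorageGate(2);
        String v = gate.load(() -> "fresh-from-db", "stale-default");
        System.out.println(v); // fresh-from-db (a permit was available)
    }
}
```

The fallback branch is where degradation policy lives: serve stale data, a default value, or an error page, depending on what the business can tolerate.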
Hot‑Key Rebuild Optimization
When a hot key expires, many threads may attempt to rebuild the cache concurrently, overwhelming the backend. Mitigation strategies:
Mutex lock: Allow only one thread to rebuild the cache; other threads wait or serve stale data.
Never‑expire pattern: Keys carry no physical TTL; the cache is updated asynchronously via scheduled jobs or event‑driven pushes, so a rebuild storm on expiration cannot occur.
Backend rate limiting: Limit the number of rebuild requests that reach the backend; successful rebuilds populate the cache for subsequent requests.
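The mutex‑lock strategy can be sketched as below. In a real cluster the lock would be distributed (e.g., Redis SET key value NX PX); here a JVM‑local tryLock illustrates the idea, and all names are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Sketch of mutex-lock hot-key rebuild: on a cache miss, only the thread
// that wins tryLock queries the backend and repopulates the cache; losing
// threads immediately return a stale/default value instead of piling onto
// the data source.
public class HotKeyCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    public String get(String key, Supplier<String> rebuild, String stale) {
        String v = cache.get(key);
        if (v != null) return v;                  // cache hit

        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        if (lock.tryLock()) {                     // only one thread rebuilds
            try {
                String fresh = rebuild.get();     // hit the backend exactly once
                cache.put(key, fresh);
                return fresh;
            } finally {
                lock.unlock();
            }
        }
        return stale;                             // losers serve stale data
    }

    public static void main(String[] args) {
        HotKeyCache c = new HotKeyCache();
        System.out.println(c.get("hot", () -> "rebuilt", "stale")); // rebuilt
    }
}
```

Whether losers should block, retry, or serve stale data is a business decision; serving stale data keeps latency flat at the cost of briefly returning outdated values.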
In summary, effective cache design requires balancing performance gains against consistency and operational costs, and applying appropriate eviction, update, and protection mechanisms based on workload characteristics.
Conclusion
If you found this article helpful, feel free to like or follow the author.