
How to Keep Cache and Database Consistent? Proven Strategies and Common Pitfalls

This article explains why cache‑DB consistency is a long‑standing challenge, compares naive full‑load and delete‑cache approaches, analyzes concurrency and failure scenarios, and presents reliable solutions such as updating the database first followed by cache deletion using message queues or binlog subscriptions.

Sanyou's Java Diary

Why Cache‑DB Consistency Matters

When a system grows, reading directly from the database becomes a performance bottleneck, so a cache (commonly Redis) is introduced to speed up reads. However, once data is stored in both the database and the cache, keeping them consistent becomes a critical issue.

Naïve Full‑Load Cache Strategy

The simplest method is to load all data into the cache without expiration: write requests update only the database, and a scheduled task periodically refreshes the cache.

Database data is fully flushed to the cache (no TTL).

Write requests update only the database.

A background job copies database data to the cache at intervals.

Advantages: read requests always hit the cache, delivering high performance. Drawbacks: low cache utilization (stale, rarely‑accessed data stays in cache) and data inconsistency because the cache is refreshed only on a schedule.
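The three steps above can be sketched as follows. This is a minimal illustration, not a production design: a ConcurrentHashMap stands in for both Redis and the database, and all names (FullLoadCache, startRefreshJob) are invented for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class FullLoadCache {
    static final Map<String, String> database = new ConcurrentHashMap<>();
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    static void write(String key, String value) {
        database.put(key, value); // write path touches only the database
    }

    static String read(String key) {
        return cache.get(key); // reads are always served from the cache (no TTL)
    }

    // Background job: flush the whole database into the cache at fixed intervals.
    static void startRefreshJob(ScheduledExecutorService scheduler, long periodSeconds) {
        scheduler.scheduleAtFixedRate(() -> cache.putAll(database),
                0, periodSeconds, TimeUnit.SECONDS);
    }
}
```

Between refresh ticks, a write is visible only in the database, which is exactly the inconsistency window described above.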

Improving Cache Utilization and Consistency

To maximize cache utilization, store only hot data: write requests continue to update the database, read requests first check the cache, fall back to the database on a miss, rebuild the cache entry, and set an expiration time for each cached item.

This ensures that rarely accessed data expires automatically, leaving only frequently accessed data in the cache.
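A sketch of this read path (the classic cache-aside pattern) might look like the following. The Entry record and TTL_MILLIS value are illustrative stand-ins for Redis's `SET key value EX ttl`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CacheAside {
    record Entry(String value, long expiresAtMillis) {}

    static final Map<String, String> database = new ConcurrentHashMap<>();
    static final Map<String, Entry> cache = new ConcurrentHashMap<>();
    static final long TTL_MILLIS = 60_000; // illustrative expiration time

    static String read(String key) {
        Entry e = cache.get(key);
        if (e != null && e.expiresAtMillis() > System.currentTimeMillis()) {
            return e.value();                  // cache hit, entry not yet expired
        }
        String fromDb = database.get(key);     // miss or expired: fall back to the DB
        if (fromDb != null) {
            // Rebuild the cache entry with a fresh expiration time.
            cache.put(key, new Entry(fromDb, System.currentTimeMillis() + TTL_MILLIS));
        }
        return fromDb;
    }
}
```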

Ensuring Consistency When Updating Data

When a record is modified, both the database and the cache must be updated. Two ordering options exist:

Update the cache first, then the database.

Update the database first, then the cache.

Either order can cause inconsistency if the second step fails (e.g., cache updated but DB write fails, or DB updated but cache update fails).

Concurrency Problems

Concurrent updates can lead to stale data regardless of the chosen order. For example, two threads updating the same record can leave the cache with an older value while the database holds the newer one.
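One such bad interleaving (for the "update cache first, then database" order) can be replayed deterministically. The sequence below collapses two threads into four ordered steps purely to show the end state; real races are of course nondeterministic.

```java
// Deterministic replay: thread A writes v1, thread B writes v2,
// but A's database write lands last.
class RaceDemo {
    static String cacheValue;
    static String dbValue;

    static void replay() {
        cacheValue = "v1"; // thread A updates the cache
        cacheValue = "v2"; // thread B updates the cache
        dbValue = "v2";    // thread B updates the database
        dbValue = "v1";    // thread A updates the database last
    }
}
```

After the replay the cache holds v2 while the database holds v1: the cache is permanently stale until the entry expires or is deleted.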

Making Both Steps Reliable

Simply retrying the failed step synchronously is inefficient. A better approach is asynchronous retry: place the retry request into a message queue and let a dedicated consumer handle it until it succeeds.

Alternatively, use a message queue to perform the cache update after the database transaction commits, decoupling the two operations.
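The asynchronous-retry idea can be sketched as below. A LinkedBlockingQueue stands in for a real message queue (e.g. RocketMQ or Kafka), and tryDeleteFromCache is a hypothetical call that in reality can fail with a network error or timeout.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncRetry {
    static final Map<String, String> database = new ConcurrentHashMap<>();
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final BlockingQueue<String> retryQueue = new LinkedBlockingQueue<>();

    // Write path: update the database first, then try to delete the cache entry.
    static void write(String key, String value) {
        database.put(key, value);
        if (!tryDeleteFromCache(key)) {
            retryQueue.offer(key); // on failure, enqueue for asynchronous retry
        }
    }

    static boolean tryDeleteFromCache(String key) {
        cache.remove(key);
        return true; // in reality this call can fail and return false
    }

    // Dedicated consumer: keep retrying each key until the delete succeeds.
    static void consumerLoop() throws InterruptedException {
        while (true) {
            String key = retryQueue.take();
            if (!tryDeleteFromCache(key)) {
                retryQueue.offer(key); // re-enqueue and retry later
            }
        }
    }
}
```

The write path never blocks on the retry; the consumer converges the cache toward the database eventually.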

Database Change Log Subscription

Instead of writing to a queue, one can subscribe to the database’s change log (e.g., MySQL binlog) using tools like Canal. When a row changes, the listener deletes or updates the corresponding cache entry, guaranteeing that the cache reflects the authoritative source.
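The listener side of this approach reduces to a small handler. RowChange below is a hypothetical stand-in for the parsed binlog event a tool like Canal delivers; only the cache-invalidation logic is shown, not the Canal connection itself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class BinlogListener {
    // Stand-in for a parsed binlog row event (table name + primary key).
    record RowChange(String table, String primaryKey) {}

    static final Map<String, String> cache = new ConcurrentHashMap<>();

    // Invoked for every committed row change read from the binlog.
    static void onRowChange(RowChange change) {
        // Delete the entry so the next read rebuilds it from the database.
        cache.remove(change.table() + ":" + change.primaryKey());
    }
}
```

Because the binlog records only committed changes, the listener never acts on a write that was rolled back.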

Master‑Slave Replication Delay and Delayed Double Delete

In read‑write separation setups, replication lag can cause stale reads from a replica, leading to cache inconsistency. The “delayed double delete” strategy mitigates this by deleting the cache twice: once immediately after the DB update and again after a short delay (or via a delayed message).

Choosing an appropriate delay is difficult because it must exceed both replication lag and the time a concurrent read‑write thread might take to repopulate the cache.
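A sketch of delayed double delete, assuming an in-process scheduler for the second delete (in production a delayed MQ message is the more robust choice). DELAY_MILLIS is an arbitrary illustrative value, not a recommendation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DelayedDoubleDelete {
    static final Map<String, String> database = new ConcurrentHashMap<>();
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    static final long DELAY_MILLIS = 300; // must exceed replication lag; tuned per system

    static void write(String key, String value) {
        database.put(key, value);
        cache.remove(key);                          // first delete, right after the DB write
        scheduler.schedule(() -> cache.remove(key), // second delete, after the delay
                DELAY_MILLIS, TimeUnit.MILLISECONDS);
    }
}
```

The second delete exists to evict a stale value that a concurrent reader may have repopulated from a lagging replica in the meantime.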

Can Strong Consistency Be Achieved?

Strong consistency across cache and database typically requires heavyweight protocols (2PC, Paxos, Raft) that sacrifice performance. In practice, systems accept eventual consistency, using TTLs and the techniques above to minimize the window of inconsistency.

Key Takeaways

Introduce a cache to improve performance, but plan for consistency challenges.

Prefer the “update database then delete cache” pattern, combined with asynchronous retry via a message queue or binlog subscription.

Avoid the naïve “update both synchronously” approach in high‑concurrency scenarios.

When using delayed double delete, carefully tune the delay to cover replication lag and concurrent access.

Conclusion

Cache‑DB consistency cannot be guaranteed perfectly without sacrificing performance. By adopting asynchronous mechanisms and accepting eventual consistency, developers can achieve high throughput while keeping inconsistency risks low.

Tags: Redis, binlog, Message Queue, Cache Consistency, eventual consistency, database synchronization
Written by

Sanyou's Java Diary

Passionate about technology, though not great at solving problems; eager to share, never tire of learning!
