Mastering Redis RDB Snapshots: SAVE vs BGSAVE Explained
This article explains how Redis RDB snapshots work, compares the blocking SAVE command with the asynchronous BGSAVE command, outlines their impact on performance, and provides best‑practice guidance on snapshot frequency and trade‑offs.
1. Introduction
When a Redis cache crashes, all in-memory data is lost. Requests that the cache used to serve from memory then fall through to the disk-backed database, potentially causing a snowball effect. To avoid data loss and ensure fast recovery, Redis offers two persistence mechanisms: RDB snapshots and AOF logs. This section focuses on RDB snapshots.
2. What Is an RDB Memory Snapshot?
In large‑scale, high‑concurrency distributed systems, a Redis outage can force all traffic to the database, creating severe pressure or a cascade failure. Since Redis stores data in memory, a restart leaves the cache empty, and rebuilding it can overload the database.
Persisting data to disk ensures that a restart does not lose data, but writing to disk on every change would be costly and hurt performance. Therefore, Redis provides a snapshot strategy that periodically writes the in‑memory dataset to disk.
2.1 Using Memory Snapshots
Redis takes snapshots at configured intervals, writing the dataset to a file that can be loaded on restart, similar to saving a game state.
When a failure occurs, Redis can restore data from the most recent snapshot (e.g., using the 21:00 snapshot for a 21:10 failure).
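The recovery rule above can be sketched in a few lines of Python. This is only an illustration: the hourly schedule, the date, and the helper name latest_snapshot_before are assumptions made for the example, not anything Redis exposes.

```python
from datetime import datetime, timedelta
from typing import Optional


def latest_snapshot_before(snapshots: list[datetime], failure: datetime) -> Optional[datetime]:
    """Return the most recent snapshot taken at or before the failure, if any."""
    candidates = [s for s in snapshots if s <= failure]
    return max(candidates) if candidates else None


# Hypothetical schedule: one snapshot per hour on an arbitrary day.
day = datetime(2024, 1, 1)
snapshots = [day.replace(hour=h) for h in (19, 20, 21)]
failure = day.replace(hour=21, minute=10)

restored_from = latest_snapshot_before(snapshots, failure)
lost_window = failure - restored_from  # writes made in this window cannot be recovered
print(restored_from.time(), lost_window)  # prints: 21:00:00 0:10:00
```

The ten-minute gap is the data-loss window: everything written between 21:00 and 21:10 is gone after the restore.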
2.2 Generating RDB Snapshots
Redis provides two commands to create RDB files: SAVE and BGSAVE. Both generate a snapshot, but they differ in execution and impact.
The SAVE command blocks the Redis server process until the RDB file is created. During this time, Redis cannot serve other commands, which can severely affect performance for large datasets.
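The BGSAVE command forks a child process to perform the snapshot asynchronously, allowing the main process to continue handling client requests. However, the fork itself briefly blocks the main process, and any memory pages modified while the snapshot is running are duplicated by copy-on-write, so memory usage can grow while the child is alive.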
2.2.1 SAVE Mode
SAVE runs in the main process, blocking network I/O and key‑value operations, which can degrade client responsiveness. It is generally discouraged for production workloads.
2.2.2 BGSAVE Mode
BGSAVE creates a child process with the fork() system call, so the main process remains responsive. The child writes the RDB file while the parent continues to serve commands such as GET and SET. This is the recommended approach, and snapshots that Redis triggers automatically also run in the background this way.
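The fork-and-write idea can be illustrated with a small Python sketch. It is not how Redis is implemented: it assumes a Unix-like system (os.fork), uses JSON in place of the RDB format, and the function name background_save and the file name dump.json are made up for the example.

```python
import json
import os
import tempfile


def background_save(data: dict, path: str) -> int:
    """Fork a child that writes `data` to `path`; return the child's PID."""
    pid = os.fork()
    if pid == 0:
        # Child: sees the dataset exactly as it was at the moment of the fork.
        status = 1
        try:
            tmp_path = f"{path}.tmp-{os.getpid()}"
            with open(tmp_path, "w") as f:
                json.dump(data, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp_path, path)  # atomic rename over any older snapshot
            status = 0
        finally:
            os._exit(status)  # exit without running the parent's cleanup handlers
    return pid


dataset = {"counter": 1}
snapshot_path = os.path.join(tempfile.mkdtemp(), "dump.json")

child = background_save(dataset, snapshot_path)
dataset["counter"] = 2  # the parent keeps accepting writes after the fork
_, wait_status = os.waitpid(child, 0)

with open(snapshot_path) as f:
    snapshot = json.load(f)
print(snapshot, dataset)
```

The snapshot holds the value from the moment of the fork, while the parent's later write is visible only in its own memory. That is the point-in-time behavior the operating system's copy-on-write pages provide.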
The execution flow is as follows:
Execute BGSAVE; the parent process checks for an existing RDB/AOF child process and returns immediately, without starting a new snapshot, if one is running.
The parent forks a child process; during the fork, the parent briefly blocks. Run INFO stats and check the latest_fork_usec field to see how long the most recent fork took, in microseconds.
After forking, the command returns Background saving started, and the child proceeds asynchronously while clients can still issue GET, SET, etc.
The child writes the snapshot to a temporary file and then atomically renames it over the old RDB file. The LASTSAVE command (or the rdb_last_save_time field of INFO persistence) shows the Unix timestamp of the most recent successful save.
The child signals the parent upon completion; the parent updates its statistics.
This process ensures snapshot integrity without blocking normal operations.
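The fields mentioned in the steps above come back from INFO as plain field:value lines grouped under # Section headers. The following is a minimal sketch of reading them in Python. The sample text is hand-written and its numbers are illustrative, not captured from a real server. The helper parse_info is a hypothetical name.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse `field:value` lines from INFO output, skipping blanks and `# Section` headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


# Made-up sample in the shape of INFO output; the values are illustrative only.
sample = """\
# Persistence
rdb_bgsave_in_progress:0
rdb_last_save_time:1700000000
rdb_last_bgsave_status:ok

# Stats
latest_fork_usec:5120
"""

info = parse_info(sample)
fork_ms = int(info["latest_fork_usec"]) / 1000  # microseconds -> milliseconds
print(fork_ms, info["rdb_last_bgsave_status"])
```

In practice a client library normally does this parsing for you; the sketch only shows where the fork duration and last-save timestamp live.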
2.2.3 Avoiding Excessive Full Snapshots
Even though BGSAVE runs in the background, taking full snapshots too frequently incurs significant overhead:
Frequent disk writes increase I/O pressure and consume storage space.
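Each snapshot requires a fork, which blocks the main process while it runs, and the larger the dataset, the longer that pause. The child also competes with the main process for CPU, memory, and disk bandwidth.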
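Snapshot frequency is usually controlled with save <seconds> <changes> rules in redis.conf: a background save is triggered once at least that many changes have accumulated and that many seconds have passed since the last save. The sketch below models that rule in Python as an approximation of the semantics, not Redis's actual code. The names SaveRule and should_snapshot are hypothetical, and the three rules are illustrative values.

```python
from typing import NamedTuple


class SaveRule(NamedTuple):
    seconds: int  # minimum time since the last successful snapshot
    changes: int  # minimum number of writes since that snapshot


def should_snapshot(rules: list[SaveRule], elapsed_seconds: float, changes_since_save: int) -> bool:
    """True if any rule has both its time and change thresholds met."""
    return any(
        elapsed_seconds >= rule.seconds and changes_since_save >= rule.changes
        for rule in rules
    )


# Illustrative rules, written in redis.conf as: save 900 1 / save 300 10 / save 60 10000
rules = [SaveRule(900, 1), SaveRule(300, 10), SaveRule(60, 10000)]

print(should_snapshot(rules, elapsed_seconds=120, changes_since_save=50))     # False
print(should_snapshot(rules, elapsed_seconds=120, changes_since_save=20000))  # True
print(should_snapshot(rules, elapsed_seconds=1000, changes_since_save=1))     # True
```

Pairing a time threshold with a change threshold means a busy instance snapshots often while a nearly idle one rarely pays the fork cost.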
2.3 Summary
RDB snapshots provide fast recovery, but the snapshot interval must be balanced: too infrequent snapshots risk larger data loss, while too frequent snapshots add overhead and reduce Redis performance.
Advantages:
RDB files use a compact, compressed binary format, so they stay small.
Data restoration is fast: loading an RDB file is much quicker than replaying an AOF, which makes RDB well suited to disaster recovery.
Disadvantages:
Not real‑time persistence; each snapshot requires a forked process, which can be costly if done often.
The binary format changes between Redis versions, so an older Redis release may be unable to load an RDB file written by a newer one.
Recovery is incomplete: any writes made between the last snapshot and the failure are lost.
Architecture & Thinking
🍭 Frontline tech director and chief architect at top-tier companies 🥝 Years of deep experience in internet, e‑commerce, social, and finance sectors 🌾 Committed to publishing high‑quality articles covering core technologies of leading internet firms, application architecture, and AI breakthroughs.