Master‑Slave, Sentinel, and Cluster: Unlocking Redis High Availability
This guide explains Redis high‑availability mechanisms, covering master‑slave replication, the Sentinel monitoring and automatic failover process, and the Redis Cluster sharding architecture, including hash slots, MOVED/ASK redirection, gossip communication, and practical considerations such as data consistency, network partitions, and slot allocation.
Redis High‑Availability Overview
1. Master‑Slave Replication
Redis achieves high availability by deploying one master and one or more slaves. The master handles all writes and can also serve reads; slaves are read‑only replicas. If the master fails, a slave can be promoted.
1.1 Replication Stages
Stage 1 – Connection & negotiation
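Slave sends PSYNC to the master. On a first synchronization it has no master runID or offset to offer, so it sends PSYNC ? -1. Master replies with FULLRESYNC, its runID, and its current replication offset.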
Stage 2 – Data transfer
Master runs BGSAVE to create an RDB file and streams it to the slave. The slave clears its database and loads the RDB. Writes that reach the master during the transfer are held in a replication buffer on the master.
Stage 3 – Incremental updates
After the RDB is loaded, the master forwards the buffered write commands; the slave re‑executes them to reach full synchronization.
1.2 Practical Considerations
Data Inconsistency
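Replication is asynchronous, so a slave may lag behind the master because of network latency or because it is busy executing a slow command (e.g., HGETALL on a large hash). Mitigate by keeping master and slaves on a reliable, low‑latency network and by monitoring the gap between the master's master_repl_offset and each slave's slave_repl_offset, both reported by INFO replication.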
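The offset comparison can be sketched as follows. This is a minimal illustration, not part of Redis or any client library: the helper names info_field and replication_lag are hypothetical, and the input is assumed to be the raw text returned by INFO replication on the master and on the slave.

```c
#include <stdlib.h>
#include <string.h>

/* Return the numeric value of a "<field>:<number>" line inside INFO-style
 * text, or -1 if no line starts with that field name. */
long long info_field(const char *info, const char *field) {
    size_t flen = strlen(field);
    const char *p = info;
    while (p && *p) {
        if (strncmp(p, field, flen) == 0 && p[flen] == ':')
            return strtoll(p + flen + 1, NULL, 10);
        p = strchr(p, '\n');   /* jump to the end of the current line */
        if (p) p++;            /* step onto the start of the next line */
    }
    return -1;
}

/* Bytes of the replication stream the slave has not processed yet,
 * or -1 if either offset is missing from the given text. */
long long replication_lag(const char *master_info, const char *slave_info) {
    long long m = info_field(master_info, "master_repl_offset");
    long long s = info_field(slave_info, "slave_repl_offset");
    if (m < 0 || s < 0) return -1;
    return m - s;
}
```

A monitoring job could call replication_lag periodically and alert, or stop routing reads to a slave, once the value stays above a threshold chosen for the workload.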
Expired‑Key Reads
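Redis removes expired keys lazily (when they are accessed) and periodically (by sampling); when memory is full, eviction can remove keys as well. A slave does not expire keys on its own but waits for the master to propagate the deletion. Before 3.2 a slave could therefore return an already‑expired value; from 3.2 onward a slave returns nil for an expired key. Use Redis ≥ 3.2 for correct behavior.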
Master‑Slave Pressure in One‑Master‑Many‑Slaves
When many slaves request a full resynchronization from the same master, the forks for RDB generation and the bandwidth for RDB transfer can overload it. A “master‑slave‑slave” topology, in which a well‑connected slave acts as the replication source for other slaves, reduces the load on the master.
Network Partition Handling
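Since Redis 2.8, a slave that reconnects after a network interruption can resume with incremental replication: the master keeps recent writes in the circular repl_backlog_buffer and, if the slave's offset is still inside it, sends only the missing commands. A costly full resync is needed only when that part of the backlog has already been overwritten.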
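The decision between incremental and full resynchronization can be sketched as below. This is a simplified model under stated assumptions, not the Redis implementation: the master_state struct, its field names, and can_partial_resync are hypothetical, and the runID comparison is assumed from the runID the master hands out in Stage 1.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified view of what the master tracks for its backlog. */
typedef struct {
    char run_id[41];          /* master runID: 40 characters plus NUL */
    long long backlog_start;  /* replication offset of the oldest byte still buffered */
    long long backlog_len;    /* number of valid bytes currently in the backlog */
} master_state;

/* True if the reconnecting slave can be served from the backlog alone. */
bool can_partial_resync(const master_state *m,
                        const char *slave_known_run_id,
                        long long slave_offset) {
    /* The slave was replicating from a different master instance. */
    if (strcmp(m->run_id, slave_known_run_id) != 0) return false;
    /* The bytes the slave needs were already overwritten in the ring. */
    if (slave_offset < m->backlog_start) return false;
    /* The slave claims an offset the master has not produced yet. */
    if (slave_offset > m->backlog_start + m->backlog_len) return false;
    return true;
}
```

A larger backlog widens the window in which a disconnected slave can still catch up incrementally, at the cost of more memory on the master.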
2. Redis Sentinel
Sentinel automates monitoring, failover, and client notification for master‑slave setups.
2.1 Roles
Monitoring: Periodic PING to masters and slaves.
Automatic failover: Detects master down and coordinates a switch.
Notification: Publishes the new master address to slaves and client applications.
2.2 Architecture
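Multiple Sentinel instances form a cluster. They discover one another through publish/subscribe (the __sentinel__:hello channel on the monitored master) and learn about the slaves from the master's INFO output.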
2.3 Down‑State Detection
Subjective down: A Sentinel marks a node as down if it receives no valid reply to PING within down-after-milliseconds.
Objective down: When at least quorum Sentinels (a configured count, not necessarily a majority) report the master as subjectively down, it is marked objectively down. This guards against false positives caused by a single Sentinel's network problems.
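The two checks can be sketched as follows. The function names and parameters are hypothetical; only down-after-milliseconds and quorum correspond to the configuration values named above.

```c
#include <stdbool.h>

/* Subjective down: this Sentinel alone has seen no valid reply for longer
 * than down-after-milliseconds. All times are in milliseconds. */
bool is_subjectively_down(long long now_ms,
                          long long last_valid_reply_ms,
                          long long down_after_ms) {
    return now_ms - last_valid_reply_ms > down_after_ms;
}

/* Objective down: at least `quorum` Sentinels report the master as down.
 * sdown_reports[i] is Sentinel i's subjective verdict, including our own. */
bool is_objectively_down(const bool *sdown_reports, int n_sentinels, int quorum) {
    int agree = 0;
    for (int i = 0; i < n_sentinels; i++)
        if (sdown_reports[i]) agree++;
    return agree >= quorum;
}
```

With three Sentinels and quorum set to 2, a single Sentinel losing its own connection to the master is not enough to start a failover.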
2.4 Failover Workflow
All Sentinels ping known masters, slaves, and other Sentinels every second.
If a master is subjectively down, the Sentinel asks the other Sentinels for their view using SENTINEL is-master-down-by-addr.
When at least quorum Sentinels agree, the master is marked objectively down.
A Sentinel that sees the master as objectively down asks the others to vote for it as leader, again via SENTINEL is-master-down-by-addr.
To become leader, a Sentinel must collect at least num(sentinels)/2+1 votes and at least quorum votes, i.e., max(quorum, num(sentinels)/2+1).
The leader performs the master‑slave switch and notifies clients.
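The vote threshold from the workflow above can be written out as a short sketch. The function names are hypothetical; the rule itself is the one stated in the text, a majority of all Sentinels and at least quorum votes.

```c
#include <stdbool.h>

/* Votes a Sentinel needs to become leader: the larger of the configured
 * quorum and a strict majority of all known Sentinels. */
int votes_needed(int n_sentinels, int quorum) {
    int majority = n_sentinels / 2 + 1;   /* integer division */
    return quorum > majority ? quorum : majority;
}

/* True if a candidate with `votes` votes (its own included) wins. */
bool wins_election(int votes, int n_sentinels, int quorum) {
    return votes >= votes_needed(n_sentinels, quorum);
}
```

For five Sentinels with quorum 2, two Sentinels are enough to declare the master objectively down, but a leader still needs three votes before it may perform the failover.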
2.5 Leader Election Example
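With three Sentinels (A1, A2, A3) and quorum=2, suppose A1 and A3 both see the master as objectively down and each votes for itself. A2 grants its single vote to whichever request reaches it first. If that is A3's request, A3 holds two votes, which satisfies both the majority (3/2+1 = 2) and the quorum, so A3 becomes the leader.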
2.6 Automatic Failover Steps
Slave S1 is promoted to master.
Slave S2 becomes a replica of the new master.
The former master, once recovered, joins as a replica.
Clients receive the new master address.
3. Redis Cluster
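Cluster adds sharding to distribute data across many nodes. In pure master‑slave replication every node holds the full dataset, so adding nodes adds no capacity; with sharding each master stores only part of the keys, and the cluster can be scaled online.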
3.1 Hash Slots
Redis Cluster divides the keyspace into 16 384 slots. A key’s slot is computed as CRC16(key) % 16384 (equivalently CRC16(key) & 0x3FFF). Each node owns a subset of slots; for three nodes the distribution could be 0‑5460, 5461‑10922, and 10923‑16383.
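The slot computation and the three‑node example can be checked with a short sketch. Several things here are assumptions for illustration: the function names are hypothetical, crc16_xmodem is a plain bitwise CRC16 (polynomial 0x1021, initial value 0) rather than the table‑driven routine Redis ships, simple_key_slot ignores hash tags, and range_start is just one rounding rule that reproduces the ranges quoted above, not necessarily how any tool assigns slots.

```c
#include <stdint.h>
#include <string.h>

#define SLOT_COUNT 16384

/* Bitwise CRC16, XMODEM variant: polynomial 0x1021, initial value 0. */
uint16_t crc16_xmodem(const char *buf, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((unsigned char)buf[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Slot for a whole key; hash tags ({...}) are deliberately not handled. */
unsigned int simple_key_slot(const char *key) {
    return crc16_xmodem(key, strlen(key)) & (SLOT_COUNT - 1);
}

/* First slot of node i when the slots are split into n near-equal,
 * contiguous ranges (boundaries rounded to the nearest slot). */
unsigned int range_start(unsigned int i, unsigned int n) {
    return (i * SLOT_COUNT + n / 2) / n;
}

/* Index of the node whose range contains `slot`. */
unsigned int owner_of(unsigned int slot, unsigned int n) {
    unsigned int i = 0;
    while (i + 1 < n && slot >= range_start(i + 1, n)) i++;
    return i;
}
```

For the key foo this yields slot 12182, which falls in the third range (10923‑16383) of the three‑node layout.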
3.2 Redirection
If a client contacts a node that does not own the key’s slot, the node returns:
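MOVED – the slot is served by another node; the client must retry there and should update its slot‑to‑node mapping.
ASK – the slot is being migrated and the key may already be on the target node; the client sends ASKING followed by the command to the target node, for this request only, without updating its mapping.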
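A client has to recognize both replies. The sketch below parses the error text, assuming the usual "MOVED <slot> <host>:<port>" and "ASK <slot> <host>:<port>" layout with the leading "-" of the error reply already stripped. The type and function names are hypothetical, and IPv6 hosts (which contain colons) are not handled.

```c
#include <stdio.h>
#include <string.h>

typedef enum { REDIR_NONE, REDIR_MOVED, REDIR_ASK } redir_kind;

typedef struct {
    redir_kind kind;
    int slot;
    char host[64];
    int port;
} redirect;

/* Parse an error payload such as "MOVED 3999 127.0.0.1:6381".
 * kind is REDIR_NONE for anything that is not a well-formed redirection;
 * the other fields are only meaningful when kind is MOVED or ASK. */
redirect parse_redirect(const char *err) {
    redirect r = { REDIR_NONE, -1, "", 0 };
    char word[8];
    /* keyword, slot number, host up to the colon, port */
    if (sscanf(err, "%7s %d %63[^:]:%d", word, &r.slot, r.host, &r.port) != 4)
        return r;
    if (strcmp(word, "MOVED") == 0) r.kind = REDIR_MOVED;
    else if (strcmp(word, "ASK") == 0) r.kind = REDIR_ASK;
    return r;
}
```

On REDIR_MOVED the client would record the new owner of the slot and retry; on REDIR_ASK it would send ASKING and then the command to the returned address once, leaving its mapping unchanged.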
3.3 Gossip Protocol
Nodes exchange health, slot ownership, and topology information via a gossip protocol (periodic PING, PONG, MEET, and FAIL messages). Because each message also carries what the sender knows about some other nodes, all nodes eventually converge on the same view of the cluster.
3.4 Cluster Failover
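When a master fails, one of its slaves is promoted to replace it. The process includes qualification checks (a slave whose data is too stale does not stand), election timing (slaves with a more up‑to‑date replication offset start their election earlier), vote collection from the remaining masters (a majority is required), and replacement of the failed master.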
3.5 Why 16 384 Slots?
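Each node advertises the slots it owns as a bitmap, unsigned char slots[REDIS_CLUSTER_SLOTS/8], which is carried in the header of its heartbeat messages. With 16 384 slots the bitmap occupies 2 KB (16 384 / 8 = 2 048 bytes), whereas 65 536 slots, the full range of CRC16, would require 8 KB per message. A cluster is not expected to grow beyond roughly 1 000 master nodes, and 16 384 slots already give each of them enough slots to balance. Using 8 192 slots would halve the bitmap but map more keys to each slot, making migration and balancing coarser, so 16 384 is a balanced choice.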
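The bitmap layout can be illustrated with a few helpers. This is a sketch with hypothetical names (slot_bitmap, bitmap_set, bitmap_test), assuming one bit per slot with slot n stored in bit n % 8 of byte n / 8.

```c
#include <stdbool.h>
#include <string.h>

#define SLOT_COUNT 16384

/* One bit per slot: 16384 / 8 = 2048 bytes. */
typedef struct {
    unsigned char slots[SLOT_COUNT / 8];
} slot_bitmap;

/* Mark every slot as not owned. */
void bitmap_clear(slot_bitmap *b) {
    memset(b->slots, 0, sizeof b->slots);
}

/* Mark `slot` as owned. */
void bitmap_set(slot_bitmap *b, unsigned int slot) {
    b->slots[slot >> 3] |= (unsigned char)(1u << (slot & 7));
}

/* True if `slot` is marked as owned. */
bool bitmap_test(const slot_bitmap *b, unsigned int slot) {
    return (b->slots[slot >> 3] & (1u << (slot & 7))) != 0;
}
```

Because the whole array is sent regardless of how many bits are set, its size is a fixed per‑message cost, which is why the slot count directly affects heartbeat bandwidth.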
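The slot computation also honors hash tags: when a key contains a non‑empty {tag} section, only the text between the first { and the following } is hashed, so related keys can be forced into the same slot.
unsigned int keyHashSlot(char *key, int keylen) {
    int s, e; /* start-end indexes of { and } */
    /* Look for the first '{'. */
    for (s = 0; s < keylen; s++)
        if (key[s] == '{') break;
    /* No '{' found: hash the whole key. */
    if (s == keylen) return crc16(key,keylen) & 0x3FFF;
    /* '{' found: look for a '}' to its right. */
    for (e = s+1; e < keylen; e++)
        if (key[e] == '}') break;
    /* No '}' or nothing between the braces: hash the whole key. */
    if (e == keylen || e == s+1) return crc16(key,keylen) & 0x3FFF;
    /* Hash only the text between '{' and '}'. */
    return crc16(key+s+1,e-s-1) & 0x3FFF;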
}