
Optimizing Throughput and Latency in Large‑Scale Redis Cluster Deployments

This article examines the performance challenges of a 700‑node Redis Cluster used by Youku, analyzes bandwidth and latency impacts of cluster communication, and presents practical tuning methods—including adjusting cluster‑node‑timeout, reducing replicas, disabling AOF, limiting cluster‑nodes commands, and tuning the hz parameter—to improve throughput and stability in massive Redis deployments.

High Availability Architecture

In Youku's "Blue Whale" project a Redis Cluster with over 700 nodes (approaching the author‑suggested limit of 1,000) is used as a full‑memory temporary storage system for cookies and big‑data calculations, where all data have expiration times.

As the cluster grew, bandwidth pressure and response time (RT) increased, revealing that Redis Cluster is more suitable for medium‑scale deployments and that communication overhead becomes a bottleneck at large scale.

Redis Cluster Architecture

Redis runs a single‑threaded event loop (Reactor pattern) in one process, so client request handling, cluster gossip messaging, slave synchronization, AOF persistence, and periodic background tasks all compete for the same CPU time.

The cluster uses a gossip protocol for state exchange: each node periodically (by default every second) sends a PING to a random peer carrying a 16,384‑bit slot bitmap (2 KB) plus 104‑byte status records for roughly one tenth of the cluster's nodes. At 700 nodes this means about 7 KB of node‑status data per exchange, for a total of roughly 9 KB per gossip message.
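The arithmetic behind those figures can be sketched as follows (a back‑of‑the‑envelope calculation assuming the 2 KB slot bitmap, 104‑byte gossip entries, and one‑tenth‑of‑cluster sampling described above):

```python
# Approximate size of one gossip PING in a 700-node Redis Cluster.
# Assumptions: the message header carries a 16,384-bit slot bitmap (2 KB),
# each gossip section describes one peer in ~104 bytes, and every message
# gossips about roughly 1/10 of the cluster's nodes.

SLOT_BITMAP_BYTES = 16384 // 8          # 2,048 bytes
GOSSIP_ENTRY_BYTES = 104                # per-peer status record
NODES = 700

gossiped_peers = NODES // 10            # ~70 peers per message
status_bytes = gossiped_peers * GOSSIP_ENTRY_BYTES   # ~7 KB of node status
message_bytes = SLOT_BITMAP_BYTES + status_bytes     # ~9 KB total

print(message_bytes)  # 9328 bytes, i.e. roughly 9 KB
```

Note that the per‑message size grows linearly with cluster size, which is why gossip overhead becomes a first‑order cost at hundreds of nodes.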

Experimental Measurement

An experiment with 704 nodes on 40 physical machines (16 nodes per host) measured inbound and outbound traffic on the cluster communication port over a 60‑second window. Results showed more than 2,700 packets per node, far exceeding the expected 60 (one PING per second), because nodes also proactively ping any peer they have not heard from within half the timeout to ensure timely state convergence.

Bandwidth per host was calculated by summing inbound and outbound traffic, yielding approximately 107.5 Mbit/s for cluster communication and about 45 Mbit/s for front‑end requests.
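A rough reconstruction of that per‑host figure is shown below. The exact accounting is not given in the article, so this sketch assumes the ~2,700 packets apply per node over the 60‑second window, 16 nodes per host, a ~9 KB gossip message, and counts both directions of traffic:

```python
# Hedged reconstruction of the ~107.5 Mbit/s per-host gossip bandwidth.
# All constants below are assumptions inferred from the article's numbers.

PACKETS_PER_NODE = 2700       # observed over the 60 s window (per node)
NODES_PER_HOST = 16
MESSAGE_BYTES = 9 * 1024      # ~9 KB per gossip message
WINDOW_S = 60

bytes_one_way = PACKETS_PER_NODE * NODES_PER_HOST * MESSAGE_BYTES
mbit_per_s = bytes_one_way * 2 * 8 / WINDOW_S / 1e6   # inbound + outbound

print(round(mbit_per_s, 1))   # on the order of 106 Mbit/s
```

The result lands close to the measured ~107.5 Mbit/s, which suggests the accounting above is at least the right order of magnitude.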

Increasing cluster-node-timeout from 20 s to 30 s reduced total bandwidth by ~50 Mbit/s, demonstrating the impact of timeout settings on network load.
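In current Redis versions, cluster-node-timeout is a runtime‑mutable setting, so a change like this can be rolled out without restarts. A hypothetical invocation (the host address is a placeholder) might look like:

```shell
# Hypothetical: raise cluster-node-timeout (in milliseconds) on a live node
# and persist the change to redis.conf; repeat for every node in the cluster.
NODE_HOST=10.0.0.1   # placeholder address
redis-cli -h "$NODE_HOST" -p 6379 CONFIG SET cluster-node-timeout 30000
redis-cli -h "$NODE_HOST" -p 6379 CONFIG REWRITE
```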

Key Findings

Cluster communication consumes a large portion of bandwidth in massive Redis deployments.

Raising cluster-node-timeout can significantly lower bandwidth usage, but overly large timeouts delay failover detection.

Fail‑over Timing

The time to mark a node as FAIL depends on cluster-node-timeout plus the time needed to gossip enough PFAIL reports to a majority of masters; with a 30 s timeout, a 700‑node cluster may take up to roughly 55 s to declare a node failed.
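The ~55 s worst case can be decomposed as a sketch. The assumptions here (not spelled out in the article) are that a peer is only force‑pinged after cluster-node-timeout/2 without contact, PFAIL is raised one full timeout after that ping goes unanswered, and gossip needs roughly 10 s more to collect a majority of PFAIL reports:

```python
# Hedged decomposition of worst-case failure-detection time at timeout = 30 s.

timeout_s = 30
forced_ping_gap = timeout_s / 2   # up to 15 s before the next forced PING
pfail_after = timeout_s           # 30 s of silence marks the peer PFAIL
gossip_majority_s = 10            # assumed time to gossip PFAIL to a majority

worst_case = forced_ping_gap + pfail_after + gossip_majority_s
print(worst_case)   # ~55 s
```

This is why the timeout cannot be raised indefinitely: every extra second of gossip savings is paid back in failover latency.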

Performance Tuning Recommendations

Increase cluster-node-timeout moderately to reduce gossip traffic, while balancing fail‑over latency.

Limit the number of replicas (use a 1‑master‑1‑slave configuration) to lower write propagation overhead.

Disable AOF persistence in high‑throughput scenarios and rely on periodic BGSAVE on slaves for durability.

Avoid frequent CLUSTER NODES commands; they can add ~100 KB of traffic per request at large scale.
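The ~100 KB figure follows from the line‑oriented CLUSTER NODES reply format, one line per node. The per‑line size below is an assumption (it varies with address length and slot ranges):

```python
# Rough size of a CLUSTER NODES reply at 700 nodes. Each line carries the
# node id, address, flags, epoch, link state, and slot ranges; ~150 bytes
# per line is an assumed average, not a measured value.

NODES = 700
BYTES_PER_LINE = 150   # assumption

reply_bytes = NODES * BYTES_PER_LINE
print(reply_bytes // 1024)   # ~102 KB per request
```

At even modest request rates, clients polling CLUSTER NODES for topology can therefore add meaningful load; caching the topology client‑side is the usual mitigation.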

Leave the hz parameter at its default (10) unless you have deep expertise, as changes have limited impact on throughput.
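The recommendations above can be summarized as a redis.conf fragment. The values are illustrative, not prescriptive; in particular the timeout should be tuned against your own failover‑latency budget:

```
# redis.conf sketch collecting the tuning recommendations above
cluster-enabled yes
cluster-node-timeout 30000   # higher timeout => less gossip traffic,
                             # but slower failure detection
appendonly no                # disable AOF; rely on BGSAVE on slaves
hz 10                        # keep the default unless you have deep expertise
```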

References include the official Redis Cluster tutorial and specification.

Tags: performance, scalability, High Availability, Redis, tuning, Redis Cluster
Written by

High Availability Architecture

Official account for High Availability Architecture.
