Redis Clustering Techniques and Codis: Architecture, Performance Comparison, and Practical Tips
This article reviews common Redis clustering methods, compares Twemproxy and Codis, presents Codis’s architecture and performance test results, and offers migration, HA, pipeline, and operational guidance for using Codis as a Redis distributed middleware solution.
1. Common Redis Clustering Techniques
Historically, Redis supported only single-instance deployment, and a single instance is practical only up to about 10-20 GB of memory. That cannot meet the demands of large-scale online services, and it leaves servers with 100-200 GB of memory underutilized.
To overcome single-node limitations, many internet companies have built their own clustering solutions. All of them shard data across multiple nodes, with each shard typically being one Redis instance.
Counting the official Redis Cluster that Redis now offers, there are three main clustering mechanisms:
1.1 Client‑Side Sharding
This approach places the sharding logic in the application code, which routes requests to multiple Redis instances based on predefined routing rules. It gives developers full control and flexibility but requires manual adjustment when instances are added or removed, making operations harder and less suitable for small teams without strong DevOps support.
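As a minimal sketch of what such a routing rule might look like, the snippet below uses hash-modulo routing over a fixed instance list. The addresses and the function names `route` and `moved_fraction` are hypothetical and only for illustration; real deployments often use consistent hashing instead. The second function shows why adding an instance is painful with this rule: most keys end up mapped to a different instance.

```python
import zlib

# Hypothetical instance addresses, for illustration only.
INSTANCES = ["10.0.0.1:6379", "10.0.0.2:6379", "10.0.0.3:6379"]


def route(key, instances=INSTANCES):
    """Pick the Redis instance for a key using hash-modulo routing."""
    index = zlib.crc32(key.encode("utf-8")) % len(instances)
    return instances[index]


def moved_fraction(keys, old_instances, new_instances):
    """Fraction of keys whose target instance changes between two layouts."""
    moved = sum(
        1 for k in keys
        if route(k, old_instances) != route(k, new_instances)
    )
    return moved / len(keys)


if __name__ == "__main__":
    keys = ["user:%d" % i for i in range(1000)]
    grown = INSTANCES + ["10.0.0.4:6379"]  # hypothetical fourth instance
    print(route("user:42"))
    print(moved_fraction(keys, INSTANCES, grown))
```

With modulo routing, every application instance must be updated with the new list at the same time, and the remapped keys must be moved by hand, which is the operational cost described above.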
1.2 Proxy Sharding
In this model, a dedicated proxy program handles sharding. The proxy receives client requests, applies routing rules, forwards them to the appropriate Redis instances, and returns the responses. This reduces the burden on application code and simplifies operations, though it introduces a performance overhead. Twemproxy is a widely used open‑source example of this approach.
1.3 Redis Cluster
Redis Cluster is a decentralized solution without a central proxy. It maps all keys to 16 384 slots and distributes the slots among the cluster nodes. If a client contacts a node that does not hold the requested key, it is automatically redirected to the correct node. While robust, Redis Cluster is more complex and, at the time of writing, sees limited adoption in production.
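The key-to-slot mapping can be sketched as follows. The details here come from the Redis Cluster specification rather than from this article: the slot is CRC16 (XMODEM variant) of the key modulo 16384, and if the key contains a non-empty `{...}` hash tag, only the tag is hashed. The function names are illustrative.

```python
def crc16_xmodem(data):
    """CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key):
    """Map a key to one of the 16384 Redis Cluster slots."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    for k in ("foo", "{user1000}.following", "{user1000}.followers"):
        print(k, key_slot(k))
```

Keys that share a hash tag land in the same slot, which is how multi-key operations can be kept on one node.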
2. Twemproxy and Its Limitations
Twemproxy, an open‑source proxy sharding solution from Twitter, forwards client requests to backend Redis servers based on routing rules. It solves single‑node capacity issues but introduces a single point of failure, requiring external high‑availability solutions like Keepalived.
Key pain points of Twemproxy include difficulty with smooth scaling (adding or removing Redis nodes) and lack of an operational control panel, making it cumbersome for operators.
3. Codis Practice
Codis, open-sourced by Wandou Labs in 2014, is a Redis distributed middleware written in Go and C that addresses Twemproxy's shortcomings and adds many useful features. Internal tests show that its stability meets high-availability requirements. Its performance has also improved over time: it started out about 20 % slower than Twemproxy and is now nearly 100 % faster under certain conditions.
3.1 Architecture
Codis introduces the concept of a Group, consisting of one Redis master and at least one slave. This enables seamless master‑failover via a dashboard without changing application configuration.
Codis modifies the Redis server source (the result is called Codis Server) to support hot data migration (auto-rebalance). It uses a pre-sharding scheme with 1 024 slots. The slot routing information is stored in ZooKeeper, which also maintains group information and provides distributed locks.
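A minimal sketch of pre-sharding, assuming the slot formula given in the Codis documentation (CRC32 of the key modulo 1024). The two-group layout, the group ids, and the function names are hypothetical. In a real cluster the routing table lives in ZooKeeper rather than in a Python dict.

```python
import zlib

NUM_SLOTS = 1024


def codis_slot(key):
    """Slot id for a key (bytes): CRC32(key) % 1024."""
    return zlib.crc32(key) % NUM_SLOTS


def initial_table():
    """Hypothetical layout: slots 0-511 on group 1, slots 512-1023 on group 2."""
    return {slot: (1 if slot < 512 else 2) for slot in range(NUM_SLOTS)}


def migrate_slots(table, first, last, target_group):
    """Return a new routing table with slots first..last moved to target_group."""
    updated = dict(table)
    for slot in range(first, last + 1):
        updated[slot] = target_group
    return updated


def group_for_key(key, table):
    """Look up which group currently serves a key."""
    return table[codis_slot(key)]


if __name__ == "__main__":
    table = initial_table()
    key = b"user:42"
    print(codis_slot(key), group_for_key(key, table))
    # Scale out: hand slots 768-1023 to a new, hypothetical group 3.
    table = migrate_slots(table, 768, 1023, 3)
    print(codis_slot(key), group_for_key(key, table))
```

The point of the design is that a key's slot never changes. Scaling only changes which group owns a slot, so data can be moved slot by slot while the proxy keeps serving requests.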
3.2 Performance Comparison Tests
Benchmark tests run over three months with redis-benchmark compared Codis and Twemproxy across value sizes from 16 B to 10 MB. Four physical servers were used, with the Codis and Twemproxy clusters deployed separately.
Results show that for Set operations with value length < 888 B, Codis outperforms Twemproxy, and for Get operations Codis consistently performs better. Graphs illustrate these findings.
3.3 Usage Tips and Precautions
Key practical tips include:
1) Seamless Migration from Twemproxy
Codis provides a Codis‑port tool to sync data from an existing Twemproxy setup to a Codis cluster, after which only the proxy address in the application configuration needs to be changed.
2) Java HA Support
Codis offers a Java client called Jodis that automatically detects and bypasses failed Codis proxies, ensuring high availability for Java applications.
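The idea of bypassing failed proxies can be illustrated with a small round-robin pool. This is not the Jodis API, only a sketch of the concept in Python. The class name `ProxyPool`, its methods, and the proxy addresses are all hypothetical, and the sketch leaves out how failures are detected.

```python
class ProxyPool:
    """Round-robin over proxy addresses, skipping any marked as down."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        self._down = set()
        self._next = 0

    def mark_down(self, proxy):
        """Record that a proxy is unavailable."""
        self._down.add(proxy)

    def mark_up(self, proxy):
        """Record that a proxy is healthy again."""
        self._down.discard(proxy)

    def pick(self):
        """Return the next healthy proxy, or raise if none is left."""
        count = len(self._proxies)
        for _ in range(count):
            proxy = self._proxies[self._next % count]
            self._next += 1
            if proxy not in self._down:
                return proxy
        raise RuntimeError("no healthy Codis proxy available")


if __name__ == "__main__":
    pool = ProxyPool(["proxy-a:19000", "proxy-b:19000", "proxy-c:19000"])
    pool.mark_down("proxy-b:19000")
    print([pool.pick() for _ in range(4)])
```

Because every Codis proxy can serve any key, the client is free to send a request to whichever healthy proxy it picks.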
3) Pipeline Support
Pipeline allows batching multiple requests, significantly boosting Set performance for values < 888 B and also improving Get performance, as shown in the benchmark graphs.
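To make the batching concrete, the sketch below encodes several commands in the Redis wire protocol (RESP) and concatenates them into one buffer, which a client would then send in a single write before reading all the replies. The function names are illustrative. In practice a client library's pipeline feature does this for you.

```python
def encode_command(*args):
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def build_pipeline(commands):
    """Concatenate several encoded commands into a single payload."""
    return b"".join(encode_command(*cmd) for cmd in commands)


if __name__ == "__main__":
    payload = build_pipeline([
        ("SET", "k1", "v1"),
        ("SET", "k2", "v2"),
        ("GET", "k1"),
    ])
    # One network write carries all three commands.
    print(payload)
```

The gain comes from paying for one round trip per batch instead of one per command, which matters most when values are small and network latency dominates.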
4) Codis Does Not Handle Master‑Slave Sync
Codis only maintains the list of Redis servers in each group. It does not guarantee data consistency between master and slave; operators must ensure that themselves. This division of responsibility keeps Codis lightweight and suitable for production.
5) Future Expectations
Users hope Codis will remain lightweight and improve pipeline performance for larger values, as current tests show a slowdown compared to Twemproxy for large payloads.
For more details, see the Codis source repository and documentation.
Art of Distributed System Architecture Design
Introductions to large-scale distributed system architectures; insights and knowledge sharing on large-scale internet system architecture; front-end web architecture overviews; practical tips and experiences with PHP, JavaScript, Erlang, C/C++ and other languages in large-scale internet system development.