
Design and Operation of Zhihu's Redis Platform: Architecture, High Availability, and Scaling

The article details Zhihu's internally built Redis platform, covering its architecture, instance types, high‑availability mechanisms, migration from client‑side sharding to Twemproxy, deployment on Kubernetes, scaling strategies, monitoring tools, and future upgrades, providing valuable insights for backend engineers.


Zhihu's storage platform team built a Redis platform on top of the open‑source Redis component, creating a complete automated operation service system that offers one‑click deployment, automatic scaling, fine‑grained monitoring, and traffic analysis.

Given Zhihu's massive daily traffic, the platform must provide stable, low‑latency service; it now runs about 70 TB of memory (≈40 TB used) and processes ~15 million requests per second on average (peak ~20 million), which works out to over a trillion requests per day, spread across ~800 clusters and ~16,000 Redis instances.

Instances are divided into Standalone (single‑node) and Cluster modes. Standalone instances use native master‑slave replication with Redis Sentinel for health checks and failover; on failover, Sentinel publishes a `+switch-master <master-name> <old-ip> <old-port> <new-ip> <new-port>` event that watchers subscribe to in order to update client connections.
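A watcher consuming that event only needs to split the payload into the master name and the old/new address pairs. A minimal sketch (the helper name and example addresses are illustrative, not from the article):

```python
def parse_switch_master(payload: str):
    """Parse the payload of a Sentinel +switch-master event.

    The payload has the form:
        "<master-name> <old-ip> <old-port> <new-ip> <new-port>"
    Returns (master_name, (old_ip, old_port), (new_ip, new_port)).
    """
    name, old_ip, old_port, new_ip, new_port = payload.split()
    return name, (old_ip, int(old_port)), (new_ip, int(new_port))

# Example payload as it would arrive on the +switch-master channel:
event = parse_switch_master("mymaster 10.0.0.1 6379 10.0.0.2 6379")
```

A real watcher would subscribe to Sentinel's Pub/Sub channel and, on each event, repoint client connection pools at the new master address.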

Operational notes for Standalone include setting slave-priority to 0 on replicas that must never be promoted to master, ensuring CONFIG REWRITE works after failover, limiting Sentinel groups to fewer than 300 monitored instances per group, and tuning timeout parameters (down-after-milliseconds = 30000; failover-timeout adjusted per workload).
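The parameters above map directly onto Sentinel and Redis configuration directives. A hypothetical fragment illustrating them (master name and addresses are placeholders):

```conf
# sentinel.conf -- monitor one master, quorum of 2
sentinel monitor mymaster 10.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 30000   # mark down after 30 s
sentinel failover-timeout mymaster 180000         # tune per workload

# redis.conf on a replica that must never be promoted
slave-priority 0
```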

When capacity exceeds 20 GB or throughput exceeds 200 k req/s, Cluster mode is used. Initially Zhihu employed client‑side sharding via redis-shard, which offered fast hashing but required a separate implementation per client language and suffered from migration, expansion, and connection‑count issues.
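The core of client-side sharding is just hash-modulo placement done inside every client, which is exactly why each language needed its own compatible implementation. A minimal sketch (not the actual redis-shard code; CRC32 stands in for whatever hash the real client uses, and nodes are labels rather than connections):

```python
import zlib

class ShardedClient:
    """Toy client-side sharding: hash the key, pick one of N nodes.

    A real client would hold a connection pool per node; every client
    in every language must use the identical hash for keys to land on
    the same shard.
    """
    def __init__(self, nodes):
        self.nodes = nodes

    def node_for(self, key: str) -> str:
        # Hash-modulo placement: changing len(nodes) remaps most keys,
        # which is why expansion/migration was painful.
        return self.nodes[zlib.crc32(key.encode()) % len(self.nodes)]

client = ShardedClient(["redis-0", "redis-1", "redis-2"])
shard = client.node_for("user:42")
```

The modulo step also exposes the scheme's weakness: adding a node changes the divisor and relocates most keys, forcing a bulk migration.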

In 2015 Zhihu switched to Twemproxy (nutcracker) as its proxy‑based clustering solution. Twemproxy provides high performance, multiple hash algorithms (fnv1a_64, murmur, md5), and supports both consistent hashing (ketama) and modula hashing. Zhihu runs it in two deployment modes:

Storage mode: uses fnv1a_64 + modula hashing, each shard runs a master‑slave pair with Sentinel for HA, no auto_eject_hosts so failed nodes are not automatically removed.

Cache mode: uses fnv1a_64 + ketama consistent hashing, each shard has a single master, and auto_eject_hosts removes nodes after a configurable number of failures (default 3 failures, retry after 10 minutes).
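Both modes can be expressed as Twemproxy server pools. A hypothetical nutcracker.yml sketch of the two configurations described above (pool names, ports, and addresses are placeholders):

```yaml
storage_pool:                    # Storage mode: fixed placement, Sentinel handles HA
  listen: 0.0.0.0:22121
  redis: true
  hash: fnv1a_64
  distribution: modula
  auto_eject_hosts: false        # never drop a failed shard automatically
  servers:
    - 10.0.0.1:6379:1 shard0
    - 10.0.0.2:6379:1 shard1

cache_pool:                      # Cache mode: consistent hashing, eject on failure
  listen: 0.0.0.0:22122
  redis: true
  hash: fnv1a_64
  distribution: ketama
  auto_eject_hosts: true
  server_failure_limit: 3        # eject after 3 consecutive failures
  server_retry_timeout: 600000   # ms; retry the ejected node after 10 minutes
  servers:
    - 10.0.1.1:6379:1 cache0
    - 10.0.1.2:6379:1 cache1
```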

Twemproxy was first deployed on fixed physical machines with an Agent for health checks; later it was containerized and run on Kubernetes, initially using DNS A‑records for service discovery. As the number of instances grew past what a single UDP DNS response can carry (roughly 20 A‑records), multiple Twemproxy groups, each with its own DNS record, were introduced.

To overcome Twemproxy's single‑CPU bottleneck, the team added SO_REUSEPORT support, launching multiple Twemproxy processes inside one container that share the same port, letting the OS load‑balance connections. A starter process aggregates stats from each instance and forwards signals.
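The effect of SO_REUSEPORT is easy to demonstrate: several sockets (in Twemproxy's case, several processes) bind the same port, and the kernel distributes incoming connections among them. A minimal Linux-only sketch, here with two sockets in one process for illustration:

```python
import socket

def bound_reuseport_socket(port: int) -> socket.socket:
    """Create a TCP listener with SO_REUSEPORT set before bind().

    With the option set, multiple listeners can share one port and the
    kernel load-balances accepted connections across them.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen()
    return s

a = bound_reuseport_socket(0)        # let the OS pick a free port
port = a.getsockname()[1]
b = bound_reuseport_socket(port)     # second listener on the SAME port;
                                     # without SO_REUSEPORT this bind()
                                     # would raise "Address already in use"
```

Each Twemproxy process in the container does the equivalent at startup, so no user-space load balancer is needed in front of them.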

The official Redis Cluster was not adopted because its synchronous MIGRATE command can block the server, its HA model is limited to master‑slave, its gossip‑based node communication adds extra traffic, and slot‑mapping storage overhead can be significant.

Scaling is handled via static and dynamic methods. Static scaling adjusts maxmemory when free memory is available. Dynamic scaling involves resharding: a custom migration tool attaches to the old cluster as a replica, uses SYNC to receive an RDB dump, converts each key to a RESTORE command, and pipelines those commands to the new cluster, followed by canary testing and a configuration reload.
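The "convert each key to a RESTORE command" step boils down to RESP encoding: each RDB entry's key, TTL, and DUMP-format serialized value become one RESTORE command on the wire. A sketch of that encoding step only (the function name is illustrative; the payload below is a dummy byte string, not a valid DUMP blob):

```python
def encode_restore(key: bytes, ttl_ms: int, payload: bytes,
                   replace: bool = True) -> bytes:
    """Encode a RESTORE command in RESP (array of bulk strings).

    `payload` is the DUMP-format serialized value taken from the RDB
    stream; ttl_ms of 0 means no expiry. REPLACE overwrites any key
    already present on the target shard.
    """
    parts = [b"RESTORE", key, str(ttl_ms).encode(), payload]
    if replace:
        parts.append(b"REPLACE")
    out = [b"*%d\r\n" % len(parts)]          # array header
    for p in parts:
        out.append(b"$%d\r\n%s\r\n" % (len(p), p))  # bulk string per arg
    return b"".join(out)

cmd = encode_restore(b"k", 0, b"\x00\x03abc")
```

A migration tool would write thousands of such commands into one pipelined socket to the target cluster, which is what makes the resharding fast.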

For monitoring, a libpcap‑based bypass analysis tool captures traffic and performs protocol analysis, providing a low‑impact alternative to the built‑in MONITOR command and also works with Twemproxy.
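After libpcap capture and TCP reassembly, the analysis step is decoding RESP: every Redis request is an array of bulk strings. A simplified decoder for one complete, well-formed request (real traffic needs buffering across packets, and this naive split would break on bulk strings that themselves contain CRLF):

```python
def parse_resp_command(buf: bytes):
    """Decode one RESP array-of-bulk-strings request from captured bytes.

    Wire format: "*<n>\r\n" then n of "$<len>\r\n<bytes>\r\n".
    Simplification: splits on CRLF, so arguments containing CRLF
    are not handled -- enough to illustrate the analysis step.
    """
    lines = buf.split(b"\r\n")
    assert lines[0].startswith(b"*"), "not a RESP array"
    n = int(lines[0][1:])
    args, i = [], 1
    for _ in range(n):
        assert lines[i].startswith(b"$"), "expected bulk string header"
        args.append(lines[i + 1])
        i += 2
    return args

args = parse_resp_command(b"*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n")
```

Aggregating the decoded commands by name and key pattern yields the traffic analysis without ever touching the Redis server, unlike MONITOR.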

Future work includes upgrading most instances to Redis 4.0/5.0 to leverage new commands, modules, and LFU eviction policies.
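The LFU policies mentioned are enabled per instance through eviction settings introduced in Redis 4.0. A hypothetical redis.conf fragment (values shown are Redis's defaults, not tuning from the article):

```conf
# redis.conf (4.0+): evict the approximately least-frequently-used keys
maxmemory-policy allkeys-lfu
lfu-log-factor 10    # counter saturation curve
lfu-decay-time 1     # minutes before an access counter decays
```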

The Zhihu infrastructure team maintains core components such as containers, Redis, MySQL, Kafka, load balancers, and HBase, and is recruiting engineers interested in high‑availability architecture.

Tags: backend, high availability, Kubernetes, Redis, scaling, Twemproxy
Written by High Availability Architecture (official account).