How Pika’s Native Distributed Cluster Overcomes Redis Capacity Limits
This article explains Pika's native distributed cluster architecture: the deployment structure, table- and slot-based data distribution, the request processing flow, both non-consistent and Raft-based log replication, and the enhanced metadata management that enables scalable, highly available storage beyond single-node Redis limits.
Background
Pika is a persistent, large‑capacity Redis‑compatible storage service that supports most string, hash, list, zset, and set interfaces, addressing the memory bottleneck of Redis when handling massive data volumes. To meet growing demand for distributed clusters, the native Pika cluster (v3.4) was released.
Cluster Deployment Structure
The example shows a three-node Pika cluster. Deployment involves four steps:

1. Deploy an Etcd cluster to store Pika Manager metadata.
2. Install Pika Manager on each of the three physical machines; every instance registers with Etcd and competes to become leader, and only the leader writes cluster metadata.
3. Deploy Pika nodes on the three machines and register their information with the Pika Manager.
4. Register the Pika service ports with LVS for load balancing.
Data Distribution
Pika introduces the concept of tables to isolate business data; keys are hashed to slots, each slot having multiple replicas forming a replication group. One replica acts as leader, handling read/write operations, while followers replicate data. The manager can schedule slot migration for balanced load and horizontal scaling.
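The key-to-slot mapping can be sketched as below. The CRC32 hash and the slot count of 1024 are illustrative assumptions, not Pika's documented internals; in Pika the slot count is chosen per table at creation time.

```python
import zlib

# Hypothetical slot count; in Pika this is configured per table at creation.
NUM_SLOTS = 1024

def slot_for_key(key: bytes, num_slots: int = NUM_SLOTS) -> int:
    """Hash a key to a slot index. CRC32 is assumed here purely for
    illustration; the actual hash function is a Pika implementation detail."""
    return zlib.crc32(key) % num_slots
```

Because the mapping is deterministic, every node routes the same key to the same slot, and the Manager only needs to track which node owns each slot.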
Pika uses RocksDB as the storage engine; each slot creates a RocksDB instance supporting all Redis data structures. However, creating many slots can lead to excessive RocksDB instances and resource consumption, which future versions aim to mitigate.
Data Processing
1. The parsing layer interprets the Redis protocol and passes the parsed command to the router.
2. The router hashes the key to its slot and checks whether the slot resides on the local node.
3. If the slot is remote, a forwarding task is created and the request is sent to the peer node; the response is returned to the client once the peer has processed it.
4. If the slot is local, the request is processed directly.
A write request first generates a binlog entry through the replication manager, which asynchronously ships it to the other replicas of the slot; the leader replica then applies the write to the database through Blackwidow, Pika's storage layer that encapsulates RocksDB and maps Redis data structures onto it.
Clients interact with the cluster without needing to be aware of an external proxy, and Pika service ports can be load‑balanced through LVS.
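The local-versus-remote routing decision above can be sketched as follows. The slot count, the `SLOT_TO_NODE` table, and the string results standing in for processing and forwarding are all hypothetical; Pika's real router works against its internal slot metadata.

```python
import zlib

NUM_SLOTS = 16  # small hypothetical slot count for the sketch

# Hypothetical routing table: slot index -> owning node address.
SLOT_TO_NODE = {s: f"node{s % 3}:9221" for s in range(NUM_SLOTS)}
LOCAL_NODE = "node0:9221"

def handle(key: bytes, command: str) -> str:
    slot = zlib.crc32(key) % NUM_SLOTS
    owner = SLOT_TO_NODE[slot]
    if owner == LOCAL_NODE:
        # Slot is local: process the request directly.
        return f"processed {command} locally in slot {slot}"
    # Slot is remote: create a task and forward the request to the peer node;
    # the peer's response is relayed back to the client.
    return f"forwarded {command} for slot {slot} to {owner}"
```

Because any node can forward, clients may connect to any node in the cluster, which is what makes the external proxy unnecessary.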
Log Replication
Non‑Consistent Log Replication
In this mode, the processing thread writes the binlog and updates the database immediately, then returns the response to the client. An auxiliary thread sends a BinlogSync request to follower slots, which acknowledge with BinlogSyncAck.
1. The processing thread receives the client request, takes the lock, writes the binlog, and updates the DB.
2. The processing thread returns the response to the client.
3. An auxiliary thread sends a BinlogSync request to the follower slots.
4. The followers return BinlogSyncAck responses.
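The ordering above — respond first, replicate later — can be sketched with a queue and a background thread. All names are illustrative, not Pika's actual APIs, and slot locking is elided.

```python
import threading
from queue import Queue

binlog = []            # leader's binlog
db = {}                # leader's database
follower_binlog = []   # a single follower's binlog (stand-in for the network)
replication_queue = Queue()

def replicate_worker():
    # Auxiliary thread: ship binlog entries to the follower (BinlogSync);
    # appending here stands in for the follower applying and acking.
    while True:
        entry = replication_queue.get()
        if entry is None:
            break
        follower_binlog.append(entry)

def write(key, value):
    # Processing thread: write binlog, update DB, then respond immediately.
    binlog.append((key, value))
    db[key] = value
    replication_queue.put((key, value))
    return "OK"  # the client is answered before any follower acks

worker = threading.Thread(target=replicate_worker)
worker.start()
write("k1", "v1")
replication_queue.put(None)
worker.join()
```

The trade-off is visible in the sketch: the client's `OK` does not wait on the follower, so a leader crash before replication can lose acknowledged writes.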
Consistent (Raft) Log Replication
Here, the processing thread writes the binlog and sends a BinlogSync request to followers. The request is committed only after a majority of followers acknowledge, ensuring consistency before writing to the database.
1. The processing thread writes the request to the binlog file.
2. A BinlogSync request is sent to the followers.
3. The followers return BinlogSyncAck responses.
4. After a majority of acknowledgments is received, the request is applied to the DB.
5. The response is returned to the client.
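The majority-commit rule in the steps above can be sketched as follows. The function name, the callable followers, and the ack transport are assumptions for illustration; only the counting rule reflects the described protocol.

```python
def commit_with_majority(entry, followers, apply_to_db):
    """Apply `entry` to the DB only once a majority of the replication
    group (leader included) has acknowledged the binlog entry."""
    acks = 1  # the leader's own binlog write counts toward the majority
    for follower in followers:
        if follower(entry):  # send BinlogSync; True stands for BinlogSyncAck
            acks += 1
    needed = (len(followers) + 1) // 2 + 1  # strict majority of the group
    if acks >= needed:
        apply_to_db(entry)   # write to the DB only after commit
        return "OK"          # then answer the client
    return "RETRY"           # no majority: the entry stays uncommitted
```

For example, with a leader and three followers the majority is three, so the write commits as long as at least two followers acknowledge it.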
Cluster Metadata Handling
Based on a customized Codis‑dashboard, the Pika Manager (PM) serves as the global control node, storing cluster metadata and routing information.
- Supports creating multiple tables to isolate business data.
- Allows specifying the slot count and replica count per table.
- Replaces Codis's group concept with slot-level replication groups.
- Enables table-level password authentication.
- Supports slot migration for scaling.
- Integrates a sentinel module that monitors node health and promotes the most up-to-date follower to leader when needed.
- Persists metadata in Etcd for high availability.
- Achieves PM high availability through lock competition in Etcd.
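The lock-competition pattern can be sketched in-process. In the real cluster the lock lives in Etcd (typically a lease-backed key), which the `threading.Lock` below merely simulates; the instance names and role strings are illustrative.

```python
import threading

# Simulated leader election: every PM instance races for one shared lock.
# In production this lock would be an Etcd key held under a lease, so a
# crashed leader's lock expires and a standby can take over.
election_lock = threading.Lock()
roles = {}

def pm_instance(name: str) -> None:
    if election_lock.acquire(blocking=False):
        roles[name] = "leader"    # only this instance writes cluster metadata
    else:
        roles[name] = "standby"   # waits to take over if the leader fails

for pm in ("pm1", "pm2", "pm3"):
    pm_instance(pm)
```

Exactly one instance wins the race; the others hold no state of their own, since all metadata is in Etcd, so failover is just a new round of lock competition.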
Afterword
The native Pika cluster removes the single‑node disk capacity limitation, allowing horizontal scaling according to business needs. Remaining issues include the lack of an internal Raft‑based leader election, range‑based data distribution, and monitoring dashboards, which will be addressed in future releases.
360 Zhihui Cloud Developer
360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.