WPaxos: A Production‑Grade Java Implementation of Multi‑Paxos for Distributed Consensus
WPaxos is an open‑source Java implementation of the Multi‑Paxos distributed consensus algorithm. It offers high performance, strong consistency, fault tolerance, and extensibility for data‑intensive systems. This article covers its architecture, features, performance benchmarks, and future development plans.
WPaxos was released by 58.com and targets high performance, high reliability, and high scalability in data‑intensive systems.
Key Features
High performance: combines Multi‑Paxos with Basic‑Paxos and supports multiple Paxos groups, each determining its own ordered sequence of values.
Fast data alignment via checkpoint or per‑record streaming.
Tolerance of network partitions, with high availability as long as only a minority of nodes fail.
Automatic master election.
Dynamic, secure node addition and removal via the Paxos protocol.
Highly extensible storage and asynchronous communication modules.
Multiple state machines per Paxos instance.
Incremental checksum verification for submitted data.
Support for follower nodes used only for backup.
Configurable log cleanup by hold count, hold time, or disk usage.
Project Background
Data‑intensive systems often use replication and partitioning to achieve high performance, high reliability, and high scalability. In an asynchronous communication environment, however, network failures, machine crashes, and clock drift can cause data to be lost or duplicated. Achieving strong consistency across distributed nodes is a hard problem in the industry, and consensus algorithms such as Paxos provide a principled solution.
Paxos Basics
The original Paxos algorithm, introduced by Leslie Lamport, defines three roles: Proposer, Acceptor, and Learner. A Proposer interacts with a majority of Acceptors in two phases (prepare and accept) to reach consensus on a value. Learners pick up chosen values so that lagging nodes can catch up. Once a value is chosen, it cannot be altered.
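The two phases can be pictured with a minimal single‑decree sketch. Everything here (the class names PaxosSketch, Acceptor, Promise, and the propose helper) is a hypothetical illustration written for this article, not the WPaxos or PhxPaxos API; it runs in one process with no network, storage, or Learner.

```java
import java.util.List;

/** Single-decree Paxos sketch; all names are illustrative, not WPaxos APIs. */
final class PaxosSketch {

    /** An acceptor's reply to a prepare request. */
    static final class Promise {
        final boolean ok;
        final long acceptedBallot;   // -1 if nothing has been accepted yet
        final String acceptedValue;  // null if nothing has been accepted yet

        Promise(boolean ok, long acceptedBallot, String acceptedValue) {
            this.ok = ok;
            this.acceptedBallot = acceptedBallot;
            this.acceptedValue = acceptedValue;
        }
    }

    static final class Acceptor {
        private long promisedBallot = -1;  // highest ballot promised so far
        private long acceptedBallot = -1;  // ballot of the last accepted proposal
        private String acceptedValue = null;

        /** Phase 1: promise to ignore lower ballots and report any accepted value. */
        synchronized Promise onPrepare(long ballot) {
            if (ballot <= promisedBallot) {
                return new Promise(false, acceptedBallot, acceptedValue);
            }
            promisedBallot = ballot;
            return new Promise(true, acceptedBallot, acceptedValue);
        }

        /** Phase 2: accept unless a higher ballot has already been promised. */
        synchronized boolean onAccept(long ballot, String value) {
            if (ballot < promisedBallot) {
                return false;
            }
            promisedBallot = ballot;
            acceptedBallot = ballot;
            acceptedValue = value;
            return true;
        }
    }

    /** Runs both phases against all acceptors; returns the chosen value, or null on failure. */
    static String propose(List<Acceptor> acceptors, long ballot, String value) {
        int quorum = acceptors.size() / 2 + 1;
        int promises = 0;
        long highestAccepted = -1;
        String candidate = value;
        for (Acceptor a : acceptors) {
            Promise p = a.onPrepare(ballot);
            if (!p.ok) {
                continue;
            }
            promises++;
            // A value that was already accepted must be re-proposed instead of our own.
            if (p.acceptedValue != null && p.acceptedBallot > highestAccepted) {
                highestAccepted = p.acceptedBallot;
                candidate = p.acceptedValue;
            }
        }
        if (promises < quorum) {
            return null;
        }
        int accepts = 0;
        for (Acceptor a : acceptors) {
            if (a.onAccept(ballot, candidate)) {
                accepts++;
            }
        }
        return accepts >= quorum ? candidate : null;
    }
}
```

With three acceptors, proposing "A" with ballot 1 chooses "A". A later proposal of "B" with ballot 2 still returns "A", because the prepare phase discovers the accepted value and re-proposes it; this is the sense in which a chosen value cannot be altered.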
WPaxos Improvements
WPaxos builds on the production‑grade PhxPaxos library (C++) and introduces several optimizations:
Leader election with optional master‑driven proposals.
Skipping the prepare phase after a successful proposal, as long as no timeout or conflict occurs, which reduces network round trips and disk writes.
Multi‑Group‑Paxos supporting parallel value determination across groups.
Asynchronous batch merging of concurrent requests for high throughput.
Customizable state machine, storage, and asynchronous communication modules.
Flexible follower nodes that do not participate in voting.
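The prepare‑skipping optimization in the list above can be sketched as a small piece of proposer state. This is a simplified illustration under stated assumptions, not WPaxos code: the class PrepareSkipSketch and its methods are hypothetical, the prepare phase is assumed to always succeed when it runs, and the acceptOk flag stands in for whether the accept phase reached a quorum without timeout or conflict.

```java
/** Tracks whether a proposer may skip the prepare phase; hypothetical names. */
final class PrepareSkipSketch {
    private boolean canSkipPrepare = false; // true only after an uncontested successful round
    private int prepareRounds = 0;          // how many times phase 1 actually ran

    /**
     * Proposes one value. acceptOk simulates the outcome of the accept phase:
     * false models a timeout or a rejection caused by a competing proposer.
     */
    boolean propose(boolean acceptOk) {
        if (!canSkipPrepare) {
            // Phase 1 costs one extra network round trip plus the promise writes.
            prepareRounds++;
        }
        // Phase 2 (accept) always runs. Its outcome decides the next round's path:
        // after a failure the proposer falls back to a full prepare.
        canSkipPrepare = acceptOk;
        return acceptOk;
    }

    int prepareRounds() {
        return prepareRounds;
    }
}
```

In this model, three consecutive successful proposals run the prepare phase only once. A failed accept does not add a prepare round itself, but forces one on the next proposal.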
Architecture
The core Multi‑Paxos algorithm follows the PhxPaxos design. Each node runs multiple Paxos groups, and each group hosts a Paxos instance that determines values in order of increasing instance ID. The storage module persists accepted instances in a physical log file, indexed by an IndexDB (LevelDB by default), which enables replay and recovery.
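The split between a physical log and an index can be illustrated with an in‑memory sketch. It is an assumption‑laden stand‑in, not the WPaxos storage module: LogStoreSketch and its methods are hypothetical names, a ByteArrayOutputStream plays the role of the log file, and a TreeMap plays the role of the IndexDB.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Append-only log plus an index from instance ID to log position; hypothetical names. */
final class LogStoreSketch {
    // Stands in for the physical log file: values are only ever appended.
    private final ByteArrayOutputStream log = new ByteArrayOutputStream();
    // Stands in for the IndexDB: instanceId -> {offset, length} within the log.
    private final TreeMap<Long, int[]> index = new TreeMap<>();

    /** Appends the accepted value to the log and records where it landed. */
    void append(long instanceId, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        int offset = log.size();
        log.write(bytes, 0, bytes.length);
        index.put(instanceId, new int[] {offset, bytes.length});
    }

    /** Looks up the position in the index, then reads that slice of the log. */
    String read(long instanceId) {
        int[] pos = index.get(instanceId);
        if (pos == null) {
            return null;
        }
        return new String(log.toByteArray(), pos[0], pos[1], StandardCharsets.UTF_8);
    }

    /** Replays values from fromInstanceId upward in instance order, as recovery would. */
    List<String> replayFrom(long fromInstanceId) {
        List<String> out = new ArrayList<>();
        for (Long id : index.tailMap(fromInstanceId, true).keySet()) {
            out.add(read(id));
        }
        return out;
    }
}
```

Keeping the index small (only positions, not values) is what lets the index backend be swapped without rewriting the log itself.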
Performance Testing
Test environment: 3 nodes, each with 20 × Intel Xeon Silver 4114 CPUs, 192 GB RAM, SSD storage, 10 GbE network.
Two scenarios were evaluated: non‑batch proposals and batch proposals, with data sizes of 100 B and 2 KB. Results show that batch merging significantly improves QPS, and that file‑based indexing outperforms LevelDB when the number of groups is small.
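One way to see why batch merging raises QPS is that each proposal pays a fixed consensus cost, so merging several requests into one proposal amortizes that cost. The sketch below is a synchronous, single‑threaded simplification with hypothetical names (BatchMergeSketch, maxBatch); WPaxos merges requests asynchronously, and its actual batching parameters are not described in this article.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Merges pending requests into one proposal per batch; hypothetical names. */
final class BatchMergeSketch {
    private final List<String> pending = new ArrayList<>();
    private final int maxBatch;  // assumed size threshold that triggers a proposal
    private int proposals = 0;   // number of consensus rounds that would be run

    BatchMergeSketch(int maxBatch) {
        this.maxBatch = maxBatch;
    }

    /** Queues a request and proposes the batch once it is full. */
    void submit(String request) {
        pending.add(request);
        if (pending.size() >= maxBatch) {
            flush();
        }
    }

    /** Turns everything pending into a single proposal and returns its contents. */
    List<String> flush() {
        if (pending.isEmpty()) {
            return Collections.emptyList();
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        proposals++; // one consensus round covers the whole batch
        return batch;
    }

    int proposals() {
        return proposals;
    }
}
```

With maxBatch set to 4, ten submitted requests followed by a final flush result in three proposals instead of ten.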
Future Plans
Master load‑balancing strategy across nodes to improve throughput and stability.
Linearizable read strategy that returns the latest data whether a read is served by the master or by a slave.
Contribution & Feedback
Developers are invited to review the source code at https://github.com/wuba/WPaxos and submit issues or pull requests for suggestions and bug reports.
Authors
Li Dan – Backend architect at 58.com, responsible for distributed message queue and lock services.
Li Yan – Senior backend engineer at 58.com, focusing on distributed messaging and lock services.
References
PhxPaxos source: https://github.com/Tencent/phxpaxos
PhxPaxos wiki: https://github.com/Tencent/phxpaxos/wiki
Multi‑Paxos and Leader article: https://zhuanlan.zhihu.com/p/21466932
State machine checkpoint details: https://github.com/Tencent/phxpaxos/wiki/%E7%8A%B6%E6%80%81%E6%9C%BACheckpoint%E8%AF%A6%E8%A7%A3
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.