Design and Implementation of eBay's Next‑Generation Million‑TPS Core Accounting System
This article describes the next-generation core accounting system that eBay designed and built between 2018 and 2020, a system intended to handle millions of transactions per second. It covers the system goals, multi-region deployment, stress-test results, event sourcing, Raft-based consensus, scalability optimizations, and the planned open-source release.
1. Introduction
In 2018 eBay began building a next-generation payment accounting system. The aims were to sustain ultra-high TPS workloads, strengthen disaster recovery, and give the industry a reference design.
2. System Goals
The goals fall into four groups. Business: full payment functionality with extensible interfaces. Disaster recovery: a three-region, five-center deployment with millisecond-level data synchronization. Performance: automatic linear scaling, with enough capacity for global peak traffic. Data: a real-time data hub and blockchain-style tamper-proofing.
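"Blockchain-style tamper-proofing" is commonly realized with a hash chain, where each record stores the hash of its predecessor. The sketch below illustrates that general idea only; the source does not describe eBay's actual scheme, and `GENESIS`, `entry_hash`, `append_entry`, and `verify_chain` are hypothetical names chosen for this example.

```python
import hashlib
import json

# Assumed placeholder "previous hash" for the first entry in the chain.
GENESIS = "0" * 64


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Append a payload, linking it to the hash of the current last entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify_chain(chain):
    """Return True only if every link and every stored hash is still consistent."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Changing any stored payload invalidates its own hash and every link after it, so an auditor who holds only the latest hash can detect edits to history.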
3. Million-TPS Stress Test
The stress test covered single-node, same-region three-center, and two-region three-center configurations, with per-node throughput on the order of 10,000 TPS. Measured results were up to 11,000 TPS on a single node, 9,500 TPS for same-region three-center, and 7,800 TPS for two-region three-center. Occasional Raft leader elections caused brief drops in TPS.
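To relate the per-configuration figures to the million-TPS headline, the arithmetic below estimates how many independent groups would be needed to reach 1,000,000 TPS. It assumes perfectly linear scaling across groups (the stated performance goal, not a measured result), and `groups_needed` is a hypothetical helper.

```python
TARGET_TPS = 1_000_000

# Throughput figures quoted in the stress-test summary above.
MEASURED_TPS = {
    "single-node": 11_000,
    "same-region three-center": 9_500,
    "two-region three-center": 7_800,
}


def groups_needed(target_tps, per_group_tps):
    """Ceiling division: the smallest group count whose combined TPS meets the target."""
    return (target_tps + per_group_tps - 1) // per_group_tps


if __name__ == "__main__":
    for name, tps in MEASURED_TPS.items():
        print(f"{name}: {groups_needed(TARGET_TPS, tps)} groups")
```

Under that assumption the estimate is 91, 106, and 129 groups respectively. Real deployments would need headroom on top of this for leader elections and uneven load.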
4. Architecture Analysis
The architecture has four layers (business, core accounting, infrastructure, and storage) plus a monitoring layer. It combines event sourcing, CQRS, and a custom Raft-based strong-consistency algorithm. In the three-region, five-center deployment, the system tolerates the failure of any two nodes.
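The two-failure tolerance follows from majority quorums: with five voters, three form a majority, so two can be lost. The sketch below shows the quorum arithmetic and the textbook Raft rule for advancing the commit index. It describes standard Raft, not eBay's custom variant, and it omits Raft's additional check that the committed entry belongs to the leader's current term. The function names are hypothetical.

```python
def quorum(n):
    """Smallest number of nodes that forms a strict majority of n voters."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many voters can fail while a majority is still reachable."""
    return n - quorum(n)


def commit_index(match_index):
    """Highest log index replicated on a majority.

    match_index holds, per voter (leader included), the highest log index
    known to be stored on that voter.
    """
    ranked = sorted(match_index, reverse=True)
    return ranked[quorum(len(match_index)) - 1]
```

For example, with five voters whose match indexes are `[10, 9, 7, 3, 0]`, index 7 is stored on three of them and is therefore committed even though two voters lag far behind or are down.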
5. Event Sourcing & CQRS
Commands are converted into events and persisted in an event store; a state machine replays those events to maintain the current state. CQRS separates the write path (commands and events) from the read path, so read-side views can be aggregated in real time through Kafka and relational databases.
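The command, event, replay, and read-model flow described above can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not eBay's implementation: the `Transfer` command, the `Transferred` event, the overdraft-exempt `EXTERNAL` funding account, and the `VolumeView` projection are all hypothetical, and plain Python lists and dicts stand in for the event store, Kafka, and the relational read database.

```python
from dataclasses import dataclass

EXTERNAL = "external"  # hypothetical funding account, allowed to go negative


@dataclass(frozen=True)
class Transfer:  # command: a request that may still be rejected
    src: str
    dst: str
    amount: int  # minor currency units


@dataclass(frozen=True)
class Transferred:  # event: an immutable fact that already happened
    seq: int
    src: str
    dst: str
    amount: int


def apply_event(balances, event):
    """State-machine transition: fold one event into the balances."""
    balances[event.src] = balances.get(event.src, 0) - event.amount
    balances[event.dst] = balances.get(event.dst, 0) + event.amount


class Ledger:
    """Write side: validate a command, append an event, update state."""

    def __init__(self):
        self.events = []    # in-memory stand-in for the event store
        self.balances = {}

    def handle(self, cmd):
        funded = cmd.src == EXTERNAL or self.balances.get(cmd.src, 0) >= cmd.amount
        if cmd.amount <= 0 or not funded:
            raise ValueError("command rejected")
        event = Transferred(len(self.events), cmd.src, cmd.dst, cmd.amount)
        self.events.append(event)
        apply_event(self.balances, event)
        return event


def replay(events):
    """Rebuild balances from scratch using only the event log."""
    balances = {}
    for event in events:
        apply_event(balances, event)
    return balances


class VolumeView:
    """Read side: a separate projection that answers a different query."""

    def __init__(self):
        self.sent = {}

    def project(self, event):
        self.sent[event.src] = self.sent.get(event.src, 0) + event.amount
```

Because the log is the source of truth, `replay(ledger.events)` reproduces `ledger.balances` exactly, and new read models such as `VolumeView` can be built later by projecting the same events without touching the write path.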
6. Implementation Details
The business layer is stateless and written in Java. The core accounting and infrastructure layers are stateful and written in C++17 using functional-programming techniques. Auxiliary components are written in Go. Throughput optimizations rely on pipelining and batch processing; latency optimizations rely on network tuning and single-pass Raft synchronization.
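Batching raises throughput by amortizing one replication round over many log entries. The sketch below shows only that general idea, in Python for brevity even though the production core is C++17; `Batcher`, `max_batch`, and the `replicate` callback are hypothetical, and a real system would also flush on a timer so a half-full batch does not wait indefinitely.

```python
class Batcher:
    """Collect entries and hand them to `replicate` in groups."""

    def __init__(self, replicate, max_batch):
        self.replicate = replicate  # callable taking a list of entries
        self.max_batch = max_batch
        self.pending = []

    def add(self, entry):
        """Queue one entry; flush automatically once the batch is full."""
        self.pending.append(entry)
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self):
        """Send whatever is pending as a single batch, if anything is pending."""
        if self.pending:
            batch, self.pending = self.pending, []
            self.replicate(batch)


if __name__ == "__main__":
    rounds = []
    batcher = Batcher(rounds.append, max_batch=4)
    for i in range(10):
        batcher.add(i)
    batcher.flush()
    print(len(rounds), "replication rounds for 10 entries")
```

With `max_batch=4`, ten entries cost three replication rounds instead of ten. Larger batches trade a little latency for fewer rounds, which is why the throughput and latency optimizations are tuned together.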
7. Deployment
The system runs on Docker and Kubernetes cloud platforms with one-click deployment. It supports both distributed and single-machine modes, and it can switch storage back-ends dynamically through a distributed configuration service.
8. Open-Source Plan
The project will be open-sourced in three phases: first to academic institutions, then to commercial partners, and finally to the public by the end of the following year.
9. Conclusion
The design shows how a payment accounting system can combine million-TPS processing, strong consistency, and high availability. The article closes by outlining future research and open-source contributions.
Architecture Digest