
OceanBase: Distributed Architecture, High‑Performance Storage Engine, Paxos‑Based 2PC, and Record‑Breaking TPC‑C Benchmarks

The article reviews OceanBase's distributed relational database design, its integrated architecture, high‑compression LSM‑tree storage engine, Paxos‑enhanced two‑phase commit protocol, and how these innovations enabled the system to set successive world records in the TPC‑C benchmark, illustrating China's growing database capabilities.

AntTech

In recent years, China's database ecosystem has expanded rapidly, introducing terms such as cloud‑native, sharding, and hybrid workloads. Amid this growth, the technical strength of individual databases remains a key question.

OceanBase's research paper "OceanBase: A 707 Million tpmC Distributed Relational Database System" was accepted by VLDB 2022, one of the three top international database conferences, showcasing cutting‑edge industrial research.

The paper details OceanBase's design goals, standards, infrastructure, and core components, and explains how a distributed cluster of over 1,500 servers across three regions achieved the highest global TPC‑C benchmark scores.

VLDB reviewers praised OceanBase as a large‑scale distributed relational database that delivers unprecedented OLTP performance and scalability.

OceanBase follows a shared‑nothing architecture and provides cross‑region fault tolerance. Requests flow from the application layer to OBProxy and then to OBServer nodes, where the SQL is executed; results return along the reverse path.

Each node hosts its own SQL, transaction, and storage engines, forming an equal‑peer cluster divided into zones; data is replicated three‑fold across zones, creating a logically unified database with high availability.

The system emphasizes RDBMS compatibility, reducing migration and learning costs for users accustomed to MySQL or Oracle.

Six key features are highlighted: high performance via read‑write separation and near‑memory speeds; low cost through PC servers and high compression; high availability via multi‑replica storage and "three‑region‑five‑center" deployment; strong consistency using Paxos; linear scalability with peer‑to‑peer nodes; and MySQL‑compatible front‑end protocols.
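The availability and consistency claims above rest on majority agreement: a Paxos‑style group can keep accepting writes as long as more than half of its replicas are reachable. The arithmetic can be sketched as follows. The function names are illustrative, and reading "three‑region‑five‑center" as a five‑replica group is an assumption, not something the article states.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can be lost while a majority is still reachable."""
    return replicas - majority(replicas)


# Three replicas (as in the zone layout above) and a hypothetical five-replica group.
for n in (3, 5):
    print(f"{n} replicas: majority={majority(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Under this rule, going from three to five replicas raises the number of simultaneous replica losses a group can absorb from one to two.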

Two core technologies enable these features: a distributed high‑compression storage engine based on an LSM‑tree (similar to Google Bigtable) and a distributed transaction engine that combines Paxos with a novel two‑phase commit (OceanBase 2PC) to provide automatic fault tolerance.

The storage engine employs asymmetric read/write blocks, daily incremental compaction, and column‑wise encoding to achieve near‑memory performance and significant storage savings.
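To make the LSM‑tree idea concrete, here is a minimal sketch of the general technique: writes land in an in‑memory table, full tables are flushed as immutable sorted runs, reads check the newest data first, and a background merge collapses runs into one. This is a generic illustration, not OceanBase's engine; the class `TinyLSM`, its methods, and the `memtable_limit` parameter are all hypothetical, and deletes, encoding, and compression are left out.

```python
import bisect


class TinyLSM:
    """Toy LSM-tree: one mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit: int = 2):
        self.memtable_limit = memtable_limit
        self.memtable = {}    # recent writes, mutable, in memory
        self.sstables = []    # list of (sorted_keys, values); newest run is last

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a new immutable sorted run."""
        if self.memtable:
            keys = sorted(self.memtable)
            self.sstables.append((keys, [self.memtable[k] for k in keys]))
            self.memtable = {}

    def get(self, key):
        """Look in the memtable first, then in runs from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for keys, values in reversed(self.sstables):
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None

    def compact(self):
        """Merge all runs into one; newer values overwrite older ones."""
        merged = {}
        for keys, values in self.sstables:  # oldest to newest
            merged.update(zip(keys, values))
        keys = sorted(merged)
        self.sstables = [(keys, [merged[k] for k in keys])] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # memtable is full, so it is flushed to a sorted run
db.put("a", 3)   # newer value for "a" lives in the memtable
print(db.get("a"), db.get("b"))
```

Because flushed runs are never modified in place, they can be encoded and compressed aggressively once written, which is the property the storage savings described above depend on.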

The transaction engine integrates Paxos into 2PC, allowing participants to recover from failures without blocking, achieving RPO = 0 and making OceanBase suitable for financial core systems.

Optimizations reduce coordinator state persistence and shorten transaction latency, cutting Paxos synchronizations from three to two.
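The idea of layering two‑phase commit on top of replicated participants can be sketched as follows. This is a toy model under strong assumptions: `ReplicatedLog` stands in for a Paxos group by applying only the majority rule, every name is hypothetical, and none of the coordinator‑state or latency optimizations mentioned above are modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicatedLog:
    """Stand-in for a Paxos group: a record is durable only if a majority is alive."""
    replicas: int = 3
    alive: int = 3
    records: list = field(default_factory=list)

    def append(self, record) -> bool:
        if self.alive >= self.replicas // 2 + 1:
            self.records.append(record)
            return True
        return False


@dataclass
class Participant:
    name: str
    log: ReplicatedLog = field(default_factory=ReplicatedLog)

    def prepare(self, txn_id) -> bool:
        # Vote yes only if the prepare record could be made durable.
        return self.log.append(("prepare", txn_id))

    def finish(self, txn_id, decision) -> bool:
        return self.log.append((decision, txn_id))


def two_phase_commit(txn_id, participants) -> str:
    # Phase 1: every participant must durably log its prepare record.
    votes = [p.prepare(txn_id) for p in participants]
    decision = "commit" if all(votes) else "abort"
    # Phase 2: broadcast the decision; each participant logs it if it can.
    for p in participants:
        p.finish(txn_id, decision)
    return decision


degraded = [Participant("p1"),
            Participant("p2", ReplicatedLog(replicas=3, alive=2))]
print(two_phase_commit("t1", degraded))
```

In this model a participant that has lost one of three replicas still votes and commits, which is the sense in which replication removes the single‑node failure that blocks classic 2PC; only the loss of a majority forces an abort.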

OceanBase has set two TPC‑C world records: 60.88 million tpmC in October 2019 and over 707 million tpmC in June 2020, demonstrating exceptional scalability and stability.

The benchmark involved a cluster of up to 1,554 servers, 2,360 Alibaba Cloud ECS instances, and 400 remote terminal emulators simulating over 559 million users, with careful handling of data generation, replication, and background compaction to meet strict performance variance limits.
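A rough sanity check relates the emulated user count to the reported throughput. The sketch below assumes two constants from the TPC‑C specification that the article does not state: 10 terminals (emulated users) per warehouse and a ceiling of about 12.86 tpmC per warehouse. The inputs are the article's "over 559 million" and "over 707 million" figures rounded down, so the result is only indicative.

```python
TERMINALS_PER_WAREHOUSE = 10     # assumption: taken from the TPC-C spec, not the article
MAX_TPMC_PER_WAREHOUSE = 12.86   # assumption: spec-imposed ceiling per warehouse


def tpmc_ceiling(users: int) -> float:
    """Upper bound on tpmC for a given number of emulated users."""
    warehouses = users / TERMINALS_PER_WAREHOUSE
    return warehouses * MAX_TPMC_PER_WAREHOUSE


users = 559_000_000      # "over 559 million users", rounded down
reported = 707_000_000   # "over 707 million tpmC", rounded down

ceiling = tpmc_ceiling(users)
print(f"ceiling: {ceiling / 1e6:.0f}M tpmC, "
      f"reported/ceiling: {reported / ceiling:.2f}")
```

Under these assumptions the reported score sits close to the ceiling implied by the database size, which suggests the run was sized so that throughput was limited by the workload definition rather than by spare capacity.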

Performance results show linear growth of tpmC with node count, low response times across all TPC‑C transaction types (New‑Order, Payment, Order‑Status, Delivery, Stock‑Level), and minimal latency variance.

Looking forward, OceanBase aims to become a globally competitive, open‑source database comparable to MySQL and PostgreSQL, continuing to innovate in stability, partitioning, and distributed system best practices.
