
High Availability and Disaster Recovery Strategies for OceanBase Distributed Database

This article reviews traditional database high‑availability techniques, explains the advantages of distributed multi‑replica consistency (Paxos/Raft) used by OceanBase, and compares various deployment topologies—from single‑site three‑replica to multi‑city five‑replica designs—highlighting their trade‑offs and best‑practice recommendations.

AntTech

OceanBase, Ant Financial's financial‑grade distributed relational database, achieves high availability (HA) and disaster recovery (DR) through multi‑replica consistency protocols such as Paxos or Raft, providing RPO=0, low RTO, and automatic failover.

The article first revisits conventional HA solutions used by traditional databases (Oracle Data Guard, DB2 HADR, master‑slave replication, hardware‑level HA, storage replication, CDC/GoldenGate), noting their reliance on primary‑secondary models, limited fault‑tolerance, and challenges such as non‑zero RPO and long RTO.

It then introduces distributed multi‑replica consistency, which replicates data to a majority of nodes, enabling automatic leader election, zero data loss, and rapid recovery, while also supporting zone‑level and city‑level disaster protection.
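The majority rule behind this can be sketched in a few lines. This is a minimal illustration of quorum arithmetic only, not OceanBase's implementation; the function names are hypothetical.

```python
def majority(n: int) -> int:
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Replicas that can be lost while a majority still remains."""
    return n - majority(n)


def is_committed(acks: int, n: int) -> bool:
    """Assumed rule: a log entry is durable once a majority has persisted it."""
    return acks >= majority(n)


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} replicas: majority={majority(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Because a write is acknowledged only after a majority holds it, any surviving majority contains at least one replica with every committed entry, which is why losing a minority of replicas does not lose data (RPO=0).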

Several practical deployment patterns are described:

Single‑data‑center 3‑replica (zone‑level HA only, no DR).

Same‑city 3‑data‑center 3‑replica (adds data‑center‑level DR).

Three‑city 5‑replica (provides both data‑center‑ and city‑level DR, highest HA).

Same‑city 2‑data‑center 3‑replica (partial HA, limited DR).

Two‑city 3‑data‑center 5‑replica (HA with data‑center DR, but no city‑level DR).

Cluster‑to‑cluster data replication (fallback when infrastructure cannot meet multi‑replica requirements, but incurs RPO>0 and higher RTO).

Each topology is evaluated for hardware requirements, network latency constraints, cost, and suitability for different business SLAs. The article recommends the simplest topology that meets the required HA/DR level, suggesting log‑replica technology to reduce cost when adding a third data center.
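The trade-offs among the multi-replica topologies above can be checked with a small model. The sketch below assumes one plausible replica placement per topology (the summary does not give exact placements, so the city, data-center, and zone labels are illustrative) and asks whether a majority survives the loss of any single zone, data center, or city.

```python
from collections import Counter

# Each replica is (city, data_center, zone). Placements are assumptions
# chosen for illustration, not taken from the article.
TOPOLOGIES = {
    "single-DC 3-replica": [
        ("A", "dc1", "z1"), ("A", "dc1", "z2"), ("A", "dc1", "z3")],
    "same-city 3-DC 3-replica": [
        ("A", "dc1", "z1"), ("A", "dc2", "z2"), ("A", "dc3", "z3")],
    "three-city 5-replica": [
        ("A", "dc1", "z1"), ("A", "dc2", "z2"),
        ("B", "dc3", "z3"), ("B", "dc4", "z4"), ("C", "dc5", "z5")],
    "same-city 2-DC 3-replica": [
        ("A", "dc1", "z1"), ("A", "dc1", "z2"), ("A", "dc2", "z3")],
    "two-city 3-DC 5-replica": [
        ("A", "dc1", "z1"), ("A", "dc1", "z2"),
        ("A", "dc2", "z3"), ("A", "dc2", "z4"), ("B", "dc3", "z5")],
}

LEVEL_DEPTH = {"city": 1, "dc": 2, "zone": 3}


def survives(replicas, level):
    """True if a majority remains after losing ANY one domain at `level`."""
    depth = LEVEL_DEPTH[level]
    # Group replicas by failure domain, e.g. ("A", "dc1") for level "dc".
    per_domain = Counter(r[:depth] for r in replicas)
    worst_loss = max(per_domain.values())
    return len(replicas) - worst_loss >= len(replicas) // 2 + 1


if __name__ == "__main__":
    for name, replicas in TOPOLOGIES.items():
        result = {lvl: survives(replicas, lvl) for lvl in ("zone", "dc", "city")}
        print(name, result)
```

Under these assumed placements the model agrees with the list: only the three-city layout survives the loss of a whole city, and the same-city 2-data-center layout fails the worst-case data-center check because one data center holds two of the three replicas.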

Finally, the article emphasizes continuous evolution of technology and encourages community engagement for further improvements in OceanBase HA/DR solutions.

Tags: High Availability, Distributed Database, Disaster Recovery, Replication, Paxos, OceanBase