
Understanding Alipay’s LDC Architecture, Unitization, and CAP Analysis

The article explains how Alipay achieves massive payment throughput during Double‑11 by using logical data centers (LDC), unit‑based system design, and multi‑active disaster recovery. It then analyzes the design through the CAP theorem, highlighting the role of OceanBase and Paxos in ensuring consistency and availability.

Top Architect

Background

Since the 2008 Double‑11 event, Ant Financial has continuously pushed its technical limits, raising payment throughput from 20,000 transactions per minute in 2010 to 256,000 transactions per second in 2017. That growth relies on logical data centers (LDC) and a unit‑based architecture.

LDC and Unitization

LDC (Logical Data Center) treats physically distributed sites as one logically unified data center, with the emphasis on coordination across sites (availability, partition tolerance) and on consistency.

Unitization splits large internet systems into independent units, each serving a specific user segment, allowing linear scaling of TPS by replicating units.

Key point: Sharding solves database bottlenecks caused by I/O limits; unitization is a deployment method that also enhances disaster recovery.

System Architecture Evolution

The early monolithic architecture evolved into distributed micro‑services, then into master‑slave database clusters, and finally into sharding and unitization to overcome write bottlenecks.

Scaling application servers horizontally solved the compute limits, but the database then became the new bottleneck. That led to sharding (horizontal table partitioning) and vertical partitioning (separating data by service).
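To make the sharding step concrete, here is a minimal sketch of horizontal table partitioning. It assumes the shard is chosen from the last two digits of a numeric user ID and that there are 100 shards. The names `shard_for_user` and `table_name` and the shard count are hypothetical, chosen for illustration, and do not describe Alipay's actual scheme.

```python
NUM_SHARDS = 100  # assumed shard count, for illustration only


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Return the shard index derived from the last two digits of the user ID."""
    suffix = user_id[-2:]
    if len(suffix) < 2 or not suffix.isdecimal():
        raise ValueError(f"user ID must end in two digits: {user_id!r}")
    return int(suffix) % num_shards


def table_name(base: str, user_id: str) -> str:
    """Build a physical table name such as 'orders_07' for the user's shard."""
    return f"{base}_{shard_for_user(user_id):02d}"
```

Because every row for a given user lands in the same shard, writes for different users spread across independent tables instead of contending for a single one.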

Traffic Redirection and Disaster Recovery

After unitization, units are deployed as multi‑active across regions (e.g., Shanghai and Hangzhou), which supports both cross‑region multi‑active operation and disaster recovery. Traffic is routed by a custom reverse proxy (Spanner) and a global load balancer (GSLB) according to a mapping keyed on user ID, such as the following:

RZ0* --> a
RZ1* --> b
RZ2* --> c
RZ3* --> d
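The lookup behind such a mapping can be sketched as follows. The zone‑to‑partition table is copied from the mapping above. The rule that turns a user ID into a zone group (last two digits, split into four ranges of 25) is an assumption made for this example, and the function names are hypothetical.

```python
from typing import Tuple

# Zone-group prefix -> data partition, copied from the mapping above.
ZONE_TO_PARTITION = {"RZ0": "a", "RZ1": "b", "RZ2": "c", "RZ3": "d"}


def zone_group_for_user(user_id: str) -> str:
    """Map the last two digits of the user ID (00-99) onto four zone groups.

    The 25-wide ranges are an assumption for illustration.
    """
    segment = int(user_id[-2:])
    return f"RZ{segment // 25}"


def route(user_id: str) -> Tuple[str, str]:
    """Return (zone group, data partition) for a request from this user."""
    zone = zone_group_for_user(user_id)
    return zone, ZONE_TO_PARTITION[zone]
```

Keeping the user‑to‑zone rule separate from the zone‑to‑partition table means either mapping can be changed on its own, which is what a staged disaster switch needs.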

During a disaster, the database mapping is switched first and the traffic mapping second. If traffic were switched first, requests would reach the target unit before it could access the data, and a large volume of failed requests would hit that unit.
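The effect of the ordering can be shown with a toy model. This sketch assumes two plain dictionaries, one recording which zone receives a partition's traffic and one recording which zone has database access to it. The function names and the data layout are hypothetical and greatly simplified.

```python
from typing import Dict, List, Tuple

Snapshot = Tuple[str, str]  # (zone receiving traffic, zone with database access)


def fail_over(traffic: Dict[str, str], db: Dict[str, str], partition: str,
              target: str, db_first: bool = True) -> List[Snapshot]:
    """Switch one partition to `target` in two steps; record the state after each."""
    steps = ["db", "traffic"] if db_first else ["traffic", "db"]
    snapshots = []
    for step in steps:
        if step == "db":
            db[partition] = target
        else:
            traffic[partition] = target
        snapshots.append((traffic[partition], db[partition]))
    return snapshots


def target_sees_failures(snapshots: List[Snapshot], target: str) -> bool:
    """True if traffic ever reaches `target` before it has database access."""
    return any(serving == target and owner != target
               for serving, owner in snapshots)
```

With `db_first=True` the intermediate state still sends traffic to the old zone, so the target unit never receives requests it cannot serve. With `db_first=False` the target receives traffic while the data is still mapped to the old zone.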

CAP Theory Review

The CAP theorem states that a distributed system can satisfy at most two of Consistency, Availability, and Partition tolerance.

Consistency: all nodes see the same data at the same time.

Availability: every request to a working node receives a response, without an error or an unbounded wait.

Partition tolerance: the system continues operating despite network partitions.

CAP Analysis of Different Architectures

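Each stage of the evolution lands at a different point in the CAP trade‑off. A single‑instance database is labeled CP: its single copy of the data is always consistent, but it is a single point of failure, so when the node is down or unreachable the whole system is unavailable. Master‑slave clusters are labeled AC (more commonly written CA): they provide availability and consistency but not partition tolerance. Sharding with unitization is AP: it offers high availability and partition tolerance and relies on eventual consistency.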

OceanBase CAP Analysis

OceanBase (OB) replicates data with the Paxos consensus protocol: a write commits only after a majority of replicas, ⌊N/2⌋+1 out of N, acknowledge it. This gives partition tolerance and high availability, while replicas outside the majority catch up later, so OB is effectively an AP system with eventual consistency (AP+C).
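The majority rule is simple arithmetic. The helper names below are hypothetical and only illustrate the (N/2)+1 formula with integer division. They are not part of any OceanBase API.

```python
def quorum_size(n: int) -> int:
    """Smallest number of replicas that forms a majority of n."""
    if n < 1:
        raise ValueError("replica count must be at least 1")
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many replicas can be lost while a majority is still reachable."""
    return n - quorum_size(n)
```

For example, `quorum_size(5)` is 3 and `tolerated_failures(5)` is 2, while an even count such as 4 needs 3 acknowledgements and tolerates only 1 failure, which is why odd replica counts are the usual choice.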

Because a value must be accepted by a majority of replicas, Paxos lets at most one side of a network partition commit writes, which prevents split‑brain scenarios.
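A small check makes the split‑brain argument explicit. This sketch models a partition as disjoint groups of replica names and asks which groups could still commit. It illustrates only the majority rule, not the Paxos protocol itself, and the function name is hypothetical.

```python
from typing import List, Set


def writable_groups(groups: List[Set[str]]) -> List[Set[str]]:
    """Return the partition groups that still hold a majority of all replicas."""
    total = sum(len(group) for group in groups)
    majority = total // 2 + 1
    return [group for group in groups if len(group) >= majority]
```

Two disjoint groups cannot both hold more than half of the replicas, so the result contains at most one group. When no group reaches a majority, for instance five replicas split 2/2/1, the result is empty and no writes commit until the partition heals.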

Conclusion

The key to Alipay’s massive TPS is:

RZone (Region Zone) design based on user sharding.

Paxos‑protected OB writes, which prevent split‑brain.

CZone (City Zone) local reads for data that tolerates a delay between a write and later reads.

GZone (Global Zone) for truly global shared data.

Combined with operational practices like traffic shaping and multi‑region deployment, these designs enable ultra‑high‑throughput, highly available payment processing.

Tags: distributed systems, CAP theorem, disaster recovery, unitization, OceanBase, high TPS, LDC