
ShopeePay Multi-Active Data Center Architecture and Unitization Strategy

ShopeePay replaces its traditional active‑standby design with a unit‑based multi‑active data‑center architecture—splitting user, public, and order data across independent units, employing routing tables, caches, and a dual‑active model that enables seamless failover, reduced latency, and continuous payment processing across two or more IDC sites.

Shopee Tech Team
Table of Contents
1. Background
2. ShopeePay Original Architecture
3. Unit-ization Patterns for Different IDC Topologies
4. The "Impossible Law" of Geo-Distributed Multi-Active Based on Data Replication
5. ShopeePay Data Model Analysis
6. Implemented Dual-Active Model
7. Impact on Business Scenarios
8. Other Complex Issues

1. Background

ShopeePay is SeaMoney's digital wallet service that provides online and offline merchants with payment and settlement capabilities, and offers consumers payment and transfer services. As a financial‑grade service, ShopeePay requires higher business continuity and data safety than ordinary internet services, and data‑center availability is the most critical factor.

Traditional Active‑Standby data‑center designs cause service interruption during disaster failover and cannot meet ShopeePay's future growth requirements; multi‑active data‑centers are therefore mandatory.

Achieving data‑center multi‑active is one of the hardest architectural challenges. Payment is a classic iceberg model: a simple front‑end payment action hides a complex back‑end workflow that includes acquiring, payment, settlement, marketing, fraud‑control, account balance operations, and reconciliation.

Multi‑active not only requires splitting core data across multiple data‑centers, but also ensuring that critical payment scenarios can run across them. Both infrastructure components and business logic must be adapted, involving extensive cross‑team collaboration.

2. ShopeePay Original Architecture

Gateway: traffic entry point, forwards requests to downstream micro‑services.

Business Logic: micro‑service layer for business processing.

DAL: data‑access layer, exposes MySQL native protocol upstream and forwards SQL statements downstream.

In the diagram, the data layer uses an Active‑Standby write model. Each DB cluster is a primary‑with‑multiple‑replicas setup, and the two clusters replicate via MySQL native binlog. Switching requires manual DB master‑standby promotion and reserved spare resources.

The prerequisite “IDC0 is down and cannot be recovered quickly” often takes at least 20 minutes to confirm, plus several minutes for the actual DB switch. If replication is asynchronous, additional time is needed to lock abnormal accounts via remote transaction logs. Overall, a failover can take tens of minutes to hours, making it impractical for most non‑catastrophic events.

Therefore, the long‑term evolution direction is to “unit‑ize” – split data into independent units so that a failure of any single unit only impacts a local subset of traffic, reducing both risk and cost.

Unit‑ization is the fundamental solution for data‑center multi‑active.

3. Unit‑ization Patterns for Different IDC Topologies

3.1 Two‑IDC Model

Each IDC hosts the other's slave DB cluster. In a failure, the backend DB switches and the frontend redirects traffic. Resource utilization is about 50 % (half reserved for the peer).

3.2 Three‑IDC Model

Each IDC hosts a slave cluster of a neighboring IDC. Failure handling is similar, but only ~30 % resource redundancy is needed.

3.3 Four‑IDC Model

1) Ring

2) Pairwise Mirror

3) One‑to‑Two

Advantages: tolerates simultaneous failure of two IDCs.

Disadvantages: cross‑IDC synchronization becomes more complex; asynchronous replication requires managing two IDC distances and freeze operations; semi‑synchronous needs majority‑based sync strategies.

4. The “Impossible Law” of Geo‑Distributed Multi‑Active Based on Data Replication

Payment scenarios inevitably involve cross‑database transactions (e.g., transfers). Excessive DB latency degrades user experience.

Data-center disaster recovery faces the same latency limits: whether replication is primary-standby or majority-based, long-distance fiber latency is a physical constraint. Defining "geo-distributed" as more than 1000 km, the theoretical light-speed RTT is about 6.6 ms, and real fiber routes (longer paths, slower propagation in glass) can push observed RTT to around 40 ms, an order of magnitude higher than typical intra-city RTT.
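As a rough sanity check of the figures above, the theoretical RTT can be computed assuming propagation at the vacuum speed of light; real fiber carries light at roughly two thirds of that speed and rarely follows a straight line, which is why observed RTTs are several times higher:

```python
def theoretical_rtt_ms(distance_km: float, speed_km_per_s: float = 300_000.0) -> float:
    # Round-trip time in milliseconds: out and back over the given distance.
    # Default speed is the vacuum speed of light (~300,000 km/s); pass
    # ~200,000 km/s to approximate propagation in optical fiber.
    return 2 * distance_km / speed_km_per_s * 1000

print(round(theoretical_rtt_ms(1000), 1))                  # 6.7 (light-speed bound)
print(round(theoretical_rtt_ms(1000, 200_000.0), 1))       # 10.0 (fiber, straight line)
```

Routing detours and switching equipment add on top of the fiber figure, so tens of milliseconds for a >1000 km round trip is plausible.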

Consequently, ShopeePay currently focuses on intra‑city (same‑city) dual‑active solutions.

5. ShopeePay Data Model Analysis

Data can be classified into the following categories, each with a fixed splitting rule:

| Data Category | Description | Handling |
| --- | --- | --- |
| User dimension | User account balances | Unit-split by user ID |
| Public information | Merchant basic info (read-heavy, write-light) | 1-write-N-read pattern |
| Order dimension | May be sharded by merchant order number or platform order number | Unit-split by order ID (unit ID encoded in the order ID) |
| Other undistinguishable dimensions | | Same as public information, no unit-ization |

5.1 User‑Dimension Data

With users split evenly across two units, the failure of one unit affects only the users it hosts (roughly half), and no manual intervention is required for the rest. However, migrating half of the users from one unit to another is risky and complex.

ShopeePay adds a table‑level routing layer on top of the user‑dimension routing table, enabling gradual, gray‑scale data migration and reducing risk.

Two routing approaches were considered:

1) Single-user primary-key routing: simple, but new users default to unit0, requiring periodic re-balancing.

2) Hash/modulo-based routing: complex and hard to maintain during migrations.

The final model uses a straightforward user-ID routing table (illustrated below).
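The chosen approach, a user-ID routing table with a table-level override layer for gray-scale migration, can be sketched as follows. The names (`RouteTable`, `DEFAULT_UNIT`) and the in-memory dicts are illustrative, not ShopeePay's actual implementation:

```python
DEFAULT_UNIT = "unit0"  # new users land here until re-balanced

class RouteTable:
    def __init__(self):
        self.user_routes = {}      # user_id -> unit: the base routing table
        self.migrating = {}        # user_id -> unit: gray-scale overrides

    def lookup(self, user_id: int) -> str:
        # Migration overrides win, so users can be moved in small batches.
        if user_id in self.migrating:
            return self.migrating[user_id]
        return self.user_routes.get(user_id, DEFAULT_UNIT)

    def start_migration(self, user_id: int, target_unit: str) -> None:
        self.migrating[user_id] = target_unit

    def commit_migration(self, user_id: int) -> None:
        # Once the copied data is verified, fold the override into the base table.
        self.user_routes[user_id] = self.migrating.pop(user_id)

routes = RouteTable()
routes.user_routes[42] = "unit0"
routes.start_migration(42, "unit1")
print(routes.lookup(42))  # unit1
```

Because each user is moved individually, a failed migration batch can be rolled back by simply dropping its overrides.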

5.2 Public‑Information Data

Writes occur only in the primary unit (unit0) and are asynchronously replicated to all other units for local reads (e.g., merchant categories, settlement banks, fee rates).

5.3 Order‑Dimension Data

Any unit can accept an order, but the unit ID must be encoded into the order ID to identify ownership.

To guarantee high availability, a “jump‑order” logic is introduced: if unit0 fails to place an order, a new order ID is generated and the request is retried on unit1, and vice‑versa.

Only one surviving unit is needed to complete the order; subsequent payment steps are independent.
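The jump-order flow above can be sketched minimally, assuming a two-unit deployment and an order-ID format whose first digit encodes the owning unit (both assumptions are for illustration only):

```python
import random
import string

UNITS = ["unit0", "unit1"]

def new_order_id(unit: str) -> str:
    # Encode the owning unit into the order ID so any service can tell
    # which unit owns the order without a lookup. Format is illustrative.
    suffix = "".join(random.choices(string.digits, k=10))
    return f"{UNITS.index(unit)}{suffix}"

def owning_unit(order_id: str) -> str:
    return UNITS[int(order_id[0])]

def place_order(preferred: str, create_in_unit) -> str:
    # "Jump-order": if the preferred unit fails, generate a *new* order ID
    # for the peer unit and retry there. create_in_unit is an assumed
    # callable(unit, order_id) that raises RuntimeError on failure.
    for unit in [preferred] + [u for u in UNITS if u != preferred]:
        order_id = new_order_id(unit)
        try:
            create_in_unit(unit, order_id)
            return order_id
        except RuntimeError:
            continue
    raise RuntimeError("no surviving unit could place the order")
```

If unit0 is down, `place_order("unit0", ...)` transparently lands the order in unit1 under a fresh unit1-owned order ID.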

Duplicate Orders After Unit‑ization

Because merchants may reuse the same merchant order number across units, and the platform lacks a global idempotent mechanism, duplicate orders can occur during failures, retries, or traffic shifts.

Two mitigation strategies are discussed:

1) Store a hash of (merchant ID + merchant order ID) in the wallet service to achieve idempotency for fund-related orders.

2) Deploy a Global Cache component to share order-mapping data between units, supplemented by reconciliation mechanisms to handle rare cache inconsistencies.
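The first strategy might look like this in outline; the key format and the in-memory `seen` set are placeholders for the wallet service's real persistent storage:

```python
import hashlib

def idempotency_key(merchant_id: str, merchant_order_id: str) -> str:
    # Hash of (merchant ID + merchant order ID); the wallet service can
    # reject a second fund operation that carries the same key, even if
    # the retry arrived via a different unit.
    raw = f"{merchant_id}:{merchant_order_id}".encode()
    return hashlib.sha256(raw).hexdigest()

seen = set()  # stand-in for a durable store keyed by idempotency key

def debit_once(merchant_id: str, merchant_order_id: str, amount: int) -> str:
    key = idempotency_key(merchant_id, merchant_order_id)
    if key in seen:
        return "duplicate"  # same merchant order retried on another unit
    seen.add(key)
    return "debited"

print(debit_once("M1", "order-001", 100))  # debited
print(debit_once("M1", "order-001", 100))  # duplicate
```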

6. Implemented Dual‑Active Model

Unit Route DB

Stores the mapping from user ID to unit address. A Route Cache layer sits on top of the DB to protect it; DAL and Gateway also maintain local caches.
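The layered lookup can be sketched roughly as follows, with the shared Route Cache and the Unit Route DB mocked as dicts; a real deployment would use a cache service and a database with TTLs and invalidation:

```python
class RouteLookup:
    # Local in-process cache in front of the shared Route Cache, which
    # in turn protects the Unit Route DB from direct read load.
    def __init__(self, route_cache: dict, route_db: dict):
        self.local = {}              # per-process cache (DAL / Gateway)
        self.route_cache = route_cache
        self.route_db = route_db

    def unit_for(self, user_id: int) -> str:
        if user_id in self.local:
            return self.local[user_id]
        unit = self.route_cache.get(user_id)
        if unit is None:
            unit = self.route_db[user_id]     # miss falls through to the DB
            self.route_cache[user_id] = unit  # backfill the shared cache
        self.local[user_id] = unit            # backfill the local cache
        return unit
```

Most lookups terminate in the local cache, so the Route DB only sees traffic on cold starts and after invalidation.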

Gateway

Provides external APIs for unit-address lookup and redirects mis-routed traffic; as a final safety net, the DAL can correct the routing itself.

Migration Tool

Handles safe migration of 50 % of users to unit1, including data locking, route DB updates, cache refresh, and rollback strategies.

DAL (Data Access Layer)

Parses SQL to extract user ID, queries the Route Cache for the target unit, and forwards the request to the appropriate DB, abstracting unit distribution from business modules.
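In outline, that routing step might look like the sketch below. A real DAL uses a full SQL parser; the regex here is a deliberate simplification, and the even/odd routing function is purely illustrative:

```python
import re

def route_sql(sql: str, unit_for_user) -> str:
    # Extract the user ID from the WHERE clause and ask the routing layer
    # (unit_for_user) for the target unit. Statements with no user
    # dimension fall back to the default unit.
    m = re.search(r"\buser_id\s*=\s*(\d+)", sql, re.IGNORECASE)
    if not m:
        return "unit0"
    return unit_for_user(int(m.group(1)))

unit = route_sql("SELECT balance FROM wallet WHERE user_id = 7",
                 lambda uid: "unit1" if uid % 2 else "unit0")
print(unit)  # unit1
```

Because the DAL speaks the MySQL protocol on both sides, business modules issue ordinary SQL and never see the unit topology.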

Global Cache

Shares data between units, primarily for:

1) Storing login tokens usable across units.

2) Caching merchant-order-to-payment-order mappings for global idempotency (with fallback reconciliation).

External Merchant Usage

Merchants can send orders to any available unit, but directing orders to the unit where the user resides improves performance and resilience.

Remote Transaction Log

Because MySQL replication remains asynchronous, a remote transaction log is deployed in each peer IDC to lock potentially abnormal accounts before a switch, ensuring data safety without relying on full replication catch‑up.
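The pre-switch locking step might be sketched like this, assuming (for illustration) that the remote log records each touched account with a monotonically increasing sequence number, and that the last replicated position is known:

```python
def accounts_to_lock(remote_log, replicated_upto: int):
    # Any account whose writes carry a sequence number beyond the last
    # replicated position may be inconsistent on the surviving side, so
    # it is locked before the switch instead of waiting for full catch-up.
    return sorted({entry["account"] for entry in remote_log
                   if entry["seq"] > replicated_upto})

log = [
    {"seq": 1, "account": "A"},
    {"seq": 2, "account": "B"},
    {"seq": 3, "account": "A"},
]
print(accounts_to_lock(log, replicated_upto=1))  # ['A', 'B']
```

Only the small set of accounts in the unreplicated window is frozen; all other accounts keep transacting during the failover.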

7. Impact on Business Scenarios

Not all scenarios can be fully supported in a multi-active setup; core payment scenarios such as Payment, Scan & Pay, and Top-up are prioritized.

Other services (e.g., settlement, marketing, user order list queries) can tolerate temporary unavailability and be restored after failover.

Anti‑Fraud: Must support automatic degradation during IDC failures.

Data: Switch from single‑point subscription to multi‑point subscription with data merging.

User Order List: Requires data merging across units or a separate view built independently of raw order tables.

8. Other Complex Issues

The DAL component is the most intricate, needing to resolve:

1) Physical vs. logical table mapping across multiple MySQL instances.

2) Different access strategies per table (user-unit routing, order-unit routing, round-robin, default unit0).

3) Service-port to master/slave mapping to ensure reads of public data hit local slaves.

4) Support for MySQL transactions within a unit while eliminating cross-unit transactions (adopting TCC).

5) Dynamic addition of business tables to support ongoing feature growth.
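The TCC (Try-Confirm-Cancel) pattern mentioned above for replacing cross-unit transactions can be illustrated with a minimal in-memory transfer; a real implementation persists the reservations, assigns transaction IDs, and handles timeouts and retries:

```python
class Account:
    def __init__(self, balance: int):
        self.balance = balance
        self.frozen = 0  # funds reserved by in-flight Try phases

    def try_debit(self, amount: int) -> None:
        # Try: reserve funds without moving them yet.
        if self.balance - self.frozen < amount:
            raise RuntimeError("insufficient funds")
        self.frozen += amount

    def confirm_debit(self, amount: int) -> None:
        # Confirm: turn the reservation into an actual debit.
        self.frozen -= amount
        self.balance -= amount

    def cancel_debit(self, amount: int) -> None:
        # Cancel: release the reservation, leaving the balance untouched.
        self.frozen -= amount

def transfer(src: Account, dst: Account, amount: int) -> None:
    src.try_debit(amount)              # phase 1 on the source unit
    try:
        dst.balance += amount          # apply the credit on the destination unit
        src.confirm_debit(amount)      # phase 2: confirm the debit
    except Exception:
        src.cancel_debit(amount)       # roll back the reservation
        raise
```

Each phase touches only one unit's database, so no MySQL transaction ever spans units; atomicity comes from the Confirm/Cancel protocol instead.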

Additional challenges include external URL callbacks, cache warm‑up across units, duplicate‑order fallback mechanisms, and cross‑unit data validation.

ShopeePay's multi‑active journey is underway; all components have been developed and migration work is progressing.

Tags: backend engineering, database replication, data center, unitization, multi-active architecture, ShopeePay
Written by Shopee Tech Team

How to innovate and solve technical challenges in diverse, complex overseas scenarios? The Shopee Tech Team will explore cutting‑edge technology concepts and applications with you.
