
Design and Implementation of Ctrip's Data Replicate Center (DRC) for MySQL Multi‑Active Replication

The article describes Ctrip's Data Replicate Center (DRC), a MySQL middleware that enables real‑time bidirectional replication across data‑center clusters, detailing its architecture, low‑latency optimizations, consistency mechanisms, DDL handling, monitoring, and future high‑availability improvements.

Ctrip Technology

Ctrip operates MySQL clusters across multiple data centers, with a primary‑replica setup in one site and a standby replica in another for disaster recovery. To achieve true multi‑active deployment, with cross‑region reads and writes and no manual disaster‑recovery switching, Ctrip introduced DRC (Data Replicate Center), a real‑time bidirectional replication component.

DRC is a database middleware developed by Ctrip's framework team to support bidirectional or multi‑directional data replication, providing global deployment capabilities under the G2 strategy. Its architecture consists of four main modules: Replicator Container (manages Replicator instances that pull binlogs and store them locally), Applier Container (applies stored binlogs to target MySQL instances), Cluster Manager (handles high‑availability failover and instance coordination), and Console (UI, external APIs, and monitoring).

The detailed design enforces several DB access standards: MySQL version ≥5.7.22, master‑side Writeset enabled for parallel replication, GTID enabled for precise binlog positioning, each table must contain a millisecond‑precision timestamp column, and every table must have a primary or unique key. These requirements ensure low replication latency and strong data consistency.
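The stated standards map onto well-known MySQL 5.7.22+ settings and table conventions. A sketch of what a compliant configuration and table might look like (the exact variable values and the column name `datachange_lasttime` are illustrative, not taken from the article):

```sql
-- Enable GTID for precise binlog positioning (real MySQL variables).
SET GLOBAL gtid_mode = ON;
SET GLOBAL enforce_gtid_consistency = ON;

-- Writeset-based dependency tracking on the master enables
-- parallel replication on the applier side.
SET GLOBAL transaction_write_set_extraction = XXHASH64;
SET GLOBAL binlog_transaction_dependency_tracking = WRITESET;

-- Example table satisfying the standards: a primary key plus a
-- millisecond-precision change timestamp (column name is illustrative).
CREATE TABLE booking (
  id BIGINT NOT NULL AUTO_INCREMENT,
  status TINYINT NOT NULL,
  datachange_lasttime TIMESTAMP(3) NOT NULL
    DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
  PRIMARY KEY (id)
) ENGINE=InnoDB;
```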

DRC reduces latency through three layers of optimization: the network layer uses asynchronous I/O via the open‑source X‑Pipe component; the system layer leverages zero‑copy, page‑cache writes, and off‑heap memory to filter and persist only necessary events; the application layer adopts a water‑level‑based parallel algorithm for applying SQL statements. Additional mechanisms such as idle‑connection detection and dynamic traffic control further improve resilience.
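The application-layer idea above can be sketched as a conflict-aware scheduler: a transaction may be applied in parallel only when its row keys are disjoint from every in-flight transaction's keys. This is a minimal illustration of the principle, not DRC's actual water-level algorithm:

```python
# Minimal sketch (not Ctrip's code) of conflict-aware parallel apply:
# transactions whose row keys don't overlap any in-flight transaction
# can run concurrently; conflicting ones wait, preserving commit order.
from dataclasses import dataclass


@dataclass
class Transaction:
    seq: int          # commit order in the binlog
    keys: frozenset   # primary-key values the transaction touches


class ParallelApplier:
    def __init__(self):
        self.in_flight = {}  # seq -> keys currently being applied

    def can_dispatch(self, txn):
        # Safe to run in parallel only if it touches no row that an
        # earlier, still-running transaction touches.
        return all(txn.keys.isdisjoint(k) for k in self.in_flight.values())

    def dispatch(self, txn):
        self.in_flight[txn.seq] = txn.keys

    def complete(self, seq):
        self.in_flight.pop(seq)


applier = ParallelApplier()
t1 = Transaction(1, frozenset({"order:1"}))
t2 = Transaction(2, frozenset({"order:2"}))  # disjoint keys: parallel OK
t3 = Transaction(3, frozenset({"order:1"}))  # conflicts with t1: must wait
applier.dispatch(t1)
assert applier.can_dispatch(t2) is True
assert applier.can_dispatch(t3) is False
applier.complete(1)
assert applier.can_dispatch(t3) is True
```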

Data consistency is guaranteed by preserving event ordering, implementing an “at‑least‑once” delivery model, preventing loop replication through GTID‑based filtering, and relying on MySQL's idempotent execution of already applied GTIDs. Conflict resolution is handled by timestamp‑based last‑write‑wins policies and optional manual review of conflicting SQL.
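The loop-prevention and conflict-resolution rules can be illustrated in a few lines. The GTID carries the UUID of the server that originally committed the transaction, so an applier can drop any event that originated from its own cluster; conflicts fall back to the millisecond timestamp. The UUIDs and field names below are illustrative:

```python
# Sketch (assumed logic, not DRC's code) of GTID-based loop filtering
# and timestamp-based last-write-wins conflict resolution.

# Server UUIDs of the local MySQL cluster (illustrative values).
LOCAL_UUIDS = {"6b7c0000-0000-0000-0000-000000000001"}


def should_apply(gtid: str) -> bool:
    # GTID format is "<source_uuid>:<txn_id>". A transaction that was
    # originally committed locally must not be re-applied when it
    # travels around the replication loop and comes back.
    source_uuid = gtid.rsplit(":", 1)[0]
    return source_uuid not in LOCAL_UUIDS


def resolve_conflict(local_row: dict, incoming_row: dict) -> dict:
    # Last write wins on the millisecond-precision change timestamp
    # (column name is illustrative).
    if incoming_row["datachange_lasttime"] >= local_row["datachange_lasttime"]:
        return incoming_row
    return local_row


assert should_apply("9f8e0000-0000-0000-0000-000000000002:42") is True
assert should_apply("6b7c0000-0000-0000-0000-000000000001:42") is False
```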

For DDL support, DRC stores table‑structure snapshots and DDL events directly in custom binlog events, eliminating the need for an external metadata store. An embedded lightweight database reconstructs historical schemas on the fly, enabling correct parsing of binlog events after structural changes.
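The schema-history idea can be sketched as a position-indexed version list: each DDL event appends a new schema version, and a row event is parsed with the latest version at or before its binlog position. This is an illustration of the technique, not DRC's storage format:

```python
# Sketch (illustrative, not DRC's actual format): keep table-structure
# versions keyed by binlog position and replay DDL events, so a row
# event can be parsed with the schema in effect when it was written.
class SchemaHistory:
    def __init__(self, initial_columns, snapshot_pos=0):
        # (position, columns) pairs, ordered by binlog position.
        self.versions = [(snapshot_pos, list(initial_columns))]

    def apply_ddl(self, pos, add=None, drop=None):
        columns = [c for c in self.versions[-1][1] if c != drop]
        if add:
            columns.append(add)
        self.versions.append((pos, columns))

    def columns_at(self, pos):
        # Latest schema version at or before the event's position, so a
        # row event recorded *before* a DDL is still parsed correctly.
        for p, cols in reversed(self.versions):
            if p <= pos:
                return cols
        raise KeyError("no schema snapshot covers this position")


history = SchemaHistory(["id", "status"], snapshot_pos=100)
history.apply_ddl(500, add="remark")
assert history.columns_at(300) == ["id", "status"]
assert history.columns_at(600) == ["id", "status", "remark"]
```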

Comprehensive monitoring covers replication delay, consistency checks, traffic and TPS, BU/application/IDC dimensions, DDL changes, table‑structure consistency, conflict occurrences, and GTID set gaps, providing early alerts for any anomalies.
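Replication delay, the first metric above, is commonly measured with a heartbeat row: the source periodically writes a timestamp, and the target side computes delay once the row replicates across. The article does not specify DRC's exact method, so this is only a sketch of the general technique:

```python
# Heartbeat-based delay measurement (a common technique; assumed here,
# not confirmed as DRC's implementation). The source writes the current
# time into a heartbeat row; the target-side monitor reads the row after
# replication and reports now - heartbeat_ts as the delay.
import time


def replication_delay_ms(heartbeat_ts_ms: int, now_ms=None) -> int:
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    # Clamp to zero in case of minor clock skew between the two sides.
    return max(0, now_ms - heartbeat_ts_ms)


# A sub-second delay target means this value should stay below 1000.
assert replication_delay_ms(1_700_000_000_000, now_ms=1_700_000_000_450) == 450
```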

In summary, DRC achieves sub‑second replication delay and strong consistency for Ctrip's multi‑active MySQL deployments, with ongoing work focused on higher availability, overseas support, and infrastructure enhancements to support the company's global strategy.

Tags: High Availability · MySQL · Database Middleware · Data Replication · GTID · DRC