Achieving RPO=0: Lightweight Binlog Server Boosts MySQL Replication to Zero Data Loss
To prevent costly data loss during database failures, the XiaoHongShu team designed a resource-efficient Binlog Server that roughly doubles semi-sync replication throughput to over 300 MB/s, recovers missing data automatically without manual intervention, and achieves RPO = 0 across fully deployed MySQL clusters.
Problem Background
MySQL semi-synchronous replication often becomes a bottleneck. Under heavy write load the replication speed drops and the master may fall back to asynchronous mode; if the master then fails, even a short outage can lose minutes of data and affect thousands of users. Achieving RPO = 0 (no data loss after a failover) requires a mechanism that can keep up with the master's write rate and automatically supply missing binlog events to a newly promoted primary, without changing the existing MySQL topology.
Solution Overview
A lightweight Binlog Server (1 CPU + 1 GB RAM) is placed between the master and its downstream slaves in a cascade configuration. It receives binlog events from the master, stores them locally using the same index-file + data-file layout as MySQL, and forwards the events to downstream replicas. The server speaks the MySQL client/server protocol, so existing replication commands (e.g., START SLAVE, STOP SLAVE) work unchanged. When the master fails, the ORC high-availability component selects a new master and the Binlog Server automatically backfills the missing binlog events to the new primary, guaranteeing data consistency.
Key Performance Figures
Replication throughput > 300 MB/s on a 1C1G instance, roughly double the speed of native semi‑sync replication.
CPU and memory usage remain low, enabling deployment in a different availability zone.
Benchmarks with write‑heavy workloads show the Binlog Server can sustain the master’s peak write rate, ensuring no backlog during failover.
Architecture Details
Data Flow
The Binlog Server sits in a cascade topology: Master → BinlogServer → Slave(s). It receives raw binlog events, writes them to local files, and streams them downstream. Control traffic (admin commands) travels on a separate channel.
MySQL Protocol Support
Authentication and status queries for both master‑to‑BinlogServer and slave‑to‑BinlogServer connections.
Admin session handling (authentication, COM commands, ResultSet delivery).
Full support for COM_QUERY, COM_REGISTER_SLAVE, COM_BINLOG_DUMP, etc., matching the MySQL slave protocol.
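As a rough illustration of what matching the slave protocol involves at the wire level, the sketch below frames and parses MySQL packets and identifies the three commands named above. The 4-byte packet header (3-byte little-endian length plus a sequence id) and the command bytes come from the public MySQL protocol; the function names are hypothetical and are not taken from the Binlog Server's code.

```python
# Command bytes as defined by the MySQL client/server protocol.
COM_QUERY = 0x03
COM_BINLOG_DUMP = 0x12
COM_REGISTER_SLAVE = 0x15

COMMAND_NAMES = {
    COM_QUERY: "COM_QUERY",
    COM_BINLOG_DUMP: "COM_BINLOG_DUMP",
    COM_REGISTER_SLAVE: "COM_REGISTER_SLAVE",
}


def frame_packet(sequence_id: int, payload: bytes) -> bytes:
    """Prefix a payload with the MySQL packet header (hypothetical helper)."""
    if len(payload) >= 0xFFFFFF:
        # Larger payloads must be split across several packets; not handled here.
        raise ValueError("payload too large for a single packet")
    return len(payload).to_bytes(3, "little") + bytes([sequence_id & 0xFF]) + payload


def parse_packet(buf: bytes) -> tuple[int, bytes]:
    """Return (sequence_id, payload) for one complete packet at the start of buf."""
    if len(buf) < 4:
        raise ValueError("incomplete packet header")
    length = int.from_bytes(buf[:3], "little")
    payload = buf[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return buf[3], payload


def command_name(payload: bytes) -> str:
    """Map the first payload byte of a command packet to its name."""
    if not payload:
        raise ValueError("empty command packet")
    return COMMAND_NAMES.get(payload[0], "UNSUPPORTED")
```

A real server would read these packets from a socket after the handshake; the sketch only covers the framing step.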
SQL Parser
A hand-written lightweight parser handles a limited set of administrative statements required by ORC (e.g., START SLAVE, STOP SLAVE, node registration commands). Because these statements arrive only on the control path, the parser needs to process just a few thousand statements per second, a load far smaller than the replication data path carries. This keeps the codebase small and dependency-free.
Bidirectional Node Registration
The server can register as either a master or a slave, enabling true cascade deployments. Registration follows the MySQL handshake and COM_REGISTER_SLAVE sequence, allowing the Binlog Server to receive binlog data upstream and serve it downstream.
Semi‑Sync Header Handling
Each binlog event is prefixed with a two‑byte header: 0xEF (magic number) and an ACK flag (0x01 = requires acknowledgment, 0x00 = no ACK). The Binlog Server inspects the flag, forwards an ACK to the master when needed, and only then allows the master to commit the transaction, preserving semi‑sync semantics.
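The header check described above can be sketched as follows. The two-byte header (0xEF plus the ACK flag) is as stated in the text; the ACK layout (magic byte, 8-byte little-endian binlog position, then the binlog file name) follows the reply format used by MySQL's semi-sync plugin. The function names are hypothetical, and the input is assumed to be the bytes that start at the semi-sync header, with any leading packet status byte already removed.

```python
SEMI_SYNC_MAGIC = 0xEF


def split_semi_sync_header(body: bytes) -> tuple[bool, bytes]:
    """Return (needs_ack, event_bytes) for a body that starts with the 2-byte header."""
    if len(body) < 2 or body[0] != SEMI_SYNC_MAGIC:
        raise ValueError("missing semi-sync magic byte 0xEF")
    flag = body[1]
    if flag not in (0x00, 0x01):
        raise ValueError(f"unexpected ACK flag: {flag:#x}")
    return flag == 0x01, body[2:]


def build_ack(binlog_file: str, position: int) -> bytes:
    """Build the ACK payload sent back to the master (hypothetical helper)."""
    return (
        bytes([SEMI_SYNC_MAGIC])
        + position.to_bytes(8, "little")   # binlog position being acknowledged
        + binlog_file.encode("ascii")      # binlog file name, no terminator
    )
```

In this sketch the caller would persist the event first and send the ACK only when `needs_ack` is true, so the acknowledgment reflects data that is already stored.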
File Management & Crash‑Safe Consistency
Binlog files are stored as .index (metadata) and .data (raw events), mirroring MySQL’s layout. Updates follow a two‑step atomic procedure:
Write a temporary index file.
Write the data file and atomically replace the original index with the temporary one.
This guarantees that after an unexpected crash the index and data remain consistent.
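A minimal sketch of this two-step update, assuming a POSIX filesystem where a rename within one directory is atomic. The index is assumed to hold one data-file name per line; the function names, file names, and the `.tmp` suffix are illustrative and do not come from the original implementation.

```python
import os


def _fsync_dir(directory: str) -> None:
    """Flush directory metadata so a completed rename survives a crash (POSIX only)."""
    fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def add_binlog_file(directory: str, index_name: str, data_name: str, data: bytes) -> None:
    """Register a new data file in the index without exposing a half-written state."""
    index_path = os.path.join(directory, index_name)
    tmp_path = index_path + ".tmp"

    existing = []
    if os.path.exists(index_path):
        with open(index_path, "r", encoding="ascii") as f:
            existing = f.read().splitlines()

    # Step 1: write the new index contents to a temporary file and flush to disk.
    with open(tmp_path, "w", encoding="ascii") as f:
        f.write("\n".join(existing + [data_name]) + "\n")
        f.flush()
        os.fsync(f.fileno())

    # Step 2a: write the data file and make it durable.
    with open(os.path.join(directory, data_name), "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # Step 2b: atomically swap the temporary index into place.
    os.replace(tmp_path, index_path)
    _fsync_dir(directory)
```

If the process dies before the final rename, the old index is still intact and at worst an orphaned temporary or data file remains, which a startup check could clean up.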
High‑Availability Integration (ORC)
When the primary master fails, ORC selects a new master in two stages:
1M stage: Choose the node with the most advanced GTID set (this may be a Binlog Server) and start backfilling data from it.
2M stage: Prefer a slave in the same availability zone as the failed master; Binlog Servers are excluded to keep the final topology identical to the pre-failure state.
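The two stages can be sketched as two small selection functions. This is an illustration only, not ORC's actual logic: the `Node` type is hypothetical, a single integer stands in for a real GTID-set comparison, and breaking ties in stage two by data freshness is an added assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    zone: str
    executed_txns: int              # simplified stand-in for the executed GTID set
    is_binlog_server: bool = False


def pick_stage_one(nodes: list[Node]) -> Node:
    """Stage 1: the node with the most data wins, Binlog Servers included."""
    if not nodes:
        raise ValueError("no surviving nodes")
    return max(nodes, key=lambda n: n.executed_txns)


def pick_stage_two(nodes: list[Node], failed_master_zone: str) -> Node:
    """Stage 2: pick the final master among real MySQL slaves only."""
    candidates = [n for n in nodes if not n.is_binlog_server]
    if not candidates:
        raise ValueError("no MySQL slave available for promotion")
    # Prefer the failed master's zone; break ties by data freshness (assumption).
    return max(candidates, key=lambda n: (n.zone == failed_master_zone, n.executed_txns))
```

With a Binlog Server holding the most data and one slave in the failed master's zone, stage one would return the Binlog Server as the backfill source and stage two would return that same-zone slave as the final master.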
Deployment and Validation
In production, each MySQL cluster runs a dedicated Binlog Server in a different data center (same city, different AZ) that replicates from the master via semi-sync. This layout roughly doubles the effective replication speed and ensures that, even if the master's AZ is lost, the Binlog Server can supply the missing events to the newly elected master, achieving RPO = 0.
Performance tests on a 1C1G instance show sustained write rates of 300 MB/s with negligible CPU usage, confirming that the Binlog Server can keep pace with the master under heavy load.
Future Extensions
Use the Binlog Server as a data source for downstream services such as DTS, Canal, or custom CDC pipelines, offloading the primary.
Store binlog files on object storage (e.g., S3 via S3FS) to reduce local storage costs and retain logs for longer periods.
Leverage the server for shard expansion or table‑splitting scenarios where additional binlog replay is required.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
