
Design and Implementation of a General Data Synchronization Solution for Online and Offline Databases

This article analyzes the pain points of inconsistent online‑offline configurations, proposes three synchronization schemes—including timed batch sync, automatic real‑time binlog‑MQ sync, and manual selective sync—compares their applicability, risks and performance, and presents practical deployment results across multiple projects using MySQL and TiDB.

转转QA

1. Pain Points

1.1 Testing Bottlenecks (QA Perspective)

14:00 While executing a test case, an order was created; the expected main entity was A but the actual main entity was B, causing a mismatch.

14:30 QA reported the issue and discovered that the order main‑entity value depended on the finance‑accounting interface.

15:00 After contacting finance‑accounting QA, the root cause was identified as a missing offline configuration for the business line.

15:30 Finance‑accounting QA added the missing configuration offline, the order succeeded, and the problem was resolved after a 2‑hour effort.

Problems:

- Business testing bottlenecks
- QA personnel frequently interrupted
- Manual configuration is time-consuming

1.2 Configuration Risks (PM Perspective)

The OA approval template consists of approval forms and approval processes. Any addition or modification must be configured in both the online and test environments, which can lead to mismatches when hundreds of forms and processes are involved.

Inconsistent configurations can cause hidden production issues that are hard to detect during testing.

Problems:

- Low daily work efficiency
- Potential quality issues

1.3 Security (RD Perspective)

The business requires data consistency between the online and offline systems, so a daily script syncs the entire online dataset (tens of millions of rows) to the offline environment. This massive sync consumes significant resources and raises database-security concerns.

Problem: Database security issue

1.4 Problem Summary

The common root cause is data inconsistency between the online and offline databases. The goal is to build a universal data‑synchronization capability that automatically syncs online data to offline, improving efficiency and eliminating hidden quality risks.

2. Solution Exploration

2.1 Solution 1: Timed Batch Sync

Approach: Record the last synced primary key in the online DB and periodically sync the data accumulated since.

Example: first sync `10 < id <= 20`, second sync `20 < id <= 30`
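The timed batch approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the original implementation: sqlite3 stands in for MySQL/TiDB, and the table `t1` with columns `id`/`name` is hypothetical. The key idea is a watermark that advances by one primary-key range per scheduled run.

```python
import sqlite3

def sync_batch(online, offline, table, last_id, batch_size):
    """Copy rows whose primary key falls in (last_id, last_id + batch_size]
    from the online DB to the offline DB, then return the new watermark."""
    rows = online.execute(
        f"SELECT id, name FROM {table} WHERE id > ? AND id <= ?",
        (last_id, last_id + batch_size),
    ).fetchall()
    offline.executemany(f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows)
    offline.commit()
    # New watermark: the highest id actually copied (old watermark if none).
    return max((r[0] for r in rows), default=last_id)
```

Each scheduled run calls `sync_batch` with the watermark returned by the previous run, reproducing the `10 < id <= 20`, then `20 < id <= 30` pattern; this is also where the scheme's weaknesses show, since it assumes sequential keys and only ever inserts.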

Issues:

- Low applicability: requires regular (sequential) primary keys; irregular keys cannot be synced accurately.
- Low accuracy: offline data may change independently, causing conflicts.
- Risk: requires operations on the online DB.
- Limited sync types: only insert operations are supported.

Because of these drawbacks, we investigated a more automatic and accurate method.

Resolution: Use the company's binlog MQ to consume change events and achieve automatic, real-time synchronization (Solution 2).

2.2 Solution 2: Automatic Real‑Time Sync

Approach: Consume the online DB's binlog MQ to sync incremental changes in real time.

(1) Parse the MQ message to obtain fields such as `before`, `after`, `table`, and the operation type.

Example MQ message:

```json
{
  "before": null,
  "after": { "id": 0, "name": "bbb", "score": 88 },
  "source": { "db": "cdc", "table": "t1" },
  "op": "c"
}
```

(2) Retrieve the offline DB schema information:

```sql
-- Get the primary key
SELECT cu.Column_Name
FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE cu
WHERE CONSTRAINT_NAME = 'PRIMARY'
  AND cu.Table_Name = {tableName}
  AND CONSTRAINT_SCHEMA = {dbName};

-- Get the auto-increment column
SELECT COLUMN_NAME
FROM information_schema.columns
WHERE TABLE_SCHEMA = {dbName}
  AND TABLE_NAME = {tableName}
  AND extra = 'auto_increment';

-- Get the unique indexes
SHOW INDEX FROM {tableName} WHERE Non_unique = 0 AND Key_name != 'PRIMARY';

-- Get the column definitions
SHOW COLUMNS FROM {tableName};
```

(3) Pre-validate: sync only when the operation is applicable, e.g., insert only when the key does not exist, update or delete only when the record exists.

(4) Assemble the SQL, with fine-grained handling of auto-increment keys, JSON fields, type distinctions, and escape characters.

(5) Execute the generated SQL to complete the synchronization.

2.3 Solution 3: Manual Selective Sync

Approach: An upgraded version of Solution 1 that lets users specify the exact data range for manual execution.

Example condition: `id IN (100,101,102)` or `id < 10`.

Enhancements:

- No reliance on primary-key patterns; users define the data scope themselves.
- Uses `REPLACE INTO` for strong consistency and conflict resolution.

3. Implementation

In practice, we combined Solution 2 (automatic) and Solution 3 (manual) to cover different scenarios. Both support TiDB and MySQL without code changes; enabling them is a one-click configuration.
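Steps (1)–(5) can be sketched roughly as below. This is a hypothetical minimal version assuming the Debezium-style message shape shown in step (1); the schema lookups from step (2) and the fine-grained handling from step (4) (escaping, JSON columns, auto-increment keys) are omitted, and `existing_keys` stands in for the pre-validation lookup against the offline DB.

```python
import json

def build_sync_sql(message, existing_keys):
    """Turn one binlog-MQ event into a statement for the offline DB, or None.

    Pre-validation mirrors step (3): insert only if the key is absent,
    update/delete only if the record already exists offline.
    """
    event = json.loads(message)
    table = event["source"]["table"]
    op = event["op"]  # "c" = create, "u" = update, "d" = delete
    after, before = event.get("after"), event.get("before")

    if op == "c":
        if after["id"] in existing_keys:
            return None  # skip: record already present offline
        cols = ", ".join(after)
        vals = ", ".join(repr(v) for v in after.values())
        return f"INSERT INTO {table} ({cols}) VALUES ({vals})"
    if op == "u":
        if after["id"] not in existing_keys:
            return None  # skip: nothing offline to update
        assigns = ", ".join(f"{k} = {v!r}" for k, v in after.items() if k != "id")
        return f"UPDATE {table} SET {assigns} WHERE id = {after['id']}"
    if op == "d":
        if before["id"] not in existing_keys:
            return None  # skip: nothing offline to delete
        return f"DELETE FROM {table} WHERE id = {before['id']}"
    return None
```

Feeding in the example message above with an empty offline key set would yield an `INSERT` into `t1`; a real implementation would use parameterized statements rather than `repr`-based interpolation.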
| | Automatic Sync (Solution 2) | Manual Sync (Solution 3) |
| --- | --- | --- |
| Applicable scenario | Large-scale incremental sync | Small-scale batch sync |
| Risk | Zero online interaction, no risk | Requires online interaction, usage constraints |
| Efficiency | Real-time | On-demand |
| Typical use cases | Payment account configuration real-time sync | OA approval template sync |

3.1 Automatic Sync

Features:

- Simple configuration to get started.
- Supports dynamic registration of RocketMQ and Kafka consumers.
- Provides filtering by operation type, strong consistency for auto-increment keys, and overall data consistency.

Backend UI (screenshot)

3.2 Manual Sync

Features:

- Specify a single record to sync, e.g., `id=1`.
- Specify a range, e.g., `id<10`.

Backend UI (screenshot)

4. Real-World Application

Daily sync volume reaches 500k–1.5M rows with a failure rate below 0.00000001.

| | Automatic Sync | Manual Sync |
| --- | --- | --- |
| Integrated databases | 9 databases, 88 tables | 14 databases, 74 tables |
| Integrated business lines | Orders, Payments, Finance, Core Services, Merchants, etc. | Orders, Payments, Finance, OA, After-sale, Promotions, etc. |

4.1 Project Deployments

| Project | Sync Data Volume | Manual Config Time | Chosen Scheme | Tool Sync Time | Efficiency Gain |
| --- | --- | --- | --- | --- | --- |
| Project A | 14 tables migrated | 8 h (480 min) | Manual Sync | 10 min | ↑98% |
| Project B | +20 configurations | 2 h (120 min) | Automatic Sync | 0 | ↑100% |
| Project C | +70 configurations | 6 h (480 min) | Automatic Sync | 0 | ↑100% |
| Project D | +20 configurations | 2 h (120 min) | Automatic Sync | 0 | ↑100% |
| Project E | +50 templates | 25 h (1500 min) | Manual Sync | 30 min | ↑98% |

4.2 Daily Configuration Auto-Sync

Saves 1–2 hours per configuration issue; payment QA reported a 20% reduction in weekly interruptions.

4.3 OA Approval Template Sync

One-click sync saved 50% of product workload and 100% of testing effort, eliminating duplicate environment configurations.

4.4 Product Data Automatic Sync

After enabling automatic sync, real-time incremental sync resolved the database-resource-exhaustion problem caused by massive data refreshes.
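The manual selective path could be implemented along these lines. A hypothetical sketch only: the caller fetches `online_rows` from the online DB using the user-specified condition (e.g., `id IN (100,101,102)`), and each row becomes a `REPLACE INTO` statement (MySQL/TiDB syntax) so any conflicting offline row is overwritten, which is how the strong-consistency enhancement of Solution 3 works.

```python
def build_manual_sync_sql(online_rows, table, columns):
    """Emit one REPLACE INTO statement per row fetched with the user's
    condition. REPLACE INTO deletes a conflicting row (same primary or
    unique key) and inserts the new one, resolving conflicts in one step."""
    col_list = ", ".join(columns)
    stmts = []
    for row in online_rows:
        vals = ", ".join(repr(v) for v in row)
        stmts.append(f"REPLACE INTO {table} ({col_list}) VALUES ({vals})")
    return stmts
```

Because `REPLACE INTO` needs no knowledge of what is already offline, this avoids Solution 1's dependence on primary-key patterns; the trade-off, as the comparison table notes, is that it reads from the online DB on demand.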
Author: Lin Wenwen – ZhiZhuan Test Development Engineer Feel free to comment, exchange ideas, and learn together. For more ZhiZhuan business practices, follow the public account below.

Tags: Database, MySQL, TiDB, binlog MQ, Data Synchronization, real-time sync