
Using TiDB Data Migration (DM) for MySQL‑to‑TiDB Sync: Architecture, Features, Tuning and Troubleshooting

This article shares practical experience with TiDB Data Migration (DM), covering its background, architecture, key features, online DDL support, common error handling such as duplicate‑key issues, large‑scale import tuning, configuration limits, and cleanup recommendations for reliable MySQL‑to‑TiDB synchronization.

360 Smart Cloud

In the early days of synchronizing MySQL to TiDB, we relied on mydumper+loader for full backups and syncer for incremental binlog replication, which required many configuration files and complex setup.

PingCAP later released the TiDB Data Migration (DM) suite, a unified platform that simplifies full‑load and incremental data migration from MySQL or MariaDB to TiDB, reduces operational overhead, and provides a graphical dm‑portal for task creation (now discontinued).

Having used DM since its internal testing version up to the latest 1.0.6, I found it essential for DBAs because most TiDB deployments involve migrating existing MySQL schemas, performing performance comparisons, and then loading data.

Architecture

DM consists of three core components: DM‑master (task management), DM‑worker (execution), and dmctl (command‑line control). The following diagram illustrates the architecture:
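In the 1.0 era, DM was typically deployed with DM-Ansible, which reads the cluster topology from an inventory file. A minimal sketch of that layout (hostnames, ports, credentials, and source IDs below are placeholders; the encrypted password is normally generated with dmctl's encrypt option):

```ini
; one DM-master coordinates task scheduling
[dm_master_servers]
dm_master ansible_host=192.168.1.100

; one DM-worker per upstream MySQL instance
[dm_worker_servers]
dm_worker1 ansible_host=192.168.1.248 server_id=101 source_id="mysql-replica-01" mysql_host=192.168.1.10 mysql_user=dm mysql_password='<encrypted>' mysql_port=3306
```

Each DM-worker binds to exactly one upstream instance and pulls its binlog as a local relay log; dmctl connects to DM-master to create and manage tasks.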

Key Features

Table routing and merge migration

Whitelist/blacklist for tables

Binlog event filtering

Shard support for merging tables
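All of these features are driven by the task configuration file. The following is a hedged sketch of a DM 1.0 task file exercising routing, binlog event filtering, and the table whitelist/blacklist; every name, pattern, and address here is illustrative, not taken from the article's actual task:

```yaml
name: task_example           # hypothetical task name
task-mode: all               # full dump + incremental binlog sync

target-database:
  host: "tidb.example.com"
  port: 4000
  user: "root"
  password: ""

mysql-instances:
  - source-id: "mysql-replica-01"
    route-rules: ["route-rule-1"]
    filter-rules: ["filter-rule-1"]
    black-white-list: "bw-rule-1"

routes:
  route-rule-1:              # merge sharded tables into one downstream table
    schema-pattern: "shard_db_*"
    table-pattern: "orders_*"
    target-schema: "orders_db"
    target-table: "orders"

filters:
  filter-rule-1:             # ignore upstream TRUNCATE events
    schema-pattern: "shard_db_*"
    table-pattern: "orders_*"
    events: ["truncate table"]
    action: Ignore

black-white-list:
  bw-rule-1:
    do-dbs: ["shard_db_*"]
```

In DM 2.0 the black-white-list section was renamed block-allow-list, but the overall structure is similar.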

New Feature in 1.0.5 – Online DDL Support

DM now supports online schema changes performed with tools such as pt-online-schema-change and gh-ost. Previously, the DDL these tools run against their temporary shadow tables was skipped, so downstream TiDB missed the new columns and raised errors.

Example of a failed DDL without online‑DDL support:

skip event because not in whitelist
RENAME TABLE `h_2`.`helei5` TO `h_2`.`_helei5_old`

After enabling online DDL (parameter online-ddl-scheme: "pt"), the new column is correctly replicated downstream.
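The switch lives in the task file rather than in dmctl. A sketch of where it goes (a top-level option in the DM 1.0 task layout):

```yaml
# "pt" for pt-online-schema-change, "gh-ost" for gh-ost
online-ddl-scheme: "pt"
```

With this set, DM recognizes the tools' shadow tables, applies their schema change to the real downstream table, and handles the final rename instead of skipping it.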

Sample command to skip a problematic binlog position:

sql-skip --worker=192.168.1.248:8262 --binlog-pos=4369-binlog|000001.000021:62765733 task_4369

Task status query before and after skipping:

{
  "taskName": "task_4369",
  "taskStatus": "Running",
  "workers": ["192.168.1.248:8262"]
}

When a duplicate‑key error (Error 1062) occurs, the log shows:

{
  "msg": "[code=10006:class=database:scope=not-set:level=high] execute statement failed: commit: Error 1062: Duplicate entry ... for key 'clientid'",
  "taskStatus": "Error - Some error occurred in subtask. Please run `query-status task_4369` to get more details."
}

Resolution involved adding the missing column on the downstream TiDB table and replaying the statement with REPLACE INTO semantics to sidestep the duplicate-key conflict.
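A related knob worth knowing here is the syncer's safe-mode option, which rewrites INSERT into REPLACE (and UPDATE into DELETE plus REPLACE) so that re-applied binlog events are idempotent, at the cost of some write throughput. A sketch of where it sits in the task file (names follow the DM 1.0 task layout; the instance ID is a placeholder):

```yaml
syncers:
  global:
    safe-mode: true        # re-executed events no longer raise Error 1062

mysql-instances:
  - source-id: "mysql-replica-01"
    syncer-config-name: "global"
```

DM also enables safe mode automatically for a short window after a task restarts, since events around the saved checkpoint may be replayed.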

Large‑Batch Import Tuning

During massive imports, cluster latency spikes. Adjusting the following parameters helped mitigate the issue (values are examples; tune per cluster):

raftstore:
  apply-pool-size: 3-4
  store-pool-size: 3-4

storage:
  scheduler-worker-pool-size: 4-6

server:
  grpc-concurrency: 4-6

rocksdb:
  max-background-jobs: 8-10
  max-sub-compactions: 1-2

Additionally, configure DM‑worker cleanup:

[purge]
interval = 3600
expires = 7
remain-space = 15

Note that relay-log purging is disabled by default (expires = 0 means never delete); set expires to the number of days of relay logs to retain.

Limitations

Supported upstream versions: MySQL 5.5 and above but below 8.0; MariaDB 10.1.2 and above

Only DDL syntax supported by TiDB parser

Upstream binlog must be enabled with binlog_format=ROW

DM does not support dropping multiple partitions in one statement or dropping indexed columns directly

DM‑portal Caveats

The portal auto‑generates task files but lacks full‑database regex matching, causing temporary tables from online DDL tools to be ignored; this was fixed in version 1.0.5.

Conclusion

From first exposure to TiDB in 2019 to becoming a core member, presenting at DEVCON 2020, and receiving the TUG most influential content award, the author emphasizes continuous sharing of technical knowledge. The article underscores the importance of proper DM configuration, online DDL support, and cleanup to maintain stable, high‑performance TiDB clusters.

Tags: data migration, Performance Tuning, MySQL, TiDB, online DDL, database synchronization, DM
Written by

360 Smart Cloud

Official service account of 360 Smart Cloud, dedicated to building a high-quality, secure, highly available, convenient, and stable one‑stop cloud service platform.
