
Understanding Canal, Maxwell, Databus, and Alibaba DTS for Incremental Data Capture

This article explains how Canal, Maxwell, Databus, and Alibaba's Data Transmission Service (DTS) enable incremental data subscription and consumption by parsing MySQL binlog streams, describing their architectures, processing steps, and comparative advantages for building reliable change‑data‑capture pipelines.

Top Architect

Canal

Canal is a Java‑based service that simulates the MySQL slave protocol to receive binary log (binlog) data from a MySQL master. It parses the raw byte stream of the binlog, extracts change events, and forwards them to an EventSink for storage, while periodically recording the current binlog position.

1. Canal pretends to be a MySQL slave and sends a dump request to the master.

2. The master pushes binary logs to Canal.

3. Canal parses the binary log bytes into structured events.

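The full parsing workflow is: read the last successfully recorded binlog position, connect to the master and issue BINLOG_DUMP from that position, receive and parse the binary log, hand each event to the EventSink (a blocking operation), and record the new position only after the event has been stored successfully. Because the position advances only after storage, a restart resumes from the last stored event instead of skipping data.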
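The workflow above can be sketched in a few lines of Python. This is a toy simulation, not Canal's code: `FakeMaster`, `Pipeline`, and the text format `table:op:payload` are hypothetical stand-ins for the MySQL master, the parser/sink pair, and the real binary event format.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMaster:
    """Hypothetical stand-in for a MySQL master: (position, raw_bytes) entries."""
    log: list

    def dump_from(self, position):
        # Mimics the effect of a dump request: stream every entry after `position`.
        for pos, raw in self.log:
            if pos > position:
                yield pos, raw


@dataclass
class Pipeline:
    master: FakeMaster
    position: int = 0                      # last position known to be stored
    store: list = field(default_factory=list)

    def parse(self, raw):
        # Toy text format "table:op:payload"; real binlog events are binary.
        table, op, payload = raw.decode().split(":", 2)
        return {"table": table, "op": op, "payload": payload}

    def sink(self, event):
        # Blocking hand-off: returns only once the event has been stored.
        self.store.append(event)

    def run_once(self):
        for pos, raw in self.master.dump_from(self.position):
            event = self.parse(raw)
            self.sink(event)
            self.position = pos            # advance only after successful storage


master = FakeMaster(log=[
    (4, b"users:insert:id=1"),
    (9, b"users:update:id=1"),
    (15, b"orders:insert:id=7"),
])
pipeline = Pipeline(master, position=4)    # resume after position 4
pipeline.run_once()
print(pipeline.position, [e["op"] for e in pipeline.store])
# prints: 15 ['update', 'insert']
```

If `sink` raises, `position` keeps its old value, so the next run requests the same events again. That is the reason for recording the position last.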

Additional features include data filtering (wildcards, table names, field content), routing/distribution (1:n), merging (n:1), and data enrichment (e.g., joins) before storage.
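As a rough illustration of name-based filtering plus 1:n and n:1 routing, the sketch below matches a `schema.table` name against regular expressions. The rule table, schema names, and destination names are all invented for the example and do not reflect Canal's actual configuration syntax; filtering on field content and data enrichment are left out.

```python
import re

# Hypothetical rules: a regex over "schema.table" mapped to destination names.
ROUTES = [
    # n:1 merge: every sharded orders table feeds one destination.
    (re.compile(r"shop\.orders_\d+"), ["orders_topic"]),
    # 1:n distribution: one table fans out to two destinations.
    (re.compile(r"shop\.users"), ["cache_updater", "search_indexer"]),
]


def route(event):
    """Return the destinations for an event, or [] if it is filtered out."""
    name = f"{event['schema']}.{event['table']}"
    for pattern, destinations in ROUTES:
        if pattern.fullmatch(name):
            return destinations
    return []


events = [
    {"schema": "shop", "table": "orders_01", "op": "insert"},
    {"schema": "shop", "table": "orders_02", "op": "insert"},
    {"schema": "shop", "table": "users", "op": "update"},
    {"schema": "shop", "table": "audit_log", "op": "insert"},  # matches no rule
]

outboxes = {}
for e in events:
    for destination in route(e):
        outboxes.setdefault(destination, []).append(e)

print({name: len(batch) for name, batch in outboxes.items()})
# prints: {'orders_topic': 2, 'cache_updater': 1, 'search_indexer': 1}
```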

Maxwell

Maxwell is also written in Java. Unlike Canal, which pairs a server with a client that you write against its API, Maxwell combines both roles in a single process and outputs each data change directly as a JSON string, so there is no need to write a custom client to consume parsed events.
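To show why JSON output simplifies consumption, here is a minimal sketch that applies a stream of JSON change lines to an in-memory cache. The field names (`database`, `table`, `type`, `data`, `old`) are modeled on Maxwell's documented row format as best I understand it; the sample rows, the `id` key column, and the `apply_change` helper are made up for illustration.

```python
import json


def apply_change(cache, line, key="id"):
    """Apply one JSON change line to a {(database, table): {key: row}} cache."""
    change = json.loads(line)
    table = cache.setdefault((change["database"], change["table"]), {})
    row = change["data"]
    if change["type"] in ("insert", "update"):
        table[row[key]] = row              # upsert the latest row image
    elif change["type"] == "delete":
        table.pop(row[key], None)          # tolerate deletes for unknown rows
    return cache


lines = [
    '{"database":"test","table":"users","type":"insert","data":{"id":1,"name":"Ann"}}',
    '{"database":"test","table":"users","type":"insert","data":{"id":2,"name":"Bob"}}',
    '{"database":"test","table":"users","type":"update",'
    '"data":{"id":1,"name":"Anna"},"old":{"name":"Ann"}}',
    '{"database":"test","table":"users","type":"delete","data":{"id":2,"name":"Bob"}}',
]

cache = {}
for line in lines:
    apply_change(cache, line)

print(cache)
# prints: {('test', 'users'): {1: {'id': 1, 'name': 'Anna'}}}
```

The consumer needs nothing beyond a JSON parser, which is the practical difference from coding against a client API.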

Databus

Databus is a low‑latency change‑capture system developed at LinkedIn. It isolates sources from consumers, so a slow consumer does not put load on the source database. It guarantees ordered, at‑least‑once delivery with high availability, supports consumption from any point in the change stream (including a full back‑fill of the data), provides partitioned consumption, and preserves the consistency of the source.
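At‑least‑once delivery means a consumer may see the same change twice, for example after a reconnect. A common way to cope is to make the consumer idempotent by tracking the last applied sequence number. The sketch below illustrates that idea only; `IdempotentConsumer` is hypothetical and is not Databus API, and it assumes each event carries its own strictly increasing sequence number (called `scn` here).

```python
class IdempotentConsumer:
    """Applies ordered events once, even if the stream replays earlier ones."""

    def __init__(self, checkpoint=0):
        self.checkpoint = checkpoint       # highest sequence number applied so far
        self.state = {}

    def on_event(self, scn, key, value):
        if scn <= self.checkpoint:
            return False                   # already applied: a redelivered event
        self.state[key] = value
        self.checkpoint = scn
        return True


stream = [(1, "a", 10), (2, "b", 20), (3, "a", 11)]

consumer = IdempotentConsumer()
for event in stream[:2]:                   # consume two events, then "disconnect"
    consumer.on_event(*event)

# After reconnecting, the stream is replayed from the beginning.
applied = [consumer.on_event(*event) for event in stream]

print(applied, consumer.state, consumer.checkpoint)
# prints: [False, False, True] {'a': 11, 'b': 20} 3
```

Starting a new consumer with `checkpoint=0` and replaying the whole stream corresponds to consuming from an arbitrary earlier point in the change stream.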

Alibaba Cloud Data Transmission Service (DTS)

DTS is a cloud service that supports data exchange among RDBMS, NoSQL, and OLAP sources. It offers data migration, real‑time subscription, and synchronization, enabling scenarios such as zero‑downtime migration, disaster recovery, multi‑active regions, cross‑border sync, real‑time data warehousing, cache updates, and asynchronous notifications.

In practice, DTS acts like a message queue that pushes wrapped SQL objects, allowing developers to build services that parse these objects. It integrates tightly with Alibaba RDS and DRDS, handling binlog retention, primary‑secondary failover, and VPC network changes, and provides performance optimizations for RDS.
