Big Data 8 min read

Evolution of Payment Reconciliation Architecture: From MySQL to StarRocks with Flink and DataX

This article describes how a payment reconciliation system progressed from a simple MySQL‑based solution through a Hive‑based big‑data approach to a high‑performance StarRocks architecture, detailing the integration of Flink, DataX, and SQL adaptations that dramatically improved query speed, cost, and operational efficiency.

HomeTech
HomeTech
HomeTech
Evolution of Payment Reconciliation Architecture: From MySQL to StarRocks with Flink and DataX

Reconciliation is essential for enterprises to verify financial transactions, manage inventory, and assess business performance, requiring reliable cross‑system data matching among payment, order, finance, settlement, and rights systems.

1. Payment Closed‑Loop – Although end users see a simple order‑pay flow, the backend involves notifications, settlement, accounting, and report generation, forming a complex payment closed‑loop.

2. Reconciliation Architecture Evolution

2.1 Reconciliation 1.0 – All in MySQL – Data from multiple servers is synchronized to a single MySQL instance for cross‑database queries, but large tables (billions of rows) cause queries to take hours, making the approach inefficient.

2.2 Reconciliation 2.0 – Big Data (Hive) – Using ETL, data from various business systems is loaded into a data warehouse for cross‑table analysis, improving scalability but still suffering from latency and high resource consumption.

2.3 Reconciliation 2.0 Limitations – T+1 data sync and inability to handle adjustment or supplemental accounting scenarios limit its applicability.

3. Reconciliation 3.0 – StarRocks Acceleration

Facing growing data volume and finer‑grained billing requirements, the team evaluated OLAP solutions and selected StarRocks for its billion‑row join capability, sub‑second query speed, real‑time ingestion, and MySQL‑compatible protocol.

3.1 StarRocks Architecture – StarRocks replaces Hive, retaining most of the existing SQL‑based tasks with minor adjustments; scheduled jobs and data validation tasks ensure consistency between MySQL and StarRocks.

3.2 Model Selection – Choosing the appropriate data model (detail, aggregate, update, primary‑key) is crucial to avoid duplicate‑key issues that previously caused reconciliation errors.

3.3 Flink Real‑Time Synchronization – DataX handles bulk data migration, while Flink captures binlog for incremental sync; custom Flink jobs write directly to StarRocks when needed.

Key implementation steps include creating StarRocks tables, defining corresponding Flink tables, and building Flink SQL jobs to stream data.

4. SQL Adaptation – All 18 reconciliation scenarios required SQL adjustments; thanks to StarRocks’ MySQL compatibility, the effort was completed within a day, with only minor syntax changes for sub‑queries.

5. Results – The most complex reconciliation query dropped from 1 hour to 50 seconds, achieving 50‑70× performance gains, halving storage costs, and reducing daily manual effort from three people to one.

6. Summary – Manual data merging is being replaced by automated, scheduled reconciliation and reporting; leveraging StarRocks’ materialized views will further boost efficiency and sustain continuous improvement.

reconciliationbig dataFlinkSQLStarRocksETLpayment
HomeTech
Written by

HomeTech

HomeTech tech sharing

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.