Configurable Data Reconciliation Platform at Youzan: Design, Architecture, and Implementation

Youzan built a configurable data reconciliation platform that makes new scenarios easy to integrate, processes massive data in both real-time and batch modes, and offers visual monitoring, automated correction, and flexible Groovy-based comparison logic. Organized into four DDD layers, it supports the company's 99.99% stability requirement by simplifying the detection and resolution of cross-system inconsistencies.

Youzan Coder

Reconciliation, in the narrow sense, means verifying accounts; in the broader sense, it means comparing data to find and resolve inconsistencies across distributed systems. Youzan, a SaaS company with millions of merchants and tens of millions of business records generated daily, requires 99.99% system stability. This article introduces a configurable data reconciliation platform that quickly discovers, displays, and resolves such inconsistencies.

Background

As Youzan’s business grows, data inconsistency cases increase, especially at the boundaries of transaction, logistics, and marketing systems. Typical scenarios include:

Delivery order cancellation not reflected in the third‑party Dada system.

Promotional coupons not delivered after a purchase.

Order payment succeeded but status remains "Pending Payment".

These illustrate the pain points:

Complex business scenarios make timely detection difficult.

Rapid feature iteration leads to a surge in customer‑service tickets, making issue handling passive.

Hard‑coded reconciliation logic increases development cost when business changes.

Lack of systematic documentation hampers post‑mortem analysis.

Design Goals

Easy integration of new reconciliation scenarios.

Real‑time processing of massive data to detect inconsistencies promptly.

Offline batch comparison for full‑scale data verification.

Flexible adjustment of reconciliation logic for fast‑changing business.

Rich visual charts for monitoring comparison status.

Snapshot of reconciliation chain for quick issue location.

Automated data correction tools to reduce manual effort.

Stability reports for each business line.

Overall Architecture

The platform follows DDD and consists of four layers:

Access Layer: Handles backend operations and unified scheduling.

Application Layer: Integrates multiple domain services for business use.

Domain Layer: Core reconciliation logic, including support, core, and domain models. It covers data loading, assembly, task scheduling, execution, result storage, and alerting.

Infrastructure Layer: Provides persistence, messaging, task switches, blacklist/whitelist, etc.

Core Reconciliation Process

The process is divided into three stages: data preparation, logical comparison, and error handling.

Data Preparation

Supports multiple data ingestion methods:

Upload: Excel files are converted according to custom field rules and stored in a source data pool.

Pull: Dubbo or HTTP interfaces are polled for full or incremental data.

Push: Data warehouse rules write data into the source pool.

Data is stored in a source pool with isolation rules such as business data type (e.g., retail‑spu) and versioned execution cycles.
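
The isolation rule can be pictured as a composite key. The following is a minimal sketch, assuming a hypothetical in-memory `SourcePool` keyed by business data type plus an execution-cycle version; the article does not describe the real storage, so the class, the method names, and the `#` separator are illustrative only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the source data pool.
// Records are isolated by (business data type, execution-cycle version).
class SourcePool {
    private final Map<String, List<Map<String, Object>>> partitions = new HashMap<>();

    // Builds the isolation key, e.g. "retail-spu#v1" (separator is an assumption).
    static String partitionKey(String bizType, String cycleVersion) {
        return bizType + "#" + cycleVersion;
    }

    // Stores one standardized record in its own partition.
    void put(String bizType, String cycleVersion, Map<String, Object> record) {
        partitions
            .computeIfAbsent(partitionKey(bizType, cycleVersion), k -> new ArrayList<>())
            .add(record);
    }

    // Returns only the records of one business type and one execution cycle.
    List<Map<String, Object>> load(String bizType, String cycleVersion) {
        List<Map<String, Object>> records = partitions.get(partitionKey(bizType, cycleVersion));
        return records == null
            ? Collections.<Map<String, Object>>emptyList()
            : Collections.unmodifiableList(records);
    }
}
```

Keeping the version in the key means a rerun of the same task for a new cycle never reads rows left over from an earlier cycle.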

Data is standardized to JSON and deserialized into a generic Map (via Map.class) for downstream processing.

Example of a standardized JSON record:

{
  "dataAttributes":{
    "deductFee":"0.0",
    "orderNo":"YZ202011122334455",
    "deliveryFee":"10.0",
    "tipFee":"0.0"
  },
  "uniqueKey":"YZ202011122334455"
}
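
To show how a raw row might end up in this shape, here is a hedged sketch of a field-rule mapper. `RecordStandardizer`, the `fieldRules` map, and the column names used in the test are assumptions made for illustration; the article only says that custom field rules drive the conversion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical normalizer: turns a raw row (e.g. one Excel line) into the
// standardized shape {"dataAttributes": {...}, "uniqueKey": "..."}.
class RecordStandardizer {
    // fieldRules maps a source column name to the standardized attribute name.
    static Map<String, Object> standardize(Map<String, String> rawRow,
                                           Map<String, String> fieldRules,
                                           String uniqueKeyField) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, String> rule : fieldRules.entrySet()) {
            String value = rawRow.get(rule.getKey());
            if (value != null) {
                // Columns without a rule are dropped; values are trimmed.
                attributes.put(rule.getValue(), value.trim());
            }
        }
        String uniqueKey = attributes.get(uniqueKeyField);
        if (uniqueKey == null || uniqueKey.isEmpty()) {
            throw new IllegalArgumentException("missing unique key field: " + uniqueKeyField);
        }
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("dataAttributes", attributes);
        record.put("uniqueKey", uniqueKey);
        return record;
    }
}
```

All values stay strings, as in the JSON record above, so numeric interpretation is left to the comparison script.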

Logical Comparison

This is the core step, in which business scripts (written in Groovy) compare the left-hand (baseline) data with the right-hand (target) data. Scripts are loaded from the database, executed in a sandbox, and can be run against mock data for testing.

Sample Groovy script for order status validation:

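// Nothing to check when no order was loaded.
if (order == null) {
    return;
}
def orderStatus = order.orderStatus;
def errorMsg;
// The order must have reached "awaiting shipment" or "shipped".
if (orderStatus != "WAIT_SHIPPED" && orderStatus != "SHIPPED") {
    errorMsg = "Order No.: " + order.orderNo + ", order status has not moved to awaiting shipment; current status: " + orderStatus;
}
// errorMsg stays null when the status is valid.
return errorMsg;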

The platform supports both one‑way and two‑way reconciliation, as well as detail‑level and aggregate‑level comparisons.
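
The difference between these modes can be sketched in a few lines. This is an illustration under stated assumptions, not the platform's implementation: `Reconciler` and its method names are hypothetical, and both sides are assumed to be already loaded as maps from `uniqueKey` to string attributes.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical comparer; left = baseline, right = target, both keyed by uniqueKey.
class Reconciler {
    // One-way: every left record must exist on the right with equal attributes.
    static List<String> oneWay(Map<String, Map<String, String>> left,
                               Map<String, Map<String, String>> right) {
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : left.entrySet()) {
            Map<String, String> target = right.get(e.getKey());
            if (target == null) {
                errors.add(e.getKey() + ": missing on right");
            } else if (!Objects.equals(e.getValue(), target)) {
                errors.add(e.getKey() + ": attributes differ");
            }
        }
        return errors;
    }

    // Two-way: the one-way check plus records that exist only on the right.
    static List<String> twoWay(Map<String, Map<String, String>> left,
                               Map<String, Map<String, String>> right) {
        List<String> errors = oneWay(left, right);
        for (String key : right.keySet()) {
            if (!left.containsKey(key)) {
                errors.add(key + ": missing on left");
            }
        }
        return errors;
    }

    // Aggregate-level: compare the sum of one numeric attribute on each side.
    static boolean totalsMatch(Map<String, Map<String, String>> left,
                               Map<String, Map<String, String>> right,
                               String field) {
        return sum(left, field).compareTo(sum(right, field)) == 0;
    }

    private static BigDecimal sum(Map<String, Map<String, String>> side, String field) {
        BigDecimal total = BigDecimal.ZERO;
        for (Map<String, String> attrs : side.values()) {
            String value = attrs.get(field);
            if (value != null) {
                total = total.add(new BigDecimal(value));
            }
        }
        return total;
    }
}
```

`BigDecimal.compareTo` is used for the totals so that amounts such as `15.0` and `15.00` count as equal, which `equals` would not do. Note that an aggregate match can hide offsetting detail-level differences, which is why both levels are useful.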

Error Handling

Inconsistent results trigger visual displays, retries, alerts, correction workflows, and downloadable reports. Alerts integrate with Youzan’s unified alert system (phone, SMS, email, WeChat). Corrections can be performed via RPC generic calls or asynchronous messages.
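
The retry step could look roughly like the sketch below. `RetryingExecutor` is a hypothetical helper, and the doubling backoff is an assumption; the article does not specify the retry policy. When all attempts fail, the last exception is rethrown so that the caller can raise an alert.

```java
import java.util.function.Supplier;

// Hypothetical retry helper for a re-check or correction call.
class RetryingExecutor {
    // Tries the call up to maxAttempts times, doubling the wait after each failure.
    // Rethrows the last failure so the caller can escalate to alerting.
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
                delay *= 2;
            }
        }
        throw last;
    }
}
```

A retry before alerting helps filter out inconsistencies that are only transient, for example when the target system has not yet consumed an asynchronous message.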

Reports are generated based on custom transformation rules and can be exported in various formats.

Key Features

Configurable reconciliation tasks (basic info, trigger timing, external data loading, alert configuration).

Mock testing tool for parameterized verification and visual execution snapshots.

Trend statistics on reconciliation results (time‑dimension and business‑dimension metrics).

Real‑time and offline execution modes.
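
As an illustration of the mock testing idea, the sketch below runs a rule against hand-written input and returns a snapshot of the input and the outcome. `MockRunner` and the snapshot fields are hypothetical, and the rule is a plain-Java rendering of the order-status script shown earlier rather than the Groovy sandbox the platform actually uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical mock-test harness: runs a rule against hand-written input and
// returns a snapshot (input plus result) that a UI could display.
class MockRunner {
    // Java rendering of the order-status rule: null means "consistent".
    static final Function<Map<String, String>, String> ORDER_STATUS_RULE = order -> {
        if (order == null) {
            return null;
        }
        String status = order.get("orderStatus");
        if (!"WAIT_SHIPPED".equals(status) && !"SHIPPED".equals(status)) {
            return "Order " + order.get("orderNo") + " has unexpected status: " + status;
        }
        return null;
    };

    // Executes the rule once and records what went in and what came out.
    static Map<String, Object> run(Function<Map<String, String>, String> rule,
                                   Map<String, String> mockInput) {
        String error = rule.apply(mockInput);
        Map<String, Object> snapshot = new LinkedHashMap<>();
        snapshot.put("input", mockInput);
        snapshot.put("passed", error == null);
        snapshot.put("errorMsg", error);
        return snapshot;
    }
}
```

Running a rule this way before it is enabled lets an author check both the passing and the failing branch without touching production data.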

Technical Challenges

Current challenges include interface rate limiting during high-throughput real-time checks (addressed with Guava's RateLimiter) and retry strategies for RPC failures.
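
The rate-limiting idea can be illustrated without any dependency. The class below is a simple token bucket written only as a stand-in; it is not Guava's implementation, and `SimpleTokenBucket` with its injectable clock is an assumption made so the behavior is easy to test. With Guava itself, the equivalent calls would be `RateLimiter.create(permitsPerSecond)` followed by `acquire()` or `tryAcquire()`.

```java
import java.util.function.LongSupplier;

// Dependency-free token bucket, used here only as a stand-in to illustrate
// the idea; the article states the platform itself uses Guava's RateLimiter.
class SimpleTokenBucket {
    private final double permitsPerSecond;
    private final double capacity;
    private final LongSupplier nanoClock;
    private double available;
    private long lastRefillNanos;

    SimpleTokenBucket(double permitsPerSecond, double capacity, LongSupplier nanoClock) {
        this.permitsPerSecond = permitsPerSecond;
        this.capacity = capacity;
        this.nanoClock = nanoClock;
        this.available = capacity;
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    // Returns true and consumes one permit if the call may proceed now.
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        // Refill in proportion to the time elapsed since the last call.
        double refill = (now - lastRefillNanos) / 1_000_000_000.0 * permitsPerSecond;
        available = Math.min(capacity, available + refill);
        lastRefillNanos = now;
        if (available >= 1.0) {
            available -= 1.0;
            return true;
        }
        return false;
    }
}
```

In real use the clock would be `System::nanoTime`; a check that fails to get a permit can be queued or retried later rather than sent to the backend interface immediately.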

Future challenges involve multi‑stream real‑time joins to reduce backend calls and AI‑driven recommendation of reconciliation rules based on historical data.

Future Outlook

The platform will continue to improve its throughput, configurability, and visualization, forming a closed loop for data stability. Interested engineers can contact [email protected].
