Near Real-Time Metric System Architecture for Dongchedi Used Car Business
This article introduces Dongchedi's near real‑time metric system architecture, covering business background, technical challenges, the unified storage‑compute and query service design using the Las lakehouse built on Apache Hudi, solutions to consistency issues, achieved results, and future plans for further real‑time improvements.
The presentation outlines the business background of Dongchedi's used‑car platform, highlighting rapid growth of offline stores launched in May 2022 and the need for a reliable, near real‑time metric system to support both transaction‑focused (business) and traffic‑focused (media) data warehouses.
Key technical challenges include frequent system iterations, separate real‑time and offline architectures, duplicated codebases, and inconsistent metric definitions across services. These issues lead to poor metric consistency, a problem common in both mature and fast‑growing companies.
To address these challenges, the architecture adopts a unified storage‑compute layer and a unified query service. The storage layer is standardized on a lakehouse solution called Las, a heavily customized Apache Hudi engine that provides scalable metadata services, Hive Metastore compatibility, and support for Insert/Overwrite, Update/Delete, Upsert, Append, and streaming source/sink capabilities.
Las presents data as regular Hive tables, so both Flink SQL (for streaming) and Spark SQL (for batch) can read and write them. It supports two storage types: COPY_ON_WRITE (rarely used) and MERGE_ON_READ. With MERGE_ON_READ, writes are appended to delta logs that are later merged into base files, and periodic compaction keeps query performance from degrading as the logs grow. Features such as multi‑stream column stitching, external‑store joins, custom aggregation UDFs, archival partitioning, incremental consumption, and a forthcoming Time‑Travel capability further enhance its suitability for near real‑time analytics.
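The merge-on-read behavior and column stitching described above can be illustrated with a toy in-memory model. This is a minimal sketch, not the Las or Apache Hudi implementation: the class `MergeOnReadTable`, its methods, and the sample order record are all hypothetical names chosen for illustration. It only shows the general idea that writes are appended to a delta log, reads merge the log over the base, and compaction folds the log into the base.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


@dataclass
class MergeOnReadTable:
    """Toy merge-on-read table: a compacted base plus an append-only delta log.

    Hypothetical illustration only; not the Las or Hudi API.
    """

    base: Dict[str, Row] = field(default_factory=dict)
    delta_log: List[Tuple[str, Row]] = field(default_factory=list)

    def upsert(self, key: str, columns: Row) -> None:
        # Writes are cheap: append the (possibly partial) row to the delta log.
        self.delta_log.append((key, dict(columns)))

    def _merged(self) -> Dict[str, Row]:
        # Replay the log over a copy of the base; later writes win per column,
        # which is what lets two streams stitch columns onto the same key.
        merged = {k: dict(v) for k, v in self.base.items()}
        for key, columns in self.delta_log:
            merged.setdefault(key, {}).update(columns)
        return merged

    def read(self) -> Dict[str, Row]:
        # Readers pay the merge cost until the next compaction.
        return self._merged()

    def compact(self) -> None:
        # Fold the log into the base so later reads skip the replay.
        self.base = self._merged()
        self.delta_log.clear()


table = MergeOnReadTable()
table.upsert("order-1", {"store_id": "s1", "price": 100000})  # stream A
table.upsert("order-1", {"status": "paid"})                   # stream B
before = table.read()
table.compact()
after = table.read()
```

In this sketch, `before` and `after` hold the same stitched row; compaction changes only where the merge cost is paid, not what readers see.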
The unified query service consolidates metric production and consumption. Metric definitions follow Alibaba's OneData decomposition method, binding metrics to business processes and dimensions. The Dataleap platform manages metric metadata, routing, and self‑service dashboard generation, while OneService handles detailed data not covered by the metric platform, and the Fengshen Agile BI platform offers Presto‑based ad‑hoc analysis.
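To make the idea of binding a metric to a business process and a set of dimensions concrete, here is a minimal sketch. The class `MetricDefinition`, the table name `fact_used_car_deal`, and the column names are assumptions made up for this example; they do not reflect how Dataleap actually stores metric metadata. The point is only that a metric defined once can generate every query against it, and that a dimension not bound to the metric is rejected rather than silently accepted.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical metric record: one measure on one business process."""

    name: str
    business_process: str         # assumed: the fact table for the process
    measure: str                  # column being aggregated
    aggregation: str              # e.g. COUNT, SUM
    dimensions: Tuple[str, ...]   # dimensions the metric may be sliced by

    def to_sql(self, group_by: Tuple[str, ...] = ()) -> str:
        # Reject dimensions that were never bound to this metric.
        unknown = [d for d in group_by if d not in self.dimensions]
        if unknown:
            raise ValueError(f"dimensions not bound to {self.name}: {unknown}")
        select_dims = "".join(f"{d}, " for d in group_by)
        sql = (
            f"SELECT {select_dims}{self.aggregation}({self.measure}) "
            f"AS {self.name} FROM {self.business_process}"
        )
        if group_by:
            sql += " GROUP BY " + ", ".join(group_by)
        return sql


deal_count = MetricDefinition(
    name="deal_cnt",
    business_process="fact_used_car_deal",
    measure="order_id",
    aggregation="COUNT",
    dimensions=("store_id", "city"),
)
sql = deal_count.to_sql(("store_id",))
```

Because every consumer goes through the same definition, two dashboards asking for the same metric cannot drift apart in how they compute it.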
The final architecture integrates data sources into Las tables, processes them via FlinkSQL or HSQL, and syncs results to ClickHouse. Additional daily snapshots are stored for historical analysis. Downstream applications include real‑time dashboards, data dashboards, subscription pushes, and self‑service analytics.
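The consistency benefit of this flow comes from having one metric definition that both the streaming and batch paths use. The sketch below is a plain-Python stand-in, not FlinkSQL, HSQL, or ClickHouse code: the function `apply_events`, the event layout, and the sample values are hypothetical. It shows the same aggregation folded over micro-batches and over the full dataset, followed by a daily snapshot of the resulting state.

```python
from typing import Dict, Iterable, Tuple

Event = Tuple[str, str, int]          # (date, store_id, deal_amount), made-up data
State = Dict[Tuple[str, str], int]    # (date, store_id) -> total deal amount


def apply_events(state: State, events: Iterable[Event]) -> State:
    """Single definition of the metric, shared by the streaming and batch paths."""
    for date, store_id, amount in events:
        key = (date, store_id)
        state[key] = state.get(key, 0) + amount
    return state


events = [
    ("2022-05-01", "s1", 100),
    ("2022-05-01", "s2", 50),
    ("2022-05-01", "s1", 30),
]

# Streaming path: fold micro-batches into running state as they arrive.
streaming_state: State = {}
for micro_batch in (events[:1], events[1:]):
    apply_events(streaming_state, micro_batch)

# Batch path: recompute the same metric from the full day's data.
batch_state = apply_events({}, events)

# Daily snapshot: freeze a copy of the state for historical analysis.
snapshots = {"2022-05-01": dict(streaming_state)}
```

Both paths arrive at the same totals because they share `apply_events`; with separate streaming and offline codebases, that agreement would have to be maintained by hand.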
Results achieved so far: over 600 store metrics are now near real‑time (minute‑level), the used‑car store data pipeline fully supports stream‑batch integration, and metric consistency issues have been resolved. Planned improvements include achieving second‑level real‑time metrics, migrating storage to an in‑memory solution to eliminate checkpoint latency, and truly unifying the compute layer across languages.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.