
How Kuaishou Built a Scalable Big Data Platform with Unified Data Quality and Metric Services

This article details Kuaishou's end‑to‑end big data platform, describing its organizational model, unified data governance framework, comprehensive data‑quality solution, the design of a headless metric platform, key technologies such as automatic modeling and code generation, and future directions toward a decentralized, smart data fabric.

Kuaishou Big Data

01 Kuaishou Big Data Platform Overview

Kuaishou, a leading short‑video and live‑streaming platform, treats big data as a core productive asset. The data organization follows a "Platform + Business Partner" model that supports all data services horizontally across the company and eliminates data silos. Construction is unified: a data blueprint is designed top‑down, and teams are aligned to implement it.

The data warehouse follows a "1 horizontal + N vertical" model: a shared public layer plus business‑specific vertical layers that interconnect. Core data products include the data production system (TianGong), BI analysis (KwaiBI), app analysis, and AB testing platforms. Data governance, standards, and tools are embedded into workflows rather than remaining on paper.

The platform ingests petabytes of data per day, stores exabytes in total, and runs hundreds of thousands of jobs daily. At this scale, data quality becomes a significant challenge.

02 Kuaishou Data Quality Solution

Data governance aims to increase data value and forms the foundation of digital strategy, comprising organization, policies, processes, and tools. Poor data quality can lead to wrong decisions, wasted marketing spend, and loss of advertisers.

DAMA and other standards define six data‑quality dimensions: completeness, uniqueness, timeliness, validity, accuracy, and consistency. Kuaishou implements a three‑layer "tool + standard + organization" solution and uses a "data‑quality fault count" as its north‑star metric to drive improvements. The tooling covers three phases:

Pre‑deployment: automated testing (instrumentation tests, SQLScan, code tests, metric platform tools) to catch issues before release.

During operation: online monitoring of instrumentation, ETL jobs, and application consistency.

After the fact: a one‑stop data‑governance platform that detects non‑compliant tasks and reports them to their owners.
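To illustrate how some of the six dimensions can be turned into automated checks, here is a minimal Python sketch. The row layout, the field names (`user_id`, `event_time`, `duration_ms`), and the one‑hour lag threshold are hypothetical and are not taken from Kuaishou's tooling.

```python
from datetime import datetime, timedelta, timezone


def check_quality(rows, now, max_lag=timedelta(hours=1)):
    """Return a dict of dimension -> pass/fail for a batch of event rows.

    All field names are hypothetical and chosen only for illustration.
    """
    required = ("user_id", "event_time", "duration_ms")

    # Completeness: every required field is present and not None.
    complete = all(r.get(f) is not None for r in rows for f in required)

    # Uniqueness: no two rows share the same (user_id, event_time) key.
    keys = [(r.get("user_id"), r.get("event_time")) for r in rows]
    unique = len(keys) == len(set(keys))

    # Validity: durations must be non-negative integers.
    valid = all(
        isinstance(r.get("duration_ms"), int) and r["duration_ms"] >= 0
        for r in rows
    )

    # Timeliness: the newest event must be no older than max_lag.
    times = [r["event_time"] for r in rows if r.get("event_time") is not None]
    timely = bool(times) and (now - max(times)) <= max_lag

    return {
        "completeness": complete,
        "uniqueness": unique,
        "validity": valid,
        "timeliness": timely,
    }
```

Accuracy and consistency are left out on purpose: they usually need a reference source or a second dataset to compare against, which a single‑batch check cannot provide.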

The most challenging quality problems arise at the data‑consumption side, where inconsistent metric definitions and naming cause downstream issues.

03 Kuaishou Metric Platform Construction

The metric platform serves as middleware between the data lake/warehouse and data applications, providing a unified language for metrics and dimensions. It covers three aspects: data expression, a unified business language, and a logical layer that decouples upstream engines from downstream users.

Key components include:

Unified metric management ensuring unique definitions and naming.

Unified metric monitoring that validates correctness and enforces SLAs.

Unified metric service (OneService) that offers a single entry point for all applications.
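The "unique definitions and naming" rule can be pictured as a registry that rejects both a reused name and a repeated definition. The sketch below is only an illustration: `MetricDef`, `MetricRegistry`, and the normalization rule are hypothetical and do not describe the platform's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDef:
    name: str        # unique business name, e.g. "daily_active_users"
    expression: str  # aggregation expression that defines the metric
    owner: str


class MetricRegistry:
    """Hypothetical registry enforcing one name per definition and vice versa."""

    def __init__(self):
        self._by_name = {}
        self._by_expr = {}

    @staticmethod
    def _normalize(expression):
        # Naive normalization: collapse whitespace and case so that trivially
        # different spellings of the same expression collide.
        return " ".join(expression.lower().split())

    def register(self, metric):
        expr = self._normalize(metric.expression)
        if metric.name in self._by_name:
            raise ValueError(f"metric name already defined: {metric.name}")
        if expr in self._by_expr:
            raise ValueError(
                f"definition duplicates existing metric: {self._by_expr[expr]}"
            )
        self._by_name[metric.name] = metric
        self._by_expr[expr] = metric.name

    def get(self, name):
        return self._by_name[name]
```

A real system would compare definitions semantically rather than textually. The point here is only that both directions of uniqueness are checked at registration time instead of being discovered downstream.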

Technical highlights:

Unified Language Abstraction

A simplified SQL‑like language plus an orchestration DSL enable federated queries across engines (ClickHouse, Hive, etc.). Example:

Simplified SQL:

```sql
SELECT
   table_a.api_name,
   table_b.job_description
FROM CLICKHOUSE_CATALOG.public_hdd.one_service_job_df as table_a
JOIN HIVE_CATALOG.ks_data_factory.one_service_meta as table_b
ON table_a.id = table_b.job_id
WHERE table_a.id = ${job_id}
```

Orchestration DSL:

```
input {
  groupId: 111
  userName: "xiaozhang"
}
operators {
  operator op1 { type: REDIS, lang: OneSQL, query: "scan 0 match dp12:* count 20" }
  operator op2 { type: HIVE, lang: OneSQL, query: "select uid, name, age from people" }
  operator op3 { type: FEDERATION, lang: OneSQL, query: "select * from clickhouse.db.table a join mysql.db.table b on a.xx = b.xx join operator.op1 c on a.xx = c.xx" }
}
output { return: op3 }
```
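One way to read the orchestration DSL is as a dependency graph: `op3` refers to `operator.op1` in its query, so `op1` has to run first. The Python sketch below derives a runnable order from such references. It assumes that an `operator.<name>` reference inside a query implies a dependency; the source does not describe how the real executor schedules operators, and `execution_order` is a hypothetical helper.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Matches references such as "operator.op1" inside a query string.
OPERATOR_REF = re.compile(r"\boperator\.(\w+)")


def execution_order(operators):
    """Map of operator name -> query string; returns names in a runnable order."""
    graph = {}
    for name, query in operators.items():
        deps = set(OPERATOR_REF.findall(query))
        unknown = deps - set(operators)
        if unknown:
            raise ValueError(
                f"{name} references undefined operators: {sorted(unknown)}"
            )
        graph[name] = deps
    # static_order() raises graphlib.CycleError if operators depend on each other.
    return list(TopologicalSorter(graph).static_order())
```

Applied to the three operators in the example, any order it returns places `op1` before `op3`. `op2` has no dependencies, so its position relative to the others is unconstrained.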

Automatic Modeling

The platform automatically discovers table relationships from metric and dimension metadata. It handles multi‑path models and cumulative calculations, and selects a model by combining rule‑based optimization (RBO) with cost‑based optimization (CBO), taking into account time range, engine preference, model count, depth, and granularity.
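The combination of rule‑based and cost‑based selection can be sketched as a two‑step process: rules first discard models that cannot answer the query, then a cost estimate ranks the remaining ones. The fields of `CandidateModel`, the `ENGINE_COST` weights, and the cost formula below are all hypothetical; the source names the selection factors but not how they are weighted.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CandidateModel:
    name: str
    engine: str            # e.g. "clickhouse" or "hive"
    start: date            # first day of data available
    end: date              # last day of data available
    dimensions: frozenset  # dimensions the model can group by
    join_depth: int        # number of joins needed to assemble the model
    est_rows: int          # rough number of rows to scan


# Hypothetical engine preference weights: lower means cheaper.
ENGINE_COST = {"clickhouse": 1.0, "hive": 5.0}


def select_model(candidates, query_start, query_end, query_dims):
    # Rule-based step: drop models that cannot answer the query at all,
    # either because of time coverage or missing dimensions.
    feasible = [
        m for m in candidates
        if m.start <= query_start
        and m.end >= query_end
        and query_dims <= m.dimensions
    ]
    if not feasible:
        return None

    # Cost-based step: rank the survivors by a simple estimated cost.
    def cost(m):
        return m.est_rows * ENGINE_COST.get(m.engine, 10.0) * (1 + m.join_depth)

    return min(feasible, key=cost)
```

Running the rules first keeps the cost model simple, because it only ever compares models that are known to produce a correct answer.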

Code Generation

An abstract syntax tree (AST) is translated into engine‑specific queries, with optimizations such as converting virtual columns to lookups, filtered aggregation, using countDistinctIf on ClickHouse, and adjusting join order.
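As a small illustration of engine‑specific generation, the sketch below renders one AST node, a conditional distinct count, in two ways: with `countDistinctIf` for ClickHouse, as mentioned above, and with a portable `CASE` expression for other engines. The `DistinctCount` node and the `render` function are hypothetical and only show the idea; the platform's real AST is not described in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DistinctCount:
    """Hypothetical AST node: count distinct values of a column, optionally filtered."""
    column: str
    condition: Optional[str] = None  # raw SQL predicate, treated as trusted here


def render(node, engine):
    """Render the node as a SQL expression for the given engine name."""
    if node.condition is None:
        return f"COUNT(DISTINCT {node.column})"
    if engine == "clickhouse":
        # ClickHouse can apply the filter inside the aggregate itself.
        return f"countDistinctIf({node.column}, {node.condition})"
    # Portable fallback: CASE without ELSE yields NULL for non-matching rows,
    # and COUNT(DISTINCT ...) ignores NULLs.
    return f"COUNT(DISTINCT CASE WHEN {node.condition} THEN {node.column} END)"
```

For example, `render(DistinctCount("user_id", "is_paid = 1"), "clickhouse")` produces `countDistinctIf(user_id, is_paid = 1)`, while the same node rendered for `"hive"` produces the `CASE` form.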

04 Future of the Metric Platform

The industry is moving toward Data Fabric and modern data stacks, shifting from centralized, manual ETL to decentralized, self‑service analytics and SmartETL. Kuaishou aims to standardize this with a semantic data management whitepaper, open analysis languages (Kwai OAX, Kwai FQL), and intelligent features such as automated modeling, query optimization, and data acceleration.

