
Design and Evolution of Alibaba Advertising Real-Time Data Warehouse

Alimama’s advertising platform migrated from a monolithic Flink‑Kafka pipeline to a layered Paimon lakehouse, adding DWS upsert support and multi‑layer storage. The move delivers minute‑level data freshness, cuts latency by 2.5 hours, reduces resource use by more than 40%, halves development effort, and sustains ≥99.9% availability.

Alimama Tech

1. Business Background

Alimama operates a large‑scale advertising ecosystem covering search, display, and feed ads across Alibaba’s apps. Advertisers create campaigns via DSPs, and the system tracks impressions, clicks, conversions, and budget consumption. Real‑time data is critical for performance monitoring, fraud detection, and budget enforcement.

1.1 Advertising Data Scenarios and Requirements

External real‑time reports: metrics such as impressions, CTR, ROI.

Internal real‑time analysis: audience, product, and region analysis.

Business monitoring: high‑traffic events (e.g., 618, Double 11) need instant feedback.

1.2 Technical Goals for the Data Warehouse

Data timeliness: minute‑level freshness, ideally zero delay.

System throughput: billions of events per day, tens of millions of TPS.

Stability: ≥99.9% availability, fast rollback.

Cost efficiency: minimal resources, reduced operational burden.

2. Architecture Design

The architecture evolved from a monolithic Flink‑Kafka pipeline to a layered lakehouse built on Paimon.

2.1 Why a Real‑Time Data Warehouse?

The initial design prioritized speed but ran into problems: new requirements were hard to accommodate, development work was duplicated across pipelines, resources were wasted, data was schema‑less, and operational costs were high.

2.2 Evolution of the Real‑Time Warehouse

2.2.1 TT‑Based Real‑Time Warehouse

Data flow: ODS (raw logs) → TT (TimeTunnel, Alibaba’s internal message queue) → Flink → DWD (processed) → downstream stores (OceanBase, Hologres, ClickHouse). Problems included data duplicated across the downstream stores, a missing DWS layer, no upsert support, and offline pipelines that re‑implemented the same logic.

2.2.2 Paimon Lakehouse Solution

Paimon replaced the DWD storage, and a new DWS layer was added that supports upsert and changelog consumption. Schemas are now materialized, enabling direct queries from BI tools and algorithms. The same ODS data is ingested via Flink into Paimon tables.
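To make the upsert‑plus‑changelog idea concrete, here is a minimal Python sketch of the semantics a primary‑key table provides: each upsert on a key replaces the previous row and emits changelog entries (insert, update‑before, update‑after) that downstream jobs can consume incrementally. This is an illustration of the concept, not Paimon’s actual implementation, and the key/field names are made up.

```python
# Sketch of primary-key upsert semantics with changelog emission.
# Row kinds follow Flink's convention: +I (insert), -U (update_before), +U (update_after).

def upsert(state, key, row):
    """Apply an upsert to keyed state and return the changelog entries it produces."""
    if key not in state:
        state[key] = row
        return [("+I", row)]
    before = state[key]
    state[key] = row
    return [("-U", before), ("+U", row)]

# A DWS-style aggregate keyed by (ad_id, hour) -- illustrative names.
state = {}
log = []
log += upsert(state, ("ad42", "10:00"), {"clicks": 3})
log += upsert(state, ("ad42", "10:00"), {"clicks": 7})

print(log)
# → [('+I', {'clicks': 3}), ('-U', {'clicks': 3}), ('+U', {'clicks': 7})]
```

Downstream consumers read this changelog instead of re‑scanning the whole table, which is what lets ADS‑level jobs stay incremental.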

2.2.3 Full Lakehouse Architecture

Four layers plus an ops platform: Data layer (TT → Paimon), Compute layer (Flink, MaxCompute), Storage layer (dual‑system for HA), Application layer (reports, BI, algorithms), and Operations platform (monitoring, stress testing).

2.2.4 Common Paimon Optimizations

Asynchronous compaction for higher write throughput.

Dynamic checkpoint intervals based on traffic.

Resource allocation matching bucket count.

File lifecycle management to avoid small‑file explosion.

Consumer ID retention to enable job recovery.
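The dynamic‑checkpoint‑interval idea above can be sketched outside Flink: long intervals under light traffic keep small‑file counts down, while short intervals under heavy traffic keep Paimon commits (one per checkpoint) frequent enough for freshness. The thresholds and interpolation below are illustrative assumptions, not Alimama’s production policy.

```python
def checkpoint_interval_ms(events_per_sec: float,
                           low: int = 60_000,
                           high: int = 10_000) -> int:
    """Pick a checkpoint interval (ms) from observed traffic.

    Light traffic -> long interval (fewer commits, fewer small files).
    Heavy traffic -> short interval (fresher data, bounded recovery).
    Thresholds are made up for illustration.
    """
    if events_per_sec < 10_000:        # off-peak
        return low
    if events_per_sec < 1_000_000:     # normal load: linear interpolation
        span = events_per_sec / 1_000_000
        return int(low - (low - high) * span)
    return high                        # peak traffic (e.g., Double 11)

print(checkpoint_interval_ms(1_000))      # → 60000
print(checkpoint_interval_ms(2_000_000))  # → 10000
```

In a real job this value would feed `execution.checkpointing.interval`; the point is that the interval is a traffic‑dependent knob, not a constant.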

3. Benefits of the Lakehouse

Layered design reduced development time by >30%, cut data latency by 2.5 hours on average, and enabled minute‑level budget tracking. Resource consumption dropped >40%, and development effort fell by 50%, achieving near‑zero downtime with three‑nines availability.

Tags: Alibaba, advertising, Flink, Streaming, Paimon, real-time data warehouse, data lake
Written by Alimama Tech

Official Alimama tech channel, showcasing all of Alimama's technical innovations.
