Dolphin Streaming: Real-Time SQL-Based Data Development Platform for Alibaba Advertising
Dolphin Streaming provides Alibaba’s advertising merchants with a DB‑like, SQL‑driven real‑time data platform built on Flink that abstracts storage and compute, enabling non‑engineers to develop, debug, and deploy streaming feature jobs quickly, boosting query volume, QPS, and revenue.
Background: Alibaba’s advertising business needs more timely data to improve merchant experience. Traditional offline data updates (T+1, T+7) are insufficient, prompting the data engine team to explore real‑time data processing.
Business problem: B‑side algorithm developers lack a user‑friendly real‑time development platform. Existing services require deep engineering knowledge of data sources, storage engines (Igraph, Lindorm, Hologres) and Flink APIs, making it hard for non‑engineers to create real‑time jobs.
Industry solutions: Internal platforms such as AMC Feature Center and Ant Feature Service Platform provide feature management but are engineer‑centric. External solutions like Cloudera Stream Builder and RisingWave offer SQL‑based streaming but still involve complex configuration.
Technical solution: Dolphin Streaming, an integrated compute‑storage engine built on Flink, offers a “DB for Streaming” experience. It abstracts data sources, storage, and Flink semantics, exposing a simplified SQL dialect that hides Flink window constructs such as TUMBLE and HOP, as well as storage details.
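To make the idea of hiding window constructs concrete, here is a minimal sketch of how a simplified request could be expanded into Flink SQL. The `WindowedCount` class, its fields, and the `to_flink_sql` function are hypothetical names invented for this illustration; the source does not describe Dolphin Streaming's actual dialect or translation rules. Only the generated text follows Flink's documented windowing table-valued function syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WindowedCount:
    """Hypothetical user-facing request: count events per key over a time window."""
    table: str
    key: str
    time_col: str
    size_minutes: int
    slide_minutes: Optional[int] = None  # None -> tumbling window, set -> hopping window


def to_flink_sql(q: WindowedCount) -> str:
    """Expand the simplified request into Flink SQL using windowing TVFs."""
    size = f"INTERVAL '{q.size_minutes}' MINUTES"
    if q.slide_minutes is None:
        window = f"TUMBLE(TABLE {q.table}, DESCRIPTOR({q.time_col}), {size})"
    else:
        slide = f"INTERVAL '{q.slide_minutes}' MINUTES"
        # In Flink's HOP function the slide argument comes before the window size.
        window = f"HOP(TABLE {q.table}, DESCRIPTOR({q.time_col}), {slide}, {size})"
    return (
        f"SELECT window_start, window_end, {q.key}, COUNT(*) AS cnt\n"
        f"FROM TABLE({window})\n"
        f"GROUP BY window_start, window_end, {q.key}"
    )


# Illustrative table and column names, not taken from the article.
print(to_flink_sql(WindowedCount("clicks", "user_id", "event_time", 10)))
```

The user states only what to count and over how long; the window function, descriptor, and grouping columns are filled in by the translation layer.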
Key design points:
Minimal SQL syntax for algorithm users.
Transparent underlying compute and storage engines.
End‑to‑end workflow from data ingestion to online feature serving.
Architecture: The platform standardizes real‑time behavior, builds a unified data map, and connects a public data layer, a middle layer for merchants, and an application layer. Dolphin Streaming uses Flink for computation and supports multiple storage back‑ends (GP, Hologres, Igraph).
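One way to picture "multiple storage back‑ends" behind a single interface is a registry that routes logical table names to interchangeable stores. This is a sketch under stated assumptions: `FeatureStore`, `InMemoryStore`, and `StoreRegistry` are hypothetical names, and the in-memory store merely stands in for a real engine. The source does not say how Dolphin Streaming wires its back-ends.

```python
from typing import Dict, Optional, Protocol


class FeatureStore(Protocol):
    """Minimal interface a storage back-end would need to expose (hypothetical)."""

    def put(self, key: str, features: dict) -> None: ...
    def get(self, key: str) -> Optional[dict]: ...


class InMemoryStore:
    """Stand-in for a real back-end; keeps rows in a dict."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def put(self, key: str, features: dict) -> None:
        # Merge so that successive streaming updates accumulate per key.
        self._rows.setdefault(key, {}).update(features)

    def get(self, key: str) -> Optional[dict]:
        return self._rows.get(key)


class StoreRegistry:
    """Routes logical table names to back-ends so callers never name the engine."""

    def __init__(self) -> None:
        self._stores: Dict[str, FeatureStore] = {}

    def register(self, table: str, store: FeatureStore) -> None:
        self._stores[table] = store

    def write(self, table: str, key: str, features: dict) -> None:
        self._stores[table].put(key, features)

    def read(self, table: str, key: str) -> Optional[dict]:
        return self._stores[table].get(key)


registry = StoreRegistry()
registry.register("realtime_feature_table", InMemoryStore())
registry.write("realtime_feature_table", "u1", {"action_list": ["click"]})
print(registry.read("realtime_feature_table", "u1"))
```

Swapping the engine behind a table then means registering a different store object; code that reads and writes features does not change.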
SQL translation example:
SELECT a.id, a.action_list, b.city, b.level
FROM realtime_feature_table a
JOIN offline_feature_table b ON a.id = b.id
Job scheduling and debugging are provided via OpenAPI, allowing creation, start, stop, pause, and status queries, as well as ad‑hoc SELECT‑based debugging without pre‑defining sources.
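The job operations listed above (create, start, stop, pause, status) can be modeled as a small state machine. The sketch below is an in-memory stand-in, not the real OpenAPI: `JobManager`, `JobState`, the method names, and the set of allowed transitions are all assumptions made for illustration.

```python
from enum import Enum


class JobState(Enum):
    CREATED = "created"
    RUNNING = "running"
    PAUSED = "paused"
    STOPPED = "stopped"


# Assumed transition table: (operation, current state) -> next state.
_TRANSITIONS = {
    ("start", JobState.CREATED): JobState.RUNNING,
    ("start", JobState.PAUSED): JobState.RUNNING,
    ("start", JobState.STOPPED): JobState.RUNNING,
    ("pause", JobState.RUNNING): JobState.PAUSED,
    ("stop", JobState.RUNNING): JobState.STOPPED,
    ("stop", JobState.PAUSED): JobState.STOPPED,
}


class JobManager:
    """In-memory stand-in for a job-control service (hypothetical)."""

    def __init__(self) -> None:
        self._jobs = {}

    def create(self, name: str, sql: str) -> None:
        if name in self._jobs:
            raise ValueError(f"job {name!r} already exists")
        self._jobs[name] = {"sql": sql, "state": JobState.CREATED}

    def _apply(self, op: str, name: str) -> JobState:
        job = self._jobs[name]
        key = (op, job["state"])
        if key not in _TRANSITIONS:
            raise ValueError(f"cannot {op} a job in state {job['state'].value}")
        job["state"] = _TRANSITIONS[key]
        return job["state"]

    def start(self, name: str) -> JobState:
        return self._apply("start", name)

    def pause(self, name: str) -> JobState:
        return self._apply("pause", name)

    def stop(self, name: str) -> JobState:
        return self._apply("stop", name)

    def status(self, name: str) -> JobState:
        return self._jobs[name]["state"]


manager = JobManager()
manager.create("feature_join", "SELECT a.id, b.city FROM realtime_feature_table a "
                               "JOIN offline_feature_table b ON a.id = b.id")
manager.start("feature_join")
manager.pause("feature_join")
print(manager.status("feature_join").value)
```

Rejecting invalid transitions up front (for example, pausing a job that was never started) keeps a client from assuming a job is in a state it never reached.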
Application examples: Dolphin Streaming is integrated into the Aurora development platform, where merchants can develop, debug, and deploy real‑time feature jobs, query results instantly, and explore data.
Business impact: The platform supports scenarios such as Direct Train, Gravity Cube, and product recommendation. It has sustained more than 6,000 QPS, doubled query volume, and delivered significant revenue and user growth during major promotions.
Conclusion: Dolphin Streaming delivers a user‑centric, DB‑like streaming solution that reduces engineering overhead, accelerates iteration, and scales real‑time data development for Alibaba’s advertising merchants.
Alimama Tech
Official Alimama tech channel, showcasing all of Alimama's technical innovations.