Streaming Graph Processing in Ant Group Real‑Time Data: Architecture, Applications and Future Outlook
The article presents Ant Group's real‑time data platform, explains its streaming graph processing architecture, and demonstrates how it powers traffic attribution, real‑time OLAP and user‑behavior intent analysis, while outlining future directions for the technology.
Ant Group has built a mature real‑time data system that combines streaming graph processing with large‑scale data analytics to handle vertices and edges in continuously arriving data streams.
The overall capability diagram is divided into three layers: foundational technologies (compute, storage, messaging), real‑time core capabilities (architecture and development paradigm, data assets, solutions), and business solutions (marketing, risk control, and so on). The streaming graph engine sits in the core layer. It is built on Kubernetes and Ray and provides a GraphView API, a unified graph engine, and graph state management.
In the traffic‑conversion attribution scenario, a funnel model traces each user's path from public‑domain exposure, through the private domain, to the final transaction conversion. A conversion‑attribution model defines the node types on that path (path start, cut points, valid and invalid conversion nodes, and path end), which lets the streaming graph engine compute attribution in real time.
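The node types in the attribution model can be illustrated with a small in-memory sketch. It is not Ant Group's implementation: the semantics assigned to each node type (a cut point discards the open path, a valid conversion is credited to the path start, an invalid conversion emits nothing) and all names such as `Attributor` and `Event` are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class NodeKind(Enum):
    START = "start"                  # path start, e.g. a public-domain exposure
    STEP = "step"                    # ordinary intermediate page or action
    CUT = "cut"                      # cut point: assumed to discard the open path
    VALID_CONVERSION = "valid"       # conversion that should be attributed
    INVALID_CONVERSION = "invalid"   # conversion that is recorded but not attributed
    END = "end"                      # path end: closes the path


@dataclass
class Event:
    user: str
    ts: int
    kind: NodeKind
    label: str


class Attributor:
    """Hypothetical per-user path tracker; events are assumed to arrive in time order."""

    def __init__(self) -> None:
        self._open_paths: Dict[str, List[Event]] = {}

    def on_event(self, ev: Event) -> Optional[dict]:
        """Consume one event; return an attribution record on a valid conversion."""
        if ev.kind is NodeKind.START:
            self._open_paths[ev.user] = [ev]
            return None
        path = self._open_paths.get(ev.user)
        if path is None:
            return None  # no open path, so there is nothing to attribute to
        if ev.kind in (NodeKind.CUT, NodeKind.END):
            del self._open_paths[ev.user]
            return None
        path.append(ev)
        if ev.kind is NodeKind.VALID_CONVERSION:
            return {
                "user": ev.user,
                "source": path[0].label,          # credit goes to the path start
                "conversion": ev.label,
                "path": [e.label for e in path],
            }
        return None
```

Feeding the events one at a time mirrors how a streaming engine would update state incrementally. A production system would also need out-of-order handling and state expiry, which this sketch leaves out.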
The real‑time attribution architecture integrates client‑side event collection and server‑side logs, builds a graph in real time, and outputs attribution results to downstream MQ and OLAP systems, achieving sub‑minute latency for most use cases.
For real‑time OLAP, Ant Group explores both pre‑computation and post‑computation modes. Post‑computation based on streaming graphs allows on‑demand feature creation without separate Flink jobs, improving flexibility and fault tolerance. Comparative tests show streaming graph engines excel in multi‑table join scenarios.
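The difference between the two modes can be sketched in a few lines. This is a toy model under stated assumptions, not the engine's API: `PrecomputedCounter`, `GraphState`, the edge types `clicked_by` and `bought`, and the feature `buyers_per_campaign` are all hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class PrecomputedCounter:
    """Pre-computation: the metric is fixed in advance and updated per event."""

    def __init__(self) -> None:
        self.clicks_per_campaign: Dict[str, int] = defaultdict(int)

    def on_click(self, user: str, campaign: str) -> None:
        self.clicks_per_campaign[campaign] += 1


class GraphState:
    """Post-computation: keep raw edges in state and evaluate features at query time."""

    def __init__(self) -> None:
        # (source vertex, edge type) -> list of destination vertices
        self.edges: Dict[Tuple[str, str], List[str]] = defaultdict(list)

    def add_edge(self, src: str, edge_type: str, dst: str) -> None:
        self.edges[(src, edge_type)].append(dst)

    def neighbors(self, src: str, edge_type: str) -> List[str]:
        return self.edges.get((src, edge_type), [])


def buyers_per_campaign(g: GraphState, campaign: str) -> int:
    """Feature defined after the data arrived: distinct users who clicked the
    campaign and bought anything. The two-hop traversal
    campaign -> clicked_by -> user -> bought -> item
    stands in for a join between a click table and an order table."""
    users = set(g.neighbors(campaign, "clicked_by"))
    return sum(1 for u in users if g.neighbors(u, "bought"))
```

In the pre-computed version, a new metric means a new aggregation job. In the post-computed version, it is only a new query function over edges that are already stored, which is the flexibility the passage describes.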
In user‑behavior intent analysis, the streaming graph builds a real‑time intent score for each product by aggregating page visits, clicks, and exposure data. The resulting predictions are white‑box (each score can be traced back to the behaviors that produced it) and low‑latency, and the article reports that they outperform traditional recommendation methods.
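One way to picture such a score is a weighted, time-decayed sum over behavior types. The source does not give the actual formula, so the weights, the exponential half-life decay, and the `IntentScorer` class are all assumptions chosen for illustration.

```python
from typing import Dict, Tuple

# Hypothetical weights: a click is assumed to signal more intent than an exposure.
WEIGHTS: Dict[str, float] = {"exposure": 0.2, "visit": 1.0, "click": 2.0}


class IntentScorer:
    """Per (user, product) intent score; timestamps are assumed to be in seconds
    and to arrive in non-decreasing order."""

    def __init__(self, half_life_s: float = 600.0) -> None:
        self.half_life_s = half_life_s
        # (user, product, kind) -> (decayed value at last update, last update time)
        self._state: Dict[Tuple[str, str, str], Tuple[float, float]] = {}

    def _decayed(self, key: Tuple[str, str, str], now: float) -> float:
        value, last_ts = self._state.get(key, (0.0, now))
        elapsed = max(0.0, now - last_ts)
        return value * 0.5 ** (elapsed / self.half_life_s)

    def observe(self, user: str, product: str, kind: str, ts: float) -> None:
        """Decay the stored contribution to `ts`, then add this behavior's weight."""
        key = (user, product, kind)
        self._state[key] = (self._decayed(key, ts) + WEIGHTS[kind], ts)

    def explain(self, user: str, product: str, now: float) -> Dict[str, float]:
        """White-box view: the contribution of each behavior type to the score."""
        return {k: self._decayed((user, product, k), now) for k in WEIGHTS}

    def score(self, user: str, product: str, now: float) -> float:
        return sum(self.explain(user, product, now).values())
```

Because the score is a plain sum of per-behavior contributions, `explain` can show exactly why a product ranks where it does, which is the sense in which such a score is white-box.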
The future roadmap includes expanding streaming‑graph‑based real‑time OLAP in marketing, extending intent analysis to finance and content, applying real‑time attribution to ad‑chain diagnostics, and contributing to open‑source graph projects.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.