
Streaming Graph Processing in Ant Group: Real-Time Data Architecture and Applications

This article presents Ant Group's comprehensive real-time data framework and streaming graph processing engine, detailing its architecture, unified batch‑stream capabilities, and practical applications such as traffic attribution, real‑time OLAP, and user‑behavior intent analysis, while outlining future directions.

DataFunTalk

Streaming graph processing combines graph computation with stream processing: as real‑time events arrive, the graph of vertices and edges is updated and the relationships between them are analyzed continuously, rather than in periodic batch jobs.

Ant Group has a mature real‑time data system; this talk covers its overall real‑time data architecture, key technologies, and applications of streaming graph processing in traffic attribution, real‑time OLAP, and user‑behavior intent analysis.

Agenda:

Ant Group real‑time data overview

Streaming graph processing in traffic attribution

Exploration of streaming graph processing in real‑time OLAP

Exploration of streaming graph processing in user‑behavior intent analysis

Future outlook

Ant Real‑Time Data Overview

The framework consists of three layers: foundational technologies (compute, storage, messaging), real‑time core capabilities (architecture & development paradigm, data assets, solutions), and business solutions (marketing, risk control, etc.). The streaming graph engine resides in the foundational layer.

Unified Batch‑Stream Processing

This paradigm allows a single codebase to process both real‑time streams and batch data, so the business logic is written and maintained once instead of separately for stream and batch jobs. Engines such as Apache Flink and Apache Spark support this model.
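The idea can be illustrated with a minimal, engine-independent sketch in plain Python. It is not Flink, Spark, or Ant Group code: the function `count_clicks` and both sources are hypothetical names chosen for illustration. The point is only that one piece of logic consumes a bounded list (batch) and a generator (standing in for an unbounded stream) without modification.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

Event = Tuple[str, str]  # (user_id, action); a hypothetical event shape


def count_clicks(events: Iterable[Event]) -> Iterator[Tuple[str, int]]:
    """Shared logic: emit the running click count per user after each click."""
    counts: Counter = Counter()
    for user, action in events:
        if action == "click":
            counts[user] += 1
            yield user, counts[user]


def batch_source() -> list:
    # Bounded input: the whole data set is available up front.
    return [("u1", "click"), ("u2", "view"), ("u1", "click")]


def stream_source() -> Iterator[Event]:
    # Stand-in for an unbounded input: events are handed over one at a time.
    for event in [("u1", "click"), ("u2", "view"), ("u1", "click")]:
        yield event


# The same function runs over both kinds of input.
batch_result = list(count_clicks(batch_source()))
stream_result = list(count_clicks(stream_source()))
```

Because `count_clicks` only iterates over its input, it does not need to know whether that input is finite. Real engines add scheduling, state backends, and checkpointing on top of this basic contract.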

Ant Streaming Graph System (TuGraph‑Analytics) Architecture

The system comprises container resources (Kubernetes, Ray), the streaming graph engine (GraphView API, Unified Graph Engine, Graph State), and data applications (traffic attribution, real‑time OLAP, intent analysis). It aims to provide an integrated solution for efficient real‑time data processing.
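To make the role of graph state more concrete, here is a small sketch of incrementally maintained graph state. It assumes nothing about the real GraphView API or Graph State implementation; the class `GraphState` and its method `apply_edge` are hypothetical. Each arriving edge updates the stored adjacency and a derived metric (in-degree) in place, so no full recomputation is needed.

```python
from collections import defaultdict


class GraphState:
    """Hypothetical in-memory graph state updated one edge at a time."""

    def __init__(self) -> None:
        self.out_edges = defaultdict(set)   # vertex -> set of successor vertices
        self.in_degree = defaultdict(int)   # vertex -> number of distinct predecessors

    def apply_edge(self, src: str, dst: str) -> int:
        """Ingest one edge and return the updated in-degree of dst."""
        if dst not in self.out_edges[src]:
            # Only a previously unseen edge changes the derived metric.
            self.out_edges[src].add(dst)
            self.in_degree[dst] += 1
        return self.in_degree[dst]


state = GraphState()
edge_stream = [
    ("home", "search"),
    ("feed", "search"),
    ("home", "search"),     # duplicate edge, state stays unchanged
    ("search", "product"),
]
updates = [state.apply_edge(src, dst) for src, dst in edge_stream]
```

A production engine would persist this state and partition it across workers; the sketch only shows the incremental update pattern.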

Application: Real‑Time Traffic Attribution

The traffic conversion funnel moves users from public‑domain traffic into private‑domain traffic and then to transaction conversion, which is where commercial value is realized. The attribution model labels the nodes on a user's path as path start, cut points, effective or ineffective conversion nodes, and path end, and produces trimmed conversion chains.

The technical stack includes real‑time data collection (client and server), streaming graph construction, and attribution path calculation, with results output to downstream MQ and OLAP.
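The node roles in the attribution model can be sketched as a simple path-trimming routine. The role names come from the model described above, but the trimming rules here are assumptions made for illustration: walk backward from the path end, skip ineffective nodes, stop at the nearest path start, and discard everything at or before a cut point. The function `trim_path` and the page names are hypothetical.

```python
from typing import List, Tuple

START, CUT, EFFECTIVE, INEFFECTIVE, END = (
    "start", "cut", "effective", "ineffective", "end",
)


def trim_path(events: List[Tuple[str, str]]) -> List[str]:
    """Return the trimmed chain leading to the last END event, or [] if none.

    events is a time-ordered list of (page, role) pairs for one user.
    """
    end_idx = next(
        (i for i in range(len(events) - 1, -1, -1) if events[i][1] == END),
        None,
    )
    if end_idx is None:
        return []

    chain = [events[end_idx][0]]
    for page, role in reversed(events[:end_idx]):
        if role == CUT:
            break            # assumed rule: a cut point severs the path
        if role == INEFFECTIVE:
            continue         # assumed rule: ineffective nodes are dropped
        chain.append(page)
        if role == START:
            break            # nearest path start closes the chain
    chain.reverse()
    return chain


journey = [
    ("old_banner", START),
    ("app_background", CUT),
    ("home_feed", START),
    ("promo_card", EFFECTIVE),
    ("settings", INEFFECTIVE),
    ("fund_detail", EFFECTIVE),
    ("purchase", END),
]
trimmed = trim_path(journey)
```

In this example the earlier `old_banner` visit is excluded because a cut point separates it from the conversion, and `settings` is dropped as an ineffective node.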

Application: Real‑Time OLAP

Three computation modes are discussed: pre‑computation, pre‑widening with post‑aggregation (joining data into a wide table ahead of time and aggregating at query time), and post‑computation (aggregating entirely at query time). Post‑computation enables flexible, fault‑tolerant analytics without upfront processing, which improves feature development efficiency.

In marketing scenarios, post‑computation reduces wasted pre‑processing and supports on‑demand analysis, offering higher flexibility and lower latency for ad‑hoc queries.
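The contrast between pre-computation and post-computation can be shown with a toy example. The event fields, the sample values, and the helper `post_aggregate` are all hypothetical and are not taken from the talk. With pre-computation, the metric and its dimension are fixed when data is ingested; with post-computation, raw events are kept and any grouping or filter can be applied at query time.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical raw marketing events.
events: List[dict] = [
    {"user": "u1", "channel": "feed", "city": "Hangzhou", "amount": 10},
    {"user": "u2", "channel": "search", "city": "Shanghai", "amount": 20},
    {"user": "u3", "channel": "feed", "city": "Shanghai", "amount": 5},
]

# Pre-computation: the dimension (channel) is decided before any query exists.
pre_by_channel: Counter = Counter()
for event in events:
    pre_by_channel[event["channel"]] += event["amount"]


def post_aggregate(
    rows: List[dict],
    group_by: str,
    where: Callable[[dict], bool] = lambda row: True,
) -> Dict[str, int]:
    """Post-computation: aggregate raw rows at query time by any dimension."""
    totals: Counter = Counter()
    for row in rows:
        if where(row):
            totals[row[group_by]] += row["amount"]
    return dict(totals)


# A question nobody planned for: amount per city, restricted to the feed channel.
by_city = post_aggregate(events, "city")
feed_by_city = post_aggregate(events, "city", where=lambda r: r["channel"] == "feed")
```

The pre-computed table can only answer the per-channel question it was built for, while the post-computation helper answers new questions without any change to ingestion. The cost is that the work happens at query time.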

Application: Real‑Time User‑Behavior Intent Analysis

By constructing real‑time graphs of user actions in financial services, the system assigns intent scores to nodes, identifying likely product interests and enabling targeted marketing.

Compared to traditional recommendation algorithms, this approach offers higher timeliness, white‑box transparency, noise reduction, and efficient post‑computation.
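As a rough illustration of white-box intent scoring, the sketch below collapses one user's behavior graph into weighted edges onto product nodes and sums them with a time decay. The talk does not describe the actual scoring model, so everything here is an assumption: the action weights, the ten-minute half-life, the product names, and the function `intent_scores` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed weights: stronger actions signal stronger intent.
ACTION_WEIGHT = {"view": 1.0, "search": 2.0, "add_to_watchlist": 3.0}
HALF_LIFE_S = 600.0  # assumed: an action loses half its weight every 10 minutes


def intent_scores(
    actions: List[Tuple[float, str, str]], now: float
) -> Dict[str, float]:
    """Score each product node from (timestamp, product, action) edges."""
    scores: Dict[str, float] = defaultdict(float)
    for ts, product, action in actions:
        decay = 0.5 ** ((now - ts) / HALF_LIFE_S)
        # Each term is inspectable, which is what keeps the score white-box.
        scores[product] += ACTION_WEIGHT.get(action, 0.0) * decay
    return dict(scores)


actions = [
    (0.0, "fund_A", "view"),
    (600.0, "fund_B", "search"),
    (1200.0, "fund_A", "add_to_watchlist"),
]
scores = intent_scores(actions, now=1200.0)
top_intent = max(scores, key=scores.get)
```

Since every contribution is a plain product of a weight and a decay factor, a score can be traced back to the specific actions that produced it, and recent behavior dominates older behavior.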

Future Outlook

Promote real‑time OLAP in marketing scenarios

Expand real‑time intent analysis to finance and content domains

Explore real‑time attribution for ad‑link diagnostics

Contribute to open‑source streaming graph projects

Thank you for attending.
