
Design and Implementation of the Mahé Real-Time Product Selection System Using Blink Stream Computing

Mahé, Xianyu's real-time product selection platform, uses Alibaba's Blink stream engine to merge incoming item data, evaluate roughly 300 rule-based filters per item change, and emit only the results that changed. It processes about 1.4 billion messages a day, peaking at 50 k TPS, through a four-layer, stateful architecture.

Xianyu Technology

Background: In e‑commerce operations, selecting high‑quality items from a massive catalog in real time is crucial for user growth and GMV.

The Mahé system, developed by Xianyu, is a high‑performance, real‑time product selection platform that filters billions of items using rule‑based matching within seconds.

Stream Computing: Mahé relies on Blink, Alibaba's enhanced Flink engine, which provides low-latency, high-throughput stateful stream processing. Mahé builds on three Blink features: state, windows, and UDX (user-defined functions).

Key Blink features:

State – snapshot of intermediate results, enabling per‑item data merging and rule evaluation.

Window – tumble and hop windows for time‑driven aggregations.

UDX – UDF, UDTF, UDAF for custom logic, especially rule evaluation.
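
To make the difference between the two window types concrete, here is a minimal sketch in plain Python (not Blink SQL) of how an event timestamp could be assigned to tumbling and hopping windows. The function names, the millisecond units, and the epoch-aligned window boundaries are assumptions made for illustration; they are not taken from Mahé or from Blink's API.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open interval [start, end) in milliseconds


def tumble_window(ts_ms: int, size_ms: int) -> Window:
    """Return the single tumbling window that contains ts_ms."""
    start = ts_ms - (ts_ms % size_ms)
    return (start, start + size_ms)


def hop_windows(ts_ms: int, size_ms: int, slide_ms: int) -> List[Window]:
    """Return every hopping window that contains ts_ms, newest first."""
    windows = []
    # Latest window start at or before ts_ms, aligned to the slide interval.
    start = ts_ms - (ts_ms % slide_ms)
    # Walk backwards while the window still covers ts_ms.
    while start > ts_ms - size_ms:
        windows.append((start, start + size_ms))
        start -= slide_ms
    return windows
```

A tumbling window places each event in exactly one bucket, while a hopping window whose size is a multiple of its slide places each event in size/slide overlapping buckets.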

Architecture: The data processing module consists of four layers – data ingestion, data merging, rule execution, and result handling. Each layer is designed for stream‑native processing.

Data Ingestion Layer: Connects multiple business data sources, parses and validates messages, enriches with metadata, and assembles a unified format:

{
  key: [timestamp, value]
}
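
As an illustration of the ingestion step, the sketch below parses a raw message, validates it, and assembles the unified `{key: [timestamp, value]}` record. The input layout (a JSON object with `ts` and `fields`) and the function name `to_unified` are hypothetical; the article does not describe the upstream message formats.

```python
import json
from typing import Any, Dict, List, Optional

Record = Dict[str, List[Any]]  # field name -> [timestamp, value]


def to_unified(raw: str) -> Optional[Record]:
    """Parse one raw message into the unified format, or return None if invalid.

    Assumed (hypothetical) input: {"ts": <int>, "fields": {<name>: <value>, ...}}
    """
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not valid JSON: drop the message
    if not isinstance(msg, dict):
        return None
    ts = msg.get("ts")
    fields = msg.get("fields")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(ts, int) or isinstance(ts, bool):
        return None
    if not isinstance(fields, dict):
        return None
    # Every field carries the message timestamp so later layers can merge by recency.
    return {name: [ts, value] for name, value in fields.items()}
```

Attaching the timestamp to each field, rather than to the message as a whole, is what lets a later layer decide field by field which value is newest.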

Data Merging Layer: Merges incoming records with in‑memory state based on timestamps, ensuring the latest field values are retained.
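
A timestamp-based merge of this kind can be sketched as follows. This is plain Python over dictionaries rather than Blink state, and the tie-breaking choice (an incoming value wins when timestamps are equal) is an assumption, not something the article specifies.

```python
from typing import Any, Dict, List

Record = Dict[str, List[Any]]  # field name -> [timestamp, value]


def merge(state: Record, incoming: Record) -> Record:
    """Return a new record keeping, per field, the value with the newest timestamp."""
    merged = dict(state)
    for key, (ts, value) in incoming.items():
        current = merged.get(key)
        # Assumption: on equal timestamps the incoming value replaces the stored one.
        if current is None or ts >= current[0]:
            merged[key] = [ts, value]
    return merged
```

Because the comparison is per field, a late-arriving message can still contribute fields the state has not seen, without overwriting fields that already hold newer values.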

Rule Execution Layer: Retrieves active rules, parses data according to metadata, and evaluates each rule using Blink’s compute engine. Approximately 300 rules run per item change.
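
The article does not show how rules are represented, so the following is only a sketch under assumed conventions: each rule is a (field, operator, operand) triple, and a rule whose field is missing from the record counts as not matched. `Rule`, `OPS`, and `evaluate` are hypothetical names.

```python
import operator
from typing import Any, Callable, Dict, List, NamedTuple


class Rule(NamedTuple):
    rule_id: str
    field: str
    op: str       # one of the keys in OPS
    operand: Any


# Hypothetical operator table; a real rule language would likely be richer.
OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "ge": operator.ge,
    "le": operator.le,
    "contains": lambda value, operand: operand in value,
}


def evaluate(rules: List[Rule], record: Dict[str, List[Any]]) -> Dict[str, bool]:
    """Evaluate every rule against a merged record of {field: [timestamp, value]}."""
    results = {}
    for rule in rules:
        entry = record.get(rule.field)
        if entry is None:
            results[rule.rule_id] = False  # assumption: missing field means no match
            continue
        _, value = entry
        try:
            results[rule.rule_id] = bool(OPS[rule.op](value, rule.operand))
        except TypeError:
            results[rule.rule_id] = False  # incomparable types count as no match
    return results
```

In this sketch an unknown operator name raises `KeyError`, on the assumption that malformed rules should be caught when they are loaded rather than silently skipped during evaluation.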

Result Handling Layer: Diffs the current rule outcomes against the previous outcomes stored in state and emits only the ones that actually changed, which reduces output volume.
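
The diffing step can be sketched as a comparison of two rule-outcome maps. The conventions here are assumptions for illustration: a rule absent from the previous results is treated as previously unmatched, and a rule that disappears while matched is emitted as a removal.

```python
from typing import Dict


def diff_results(previous: Dict[str, bool], current: Dict[str, bool]) -> Dict[str, bool]:
    """Return only the rule outcomes that differ from the previously stored ones."""
    changes = {}
    for rule_id, hit in current.items():
        # Assumption: a rule with no stored outcome was previously "not matched".
        if previous.get(rule_id, False) != hit:
            changes[rule_id] = hit
    for rule_id, hit in previous.items():
        # A rule that was matched but is no longer evaluated becomes a removal.
        if rule_id not in current and hit:
            changes[rule_id] = False
    return changes
```

When an item update does not flip any rule outcome, the function returns an empty map and nothing needs to be emitted downstream.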

Performance: The system processes ~1.4 billion messages daily, peaks at 50 k TPS, and supports hundreds of campaigns.
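
For a sense of scale, the figures above can be turned into rough rates. The last value is only an upper bound: it assumes every peak-time message triggers a full pass over about 300 rules, which the article does not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_messages = 1_400_000_000  # from the article
peak_tps = 50_000               # from the article
rules_per_change = 300          # approximate, from the article

# Average message rate over a full day (about 16,200 messages per second).
average_tps = daily_messages / SECONDS_PER_DAY

# How far the peak sits above the daily average (roughly 3x).
peak_to_average = peak_tps / average_tps

# Upper bound, assuming every peak-time message causes a full rule pass.
peak_rule_evals_per_second = peak_tps * rules_per_change

print(f"average TPS: {average_tps:,.0f}")
print(f"peak / average: {peak_to_average:.2f}")
print(f"peak rule evaluations per second (upper bound): {peak_rule_evals_per_second:,}")
```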

The design is generic and can be applied to other real‑time filtering scenarios beyond product selection.

Tags: rule engine, big data, Flink, stream processing, Blink, real-time selection, stateful computation
Written by Xianyu Technology, the official account of the Xianyu technology team.
