
Applying Flink State Management to Real‑Time Recommendation Scenarios

This article explains how Flink's flexible state management, including Broadcast, Keyed, and Operator states, can be used to solve real‑time recommendation challenges such as per‑minute UV, click, and exposure counting, while addressing locality mapping and data‑delay issues with Druid as the downstream store.

58 Tech

Flink, a pure stream‑processing engine, offers richer state management than Spark Streaming, making it more suitable for real‑time recommendation workloads that require low‑latency aggregation of UV, click, exposure, and delivery metrics across multiple geographic dimensions.

The main challenges are (1) the lack of hierarchical mapping for a 13‑digit localId field, which prevents deriving province, city, and county information, and (2) severe reporting delays for real‑time exposure data caused by the client‑side reporting mechanism.

After evaluating Spark, Storm, and Flink, the team selected Flink for its true stream processing, credit‑based back‑pressure, and advanced state APIs. Broadcast State is used to distribute the locality mapping table, while Keyed and Operator states (Managed and Raw) store intermediate results, with checkpoints ensuring fault tolerance and exactly‑once semantics.
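As a rough illustration of what keyed state holds for a per-minute UV metric, the sketch below models it in plain Java with no Flink dependency. The class name MinuteUvCounter, the use of a city id as the key, and the key layout are assumptions made for this example, not details from the original job. In a real Flink job the per-key set would live in managed keyed state and be covered by checkpoints rather than in a HashMap.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Plain-Java model of keyed state for UV counting (NOT the Flink API).
// Each (cityId, minute) pair plays the role of a Flink key, and the set of
// user ids plays the role of the state stored under that key.
class MinuteUvCounter {
    // Hypothetical key layout: "<cityId>|<minuteStartMillis>"
    private final Map<String, Set<String>> usersByKey = new HashMap<>();

    /** Records one event and returns the current UV for that city and minute. */
    int add(String cityId, String userId, long eventTimeMillis) {
        // Truncate the event timestamp to the start of its minute.
        long minuteStart = eventTimeMillis - Math.floorMod(eventTimeMillis, 60_000L);
        String key = cityId + "|" + minuteStart;
        Set<String> users = usersByKey.computeIfAbsent(key, k -> new HashSet<>());
        users.add(userId); // a repeated user id does not grow the set
        return users.size();
    }
}
```

Repeated events from the same user within one minute leave the count unchanged, while the same user in the next minute or in another city starts a fresh count.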

The data flow follows a Lambda architecture. Raw events are ingested into Kafka and read by Flink. The Flink job broadcasts the locality table, updates state in processBroadcastElement and processElement, and writes the enriched results to a second Kafka topic. Druid consumes that topic and provides near-real-time aggregation and visualization.

Implementation steps: read the locality table from HDFS; create a MapStateDescriptor and broadcast the table as a BroadcastStream; connect the main Kafka stream to that broadcast stream, which yields a BroadcastConnectedStream; and define a custom BroadcastProcessFunction whose processBroadcastElement writes the mapping into broadcast state and whose processElement reads that state to enrich each event and emit results.
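The division of labor between the two callbacks can be modeled in plain Java without any Flink dependency. This is a sketch only: Locality, LocalityEnricher, the field names, and the sample localId are hypothetical, and a HashMap stands in for the broadcast state that a MapStateDescriptor would describe. In Flink the real callbacks also receive a context and a collector, which are left out here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical row of the locality table: one localId resolves to three levels.
class Locality {
    final String province;
    final String city;
    final String county;

    Locality(String province, String city, String county) {
        this.province = province;
        this.city = city;
        this.county = county;
    }
}

// Plain-Java stand-in for a BroadcastProcessFunction (NOT the Flink API).
// The method names mirror the Flink callbacks to show which side does what.
class LocalityEnricher {
    // Stands in for the broadcast MapState keyed by localId.
    private final Map<String, Locality> broadcastState = new HashMap<>();

    /** Broadcast side: every mapping row is written into the shared state. */
    void processBroadcastElement(String localId, Locality locality) {
        broadcastState.put(localId, locality);
    }

    /** Event side: looks up the mapping (read-only) and emits an enriched record. */
    Optional<String> processElement(String localId, String eventType) {
        Locality loc = broadcastState.get(localId);
        if (loc == null) {
            // Mapping not yet received; a real job must decide whether to buffer or drop.
            return Optional.empty();
        }
        return Optional.of(String.join(",", loc.province, loc.city, loc.county, eventType));
    }
}
```

One design point the model makes visible is that events can arrive before their mapping row has been broadcast, so the event side needs an explicit policy for unmapped ids.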

Finally, the processed data are loaded into Druid through the output Kafka topic. Druid stores data in time-based segments and aggregates by event time, so late-arriving exposure records are folded into the time bucket they belong to as they arrive, which mitigates the exposure-delay problem.
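The effect of aggregating by event time rather than arrival time can be shown with a small plain-Java model. This is not Druid's API or its storage format; the class EventTimeRollup and the one-minute bucket size are assumptions chosen to match the per-minute metrics discussed earlier.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of event-time rollup (NOT Druid's API).
// Counts are bucketed by the minute in which the event happened,
// regardless of when the record is actually received.
class EventTimeRollup {
    private final Map<Long, Long> exposuresByMinute = new HashMap<>();

    /** Adds a count to the bucket that contains the event timestamp. */
    void add(long eventTimeMillis, long count) {
        long minuteStart = eventTimeMillis - Math.floorMod(eventTimeMillis, 60_000L);
        exposuresByMinute.merge(minuteStart, count, Long::sum);
    }

    /** Returns the running total for the bucket starting at minuteStart. */
    long total(long minuteStart) {
        return exposuresByMinute.getOrDefault(minuteStart, 0L);
    }
}
```

A record whose event time falls in an earlier minute but arrives after newer records still raises the earlier bucket's total, so the reported figure for that minute converges as delayed data trickle in.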

Experience summary: Flink’s state management simplifies data correlation in both real‑time and batch scenarios, but for special cases it may be beneficial to combine Flink with other big‑data components to achieve optimal development efficiency and performance.

Reference: Apache Flink official documentation on state management.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
