Construction and High-Fidelity Load Testing of Real-Time Data Dual-Stream
This article explains how to build a dual‑stream real‑time data pipeline for big‑data applications, defines construction standards, and details a three‑step high‑fidelity load‑testing process that ensures stability and high availability during peak promotional periods.
In enterprise operations, real‑time data supports marketing, operations, and decision‑making; many large companies maintain a real‑time data pipeline. This article discusses how to build a dual‑stream real‑time data pipeline, defines construction standards, and describes a three‑step high‑fidelity load‑testing methodology for big‑data scenarios.
1.1 Data Dual-Stream
In the era of big data, increasing business functions rely on real‑time data for decisions such as promotion adjustments, click‑through rate estimation, and ad revenue sharing. To ensure stability, many Tier‑0 systems adopt dual‑stream architectures, providing both daily and peak‑period data flow.
The core data link is built as a dual-datacenter, active-active stream, which requires all components (producers, warehouses, processing nodes, and consumers) to be deployed in both sites; this roughly doubles physical resources and coordination cost.
1.2 Evaluation Dimensions and Standards for Dual-Stream Construction
2.1 Dual-Stream Load Testing (Bottleneck‑Dam)
Starting with preparation for the 2021 major sale, the core data link shifted from module-level testing to full-link bottleneck-dam testing, widening the test scope to cover both traffic and transaction peaks and simulating high-fidelity promotion scenarios.
2.2 Definition of Load‑Testing Targets
Targets are set from historical peaks and market forecasts, for example 1.2× the 2022 Double-11 peak. Key data-flow topics are assigned estimated consumption peaks for downstream teams to reference.
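The target-setting arithmetic above can be sketched as follows. This is a minimal illustration, not the article's actual tooling; the peak value, safety factor, and topic weights are all assumptions:

```python
def load_test_target(historical_peak_tps: float, safety_factor: float = 1.2) -> float:
    """Load-test target = historical peak scaled by a safety factor
    (e.g. 1.2x the previous Double-11 peak)."""
    return historical_peak_tps * safety_factor


def allocate_topic_targets(total_target: float, topic_weights: dict) -> dict:
    """Split the total target into per-topic estimated consumption peaks,
    in proportion to each topic's weight, for downstream reference."""
    total_weight = sum(topic_weights.values())
    return {t: total_target * w / total_weight for t, w in topic_weights.items()}


# Hypothetical numbers: a 1M TPS historical peak, split 6:4 across two topics.
total = load_test_target(1_000_000)
targets = allocate_topic_targets(total, {"orders": 6, "clicks": 4})
```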
2.3 Load‑Testing Scheme
(1) Transaction bottleneck: synchronization tasks are stopped to "hold" (dam) orders upstream; the architecture is shown in Figure 1.
Figure 1. Transaction dual‑stream architecture
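The hold-and-release mechanic behind bottleneck-dam testing can be sketched as a synchronization task that buffers incoming orders while paused and replays the backlog in a burst on release. This is a minimal simulation under assumed semantics, not JD's actual implementation:

```python
from collections import deque


class SyncTask:
    """Minimal simulation of a synchronization task that can be
    'dammed' (paused, accumulating records upstream) and then
    'released' (flushing the backlog downstream as a burst)."""

    def __init__(self, downstream):
        self.downstream = downstream      # callable consuming one record
        self.held = False
        self.backlog = deque()

    def on_record(self, record):
        if self.held:
            self.backlog.append(record)   # dam: hold the order upstream
        else:
            self.downstream(record)       # normal flow

    def hold(self):
        self.held = True

    def release(self):
        self.held = False
        while self.backlog:               # burst replay simulates the peak
            self.downstream(self.backlog.popleft())


delivered = []
task = SyncTask(delivered.append)
task.hold()
for order in range(3):
    task.on_record(order)                 # nothing delivered while dammed
task.release()                            # delivered == [0, 1, 2]
```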
(2) Traffic bottleneck: the data collection service's writes to the JDQ write cluster are stopped, creating a "no-loss" traffic test environment (the JDQ4 Lan Cang River_Click-Flow_New cluster).
Figure 2. No‑loss traffic test architecture
2.4 Load‑Testing Specifications
The specification covers start and end times, notification windows, coordination with group-level reporting, and avoiding conflicts with other disaster-recovery drills.
2.5 High‑Fidelity Testing for Distorted Scenarios
Pre-sale orders make up only a small share of the overall peak, which distorts a plain replay test; to achieve high fidelity, a joint online military-exercise test was built, integrating shadow tables and separate data sources.
Figure 3. Pre‑sale link test architecture
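Shadow-table routing for such an exercise can be sketched as below. The table names, the `_shadow` suffix convention, and the traffic flag are all assumptions for illustration, not the article's real schema:

```python
SHADOW_SUFFIX = "_shadow"   # hypothetical naming convention for shadow tables


def route_table(base_table: str, is_test_traffic: bool) -> str:
    """Route exercise (test) traffic to a shadow table so it never
    mixes with production pre-sale data; real traffic is untouched."""
    return base_table + SHADOW_SUFFIX if is_test_traffic else base_table


print(route_table("presale_orders", is_test_traffic=True))   # presale_orders_shadow
print(route_table("presale_orders", is_test_traffic=False))  # presale_orders
```

Separate data sources follow the same idea: the exercise traffic carries a marker, and every storage access resolves through a router like this so test writes land in isolated tables.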
3.1 Impact on Business During Bottleneck Testing
During the bottleneck ("dam") period, the data centers (Huitian/Langfang) stop real-time data delivery; normal flow resumes after "release". Services not participating in the test must switch to the appropriate topics in advance.
3.2 Migration Plan for Non‑Participating Services
(1) Switch clusters: transaction topics are dual‑active; consumers can switch to non‑test datacenter topics.
(2) Switch topic authentication: a similar approach applies to traffic topics, but it requires a JDQ SDK upgrade (jdq4-clients 1.3.0-SNAPSHOT; for Flink 1.10/1.12/1.14, version 1.0.9-SNAPSHOT). If the JDQ4 Lan Cang River_Click-Flow_New cluster is not visible, contact operations support.
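For a non-participating consumer of a dual-active topic, the cluster switch boils down to resolving the topic replica in whichever datacenter is not under test. A minimal configuration-driven sketch, with illustrative datacenter and topic names rather than the real JDQ identifiers:

```python
# Map each logical topic to its physical replica per datacenter
# (names are hypothetical; dual-active topics exist in both sites).
TOPIC_MAP = {
    "orders": {"dc_a": "orders_dc_a", "dc_b": "orders_dc_b"},
}


def resolve_topic(logical: str, test_dc: str,
                  datacenters=("dc_a", "dc_b")) -> str:
    """Pick the replica of a dual-active topic in the datacenter
    that is NOT under bottleneck testing."""
    safe_dc = next(dc for dc in datacenters if dc != test_dc)
    return TOPIC_MAP[logical][safe_dc]


# If dc_a is being dammed, consumers migrate to the dc_b replica.
print(resolve_topic("orders", test_dc="dc_a"))   # orders_dc_b
```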
Conclusion
The article shares practical experience on building real‑time data dual‑stream pipelines and conducting high‑fidelity load testing for major promotions, offering guidance for ensuring stability and high availability of real‑time data links.
JD Tech