
Big Data Dual-Stream Construction and High-Fidelity Pressure Testing Guidelines

This document outlines the standards, evaluation dimensions, and implementation process for dual‑stream construction in big‑data pipelines, describes high‑fidelity pressure‑testing methods and objectives, and provides migration procedures for business units not participating in the tests.

JD Retail Technology

In the era of big data, many business decisions rely on real‑time data, prompting the need for dual‑stream (dual‑active) construction to ensure high availability during normal operations and peak promotional periods. The core data pipeline is built across two data centers, requiring duplicated resources and coordinated effort among producers, warehouses, processing nodes, and consumers.

Part 1 – Dual‑Stream Construction Evaluation defines seven assessment dimensions: system level (Level‑0 critical systems), task level (L0 real‑time tasks), physical resource consumption, data timeliness (≤20 min during peaks, ≤40 min otherwise), data volume peaks, production source deployment (must be dual‑center), and business impact scenarios. Each dimension includes specific standards and remarks to guide reasonable dual‑stream requests.
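The hard criteria above can be distilled into a simple eligibility check. This is an illustrative sketch only: the class, field names, and return convention are assumptions, while the thresholds (L0 system and task levels, ≤20 min latency during peaks, ≤40 min otherwise, dual‑center production sources) come from the evaluation dimensions described.

```python
from dataclasses import dataclass

@dataclass
class DualStreamRequest:
    system_level: str         # e.g. "L0" for Level-0 critical systems
    task_level: str           # e.g. "L0" for L0 real-time tasks
    latency_minutes: int      # observed end-to-end data latency
    is_peak_period: bool      # promotional peak vs. normal operations
    dual_center_source: bool  # producer deployed in both data centers

def qualifies_for_dual_stream(req: DualStreamRequest) -> bool:
    """Return True only when every hard criterion is met."""
    latency_limit = 20 if req.is_peak_period else 40
    return (
        req.system_level == "L0"
        and req.task_level == "L0"
        and req.latency_minutes <= latency_limit
        and req.dual_center_source
    )
```

Resource consumption, data‑volume peaks, and business‑impact scenarios are softer dimensions and would be assessed by reviewers rather than a boolean gate.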

Part 2 – High‑Fidelity Pressure Testing ("Bottleneck Dam" Testing) explains the shift from module‑level tests to full‑link bottleneck testing, covering both transaction and traffic streams. Objectives are set based on historical peaks (e.g., 1.2× the 2022 Double‑11 peak). The testing scheme includes stopping synchronization tasks to “hold” transactions and pausing data collection services to “hold” traffic, while allowing non‑test business to consume real‑time data from a separate JDQ cluster. Detailed testing specifications describe timing, notification procedures, and avoidance of conflicts with other disaster‑recovery drills.
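The "hold and release" mechanics can be sketched as a small state machine: while the dam is closed, incoming transactions and traffic accumulate in a backlog; on release, the backlog flushes through the link as one concentrated peak. All class and method names here are hypothetical, and the in‑memory lists stand in for the stopped synchronization tasks and paused collection services.

```python
class BottleneckDamController:
    """Toy model of the bottleneck-dam pressure test (names are illustrative)."""

    def __init__(self):
        self.held_transactions = []
        self.held_traffic = []
        self.holding = False

    def start_hold(self):
        """Model stopping sync tasks and pausing collection services."""
        self.holding = True

    def ingest_transaction(self, record, sink):
        if self.holding:
            self.held_transactions.append(record)  # backlog builds behind the dam
        else:
            sink.append(record)

    def ingest_traffic(self, record, sink):
        if self.holding:
            self.held_traffic.append(record)
        else:
            sink.append(record)

    def release(self, txn_sink, traffic_sink):
        """Reopen the dam: flush the backlog as a concentrated peak."""
        self.holding = False
        txn_sink.extend(self.held_transactions)
        traffic_sink.extend(self.held_traffic)
        self.held_transactions.clear()
        self.held_traffic.clear()
```

In the real scheme, non‑test businesses keep consuming fresh data from a separate JDQ cluster during the hold, so only the system under test sees the artificial peak.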

The document also addresses high‑fidelity testing for low‑volume pre‑sale orders by integrating an online military‑exercise scenario, creating shadow tables and separate data sources to avoid contaminating production data.
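Shadow‑table routing of this kind usually hinges on a per‑record exercise flag. The sketch below assumes a flag named `is_exercise` and a `shadow_` table‑name prefix; both are illustrative conventions, not the document's actual field or naming scheme.

```python
SHADOW_PREFIX = "shadow_"

def route_table(base_table: str, record: dict) -> str:
    """Pick the physical table for a record based on its exercise flag."""
    if record.get("is_exercise", False):
        return SHADOW_PREFIX + base_table  # isolated copy, never mixed with production
    return base_table
```

The same flag would select the separate data source on the read side, so exercise traffic never contaminates production data in either direction.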

Part 3 – Migration Plans for Non‑Participating Business Units outlines how to switch clusters and topics during the bottleneck period. Transaction topics are dual‑active, allowing consumption from the non‑test data center. Traffic topics require switching to the lossless JDQ cluster JDQ4澜沧江_点击流新建流. The migration leverages a one‑click switch, which requires upgrading to jdq4-clients:1.3.0‑SNAPSHOT and to the matching Flink builds (1.10/1.12/1.14, 1.0.9‑SNAPSHOT). Topic authentication changes are also described, ensuring that both transaction and traffic streams can be consumed from the appropriate non‑test topics.
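The consumer‑side routing during the bottleneck period can be summarized in a small resolver: transaction consumers keep their topic but point at the non‑test data center, while traffic consumers switch clusters entirely. Only the cluster name JDQ4澜沧江_点击流新建流 comes from the document; the data‑center label and function name are placeholders.

```python
NON_TEST_DC = "dc2"  # placeholder label for the non-test data center
TRAFFIC_FAILOVER_CLUSTER = "JDQ4澜沧江_点击流新建流"

def resolve_endpoint(stream_type: str, current_cluster: str) -> str:
    """Return the cluster a non-participating consumer should use."""
    if stream_type == "transaction":
        # Dual-active: same topic, consumed from the non-test center.
        return f"{current_cluster}@{NON_TEST_DC}"
    if stream_type == "traffic":
        # One-click switch to the lossless JDQ cluster.
        return TRAFFIC_FAILOVER_CLUSTER
    raise ValueError(f"unknown stream type: {stream_type}")
```

In practice the switch is driven by the upgraded jdq4-clients library rather than application code, which is what makes it "one‑click" for business units.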

Tags: migration, big data, data pipeline, high availability, pressure testing, dual-stream
Written by

JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
