Traffic Recording and Replay: Practices, Challenges, and Strategies for Backend Testing
This article shares practical insights on traffic recording and replay for backend testing, covering background, recording rules, scenario coverage, handling read/write interfaces, mock versus real replay, coverage metrics, and diff result comparison to improve system stability and test completeness.
Background: In 2023 the testing team achieved zero production incidents through meticulous work, risk control, and strict processes such as test case review, automated and integration testing, regression verification, regular stress testing, and high-fidelity promotion testing, drawing especially on extensive experience with traffic recording and replay.
Traffic Replay: From a system stability perspective, any change must not affect existing online functionality, and DUCC (JD's unified configuration center) switches provide quick mitigation when it does. Traffic replay records real online traffic, replays it in a pre-release environment, and compares sub-calls and responses to pinpoint code issues. Its advantages include low-cost test case creation, zero code intrusion, realistic call chains, multi-scenario coverage, traceable data, diff comparison, and precise problem localization; its risks include downstream traffic spikes and dirty data writes.
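To make the mechanism concrete, here is a minimal sketch of a replay harness: a recorded case bundles the entry request, the recorded response, and the recorded downstream (sub-call) responses, and replay runs the candidate code against them and checks the result. All names here (`RecordedCall`, `price_handler`, the `price-service` sub-call) are hypothetical illustrations, not R2's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class RecordedCall:
    """One recorded online request: entry input, entry output, sub-call responses."""
    interface: str
    request: dict
    response: dict
    sub_calls: dict = field(default_factory=dict)  # downstream name -> recorded response

def replay(case: RecordedCall, handler) -> dict:
    """Replay a recorded request against a candidate handler and report
    whether the new response matches the recorded one."""
    actual = handler(case.request, case.sub_calls)
    return {
        "interface": case.interface,
        "match": actual == case.response,
        "expected": case.response,
        "actual": actual,
    }

# Hypothetical pre-release handler under test; it reads the recorded
# sub-call response instead of calling the real downstream service.
def price_handler(req, sub_calls):
    base = sub_calls.get("price-service", {}).get("base", 0)
    return {"sku": req["sku"], "price": base}

case = RecordedCall(
    interface="queryPrice",
    request={"sku": "1001"},
    response={"sku": "1001", "price": 99},
    sub_calls={"price-service": {"base": 99}},
)
result = replay(case, price_handler)
print(result["match"])  # True when the change preserves recorded behavior
```

Because the case carries real online inputs, a mismatch points directly at the behavioral change introduced by the new code.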
Traffic Recording: Real online traffic, both live and historical, serves as the source. The key is ensuring the recorded traffic sufficiently covers the business scenarios impacted by a code change. Recording rules include field-level filtering (via visual configuration) or custom scripts for complex conditions, plus de-duplication to cut redundant flows while preserving interface coverage.
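One common way to implement that de-duplication is to fingerprint each request by interface name plus the fields that define its scenario, so traffic differing only in irrelevant fields (user ID, timestamp) collapses to a single case. This is a sketch under that assumption; the field names are invented for illustration.

```python
import hashlib
import json

def fingerprint(interface: str, request: dict, key_fields: list) -> str:
    """Hash the interface name plus selected request fields, so requests that
    differ only in non-scenario fields produce the same fingerprint."""
    picked = {f: request.get(f) for f in key_fields}
    raw = interface + json.dumps(picked, sort_keys=True)
    return hashlib.md5(raw.encode()).hexdigest()

def dedupe(records, key_fields):
    """Keep the first recorded request per fingerprint, discard the rest."""
    seen, kept = set(), []
    for interface, request in records:
        fp = fingerprint(interface, request, key_fields)
        if fp not in seen:
            seen.add(fp)
            kept.append((interface, request))
    return kept

records = [
    ("queryOrder", {"orderType": "normal", "userId": "u1"}),
    ("queryOrder", {"orderType": "normal", "userId": "u2"}),  # same scenario, different user
    ("queryOrder", {"orderType": "presale", "userId": "u3"}),
]
print(len(dedupe(records, ["orderType"])))  # 2: one normal, one presale
```

The choice of key fields is exactly the trade-off the article describes: too few fields and distinct scenarios merge, too many and redundant flows survive.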
Scenario Coverage: R2 (JD's traffic recording and replay platform) provides a coverage metric that reveals missing or unknown scenarios, which is essential for special cases such as promotional events or holiday spikes that regular traffic does not capture. Strategies include filtering for promotional parameters, extracting traffic from logs, or assembling traffic by hand, focusing on input parameters rather than output results.
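Filtering for promotional parameters can be as simple as a field-level predicate over recorded requests. The field names (`promoId`, `activityType`) below are hypothetical stand-ins for whatever markers a real promotion carries.

```python
def is_promo_traffic(request: dict) -> bool:
    """Field-level recording rule: keep requests carrying promotion markers."""
    return bool(request.get("promoId")) or request.get("activityType") == "flashSale"

recorded = [
    {"sku": "1", "promoId": "618-BIGSALE"},   # big-promo order
    {"sku": "2"},                             # ordinary traffic, filtered out
    {"sku": "3", "activityType": "flashSale"},
]
promo_cases = [r for r in recorded if is_promo_traffic(r)]
print(len(promo_cases))  # 2
```

Note the rule inspects only input parameters, consistent with the article's advice to select scenarios by inputs rather than outputs.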
Replay: Replay can run as an offline diff or a real-time diff. Real-time diff suits time-sensitive business (e.g., billing), though replay failures can still occur. Interfaces may be read, write, or read-write. For read-only interfaces, non-mock replay validates behavior directly, but beware of interfaces that later gain write logic. Write interfaces demand caution because of side effects: mock responses or DUCC switches can prevent data pollution, and shadow environments (e.g., MySQL/Redis shadow tables) allow safe writes.
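The shadow-table idea can be sketched as a small write router guarded by a configuration switch (a stand-in for a DUCC-style toggle): when replay is active, every write is redirected to a `_shadow` table so production data is never polluted. The class and its in-memory store are illustrative, not a real storage client.

```python
class ShadowRouter:
    """Route writes to shadow tables during replay, guarded by a config switch."""

    def __init__(self, shadow_enabled: bool, shadow_suffix: str = "_shadow"):
        self.shadow_enabled = shadow_enabled  # e.g. read from a config center
        self.shadow_suffix = shadow_suffix
        self.store = {}  # table name -> list of written rows (in-memory stand-in)

    def write(self, table: str, row: dict) -> str:
        """Write a row, redirecting to the shadow table when replay is on;
        returns the table actually written to."""
        target = table + self.shadow_suffix if self.shadow_enabled else table
        self.store.setdefault(target, []).append(row)
        return target

replay_router = ShadowRouter(shadow_enabled=True)
print(replay_router.write("orders", {"id": 1}))  # orders_shadow
```

Flipping the switch off restores normal writes, which is what makes this pattern a quick mitigation path as well as a safety net for replaying write interfaces.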
Mock vs. Non-Mock Replay: Whether to mock is not directly tied to an interface's read/write nature; either mode can be used depending on the test goal. Mock replay is a white-box approach (downstream responses are pinned to their recorded values, isolating the code under test), while non-mock replay is black-box (the full call chain runs for real).
Coverage Statistics: R2 supports code coverage for offline replay, generating reports that verify whether traffic recording is comprehensive and identify blind spots.
Diff Result Comparison: Flexible diff strategies (ignoring volatile fields, focusing on key outputs) reduce debugging effort. Inconsistent diffs may stem from configuration changes or genuine bugs; both call for log analysis and prompt fixes. Ultimately, replay enriches the test suite with real traffic, improving test precision.
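The "ignore volatile fields" strategy can be sketched as a recursive comparison that skips fields expected to differ between recording and replay (timestamps, trace IDs) and reports only meaningful divergences. This is an illustrative sketch, not R2's actual diff engine.

```python
def diff(expected, actual, ignore=frozenset(), path=""):
    """Recursively compare two responses, skipping ignored field names and
    returning a list of (path, expected_value, actual_value) mismatches."""
    diffs = []
    if isinstance(expected, dict) and isinstance(actual, dict):
        for key in set(expected) | set(actual):
            if key in ignore:
                continue  # volatile field: legitimately differs per run
            p = f"{path}.{key}" if path else key
            diffs += diff(expected.get(key), actual.get(key), ignore, p)
    elif expected != actual:
        diffs.append((path, expected, actual))
    return diffs

recorded = {"price": 99, "timestamp": 1700000000, "traceId": "a1"}
replayed = {"price": 98, "timestamp": 1700000999, "traceId": "b2"}
print(diff(recorded, replayed, ignore={"timestamp", "traceId"}))
# [('price', 99, 98)]
```

With the volatile fields excluded, the single remaining entry is either a configuration change or a genuine bug, which is exactly where the log analysis starts.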
The author acknowledges being a testing novice and welcomes feedback to improve the content.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.