
Data Weaving for AB Experiment Automation: Architecture, Challenges, and Solutions

This article presents a comprehensive overview of JD Retail's data‑weaving approach to AB experiment automation, detailing the challenges of consistency, scientific rigor, and timeliness, the logical data platform architecture, key technologies, metric modeling, automated DAG orchestration, current progress, and future directions.

DataFunSummit

Introduction – The article introduces data weaving as a method to support AB experiment automation, emphasizing the shift from traditional BI dashboards to data‑driven experiment analysis.

Challenges in AB Experiment Scenarios

Consistency requirements: metric definitions must remain identical across consumption scenarios.

Scientific challenges: ensuring comparable samples and controlling error rates.

Timeliness: large‑scale daily experiments demand automated, low‑latency data pipelines.
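The "comparable samples" challenge above is commonly guarded with a sample ratio mismatch (SRM) check before any metric is read. The sketch below is illustrative, not JD Retail's implementation: a chi-square test on the observed traffic split, with all names and thresholds chosen for the example.

```python
def srm_check(control_n: int, treatment_n: int, expected_ratio: float = 0.5) -> bool:
    """Chi-square test for sample ratio mismatch (SRM): returns True
    if the observed split is consistent with the expected ratio."""
    total = control_n + treatment_n
    exp_c = total * expected_ratio
    exp_t = total * (1 - expected_ratio)
    chi2 = (control_n - exp_c) ** 2 / exp_c + (treatment_n - exp_t) ** 2 / exp_t
    # 3.841 is the 95th percentile of the chi-square distribution with 1 df
    return chi2 < 3.841

print(srm_check(50_000, 50_200))  # small imbalance -> True (consistent)
print(srm_check(50_000, 53_000))  # large imbalance -> False (likely bucketing bug)
```

A failed SRM check usually means the experiment's buckets are not comparable, so downstream metric comparisons should be blocked rather than reported.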

Data‑Weaving Management Philosophy – Describes the four core components of data weaving for AB experiments: flow‑through dimension tables, single‑experiment computation, cumulative statistics, and metric inheritance.

Logical Data Platform Construction

Asset layer: factual and dimension tables.

Virtual layer: semantic models for logical definitions.

Materialization layer: on‑demand data asset materialization with degradation strategies.

Service layer: unified data services for multi‑endpoint consumption.
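The four layers can be sketched as follows. This is a minimal illustration under assumed names (the classes, fields, and thresholds are hypothetical): a virtual-layer semantic model references asset-layer tables by name only, while the materialization layer decides between precomputation and a degraded query-on-demand path.

```python
from dataclasses import dataclass, field

@dataclass
class SemanticModel:            # virtual layer: logical definition only
    name: str
    fact_table: str             # asset layer: physical fact table
    dims: list                  # asset layer: dimension tables to join
    expr: str                   # metric expression over the fact table

@dataclass
class Materializer:             # materialization layer
    budget_slots: int = 10
    materialized: set = field(default_factory=set)

    def plan(self, model: SemanticModel, daily_queries: int) -> str:
        """Materialize hot models; degrade cold ones to direct queries."""
        if daily_queries > 100 and len(self.materialized) < self.budget_slots:
            self.materialized.add(model.name)
            return "materialize"
        return "query_on_demand"   # degradation: compute at read time

gmv = SemanticModel("order_gmv", "dwd_order", ["dim_sku", "dim_user"],
                    "sum(order_amount)")
m = Materializer()
print(m.plan(gmv, daily_queries=500))   # hot metric -> "materialize"
```

The service layer would then answer every endpoint from the same semantic model, regardless of whether the data was materialized or computed on demand.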

Key Technologies and Core Elements

Standard semantic definition.

Automatic logical orchestration (DAG assembly) and task deployment.

Materialization & degradation policies.

SQL optimization and self‑tuning.

Automation Architecture – Shows how the system automatically generates data links, aligns metric definitions, expands dimension tables, and builds logical tables without manual intervention, reducing the delivery cycle from weeks to days.
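The automatic orchestration described above can be reduced to a lineage graph derived from semantic definitions, then topologically sorted into an execution order. A minimal sketch, with task names assumed for illustration:

```python
from graphlib import TopologicalSorter

# lineage derived from semantic definitions: task -> upstream dependencies
lineage = {
    "widen_dim":         set(),                 # expand dimension tables
    "daily_metric":      {"widen_dim"},         # per-day single-experiment calc
    "cumulative_stat":   {"daily_metric"},      # running totals across days
    "experiment_report": {"cumulative_stat"},   # final consumption table
}

# emit a valid execution order without any manual DAG wiring
order = list(TopologicalSorter(lineage).static_order())
print(order)  # ['widen_dim', 'daily_metric', 'cumulative_stat', 'experiment_report']
```

Because the graph is generated from metric and table definitions rather than hand-written, adding an experiment only adds nodes; no pipeline is authored manually.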

Metric Language and System – Explains the decomposition of metrics into derived and composite forms, the use of functions (e.g., year‑over‑year) and virtual modifiers, and how these are represented in the data‑weaving framework.
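The decomposition into atomic metrics plus metric functions can be illustrated like this (a hedged sketch; the field names and sample values are invented for the example):

```python
def base_metric(rows, field="gmv"):
    """Atomic metric: a plain aggregation over fact rows."""
    return sum(r[field] for r in rows)

def yoy(current, prior):
    """Metric function: year-over-year growth, composable with any metric."""
    return (current - prior) / prior

this_year = [{"gmv": 120}, {"gmv": 180}]   # 300
last_year = [{"gmv": 100}, {"gmv": 150}]   # 250
growth = yoy(base_metric(this_year), base_metric(last_year))
print(f"{growth:.0%}")  # 20%
```

Virtual modifiers (time windows, dimension filters) would slot in the same way: as reusable functions applied to the atomic definition, so the base metric is defined exactly once.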

Logical Modeling & Data Acceleration – Details the steps of logical table widening, heightening, and selective materialization, illustrating how experiments trigger dynamic DAG nodes for daily and cumulative calculations.
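Widening and heightening can be sketched in a few lines (illustrative only; real systems do this in SQL over partitioned tables): widening joins dimension attributes onto fact rows, heightening appends a new daily partition.

```python
def widen(fact_rows, dim, key):
    """Logical table widening: join dimension attributes onto fact rows."""
    return [{**r, **dim.get(r[key], {})} for r in fact_rows]

def heighten(table, new_partition):
    """Logical table heightening: append a new daily partition of rows."""
    return table + new_partition

facts = [{"sku": "A", "amt": 10}, {"sku": "B", "amt": 5}]
dim_sku = {"A": {"cat": "book"}, "B": {"cat": "food"}}

wide = widen(facts, dim_sku, "sku")                              # day 1
tall = heighten(wide, widen([{"sku": "A", "amt": 7}], dim_sku, "sku"))  # + day 2
print(len(tall), tall[0]["cat"])  # 3 book
```

In the architecture described above, each new experiment day triggers one heightening step, while cumulative nodes read the full heightened table.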

Composite Metric DAG – Describes how multiple experiments sharing metrics are merged into a single logical table, dramatically reducing task count and improving scalability.
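The task-merging idea can be shown by grouping experiments on their metric signature: experiments that request the same metric set share one computation over one logical table. A sketch with hypothetical experiment IDs:

```python
from collections import defaultdict

experiments = {
    "exp_101": ["gmv", "orders"],
    "exp_102": ["gmv", "orders"],   # same metric set as exp_101
    "exp_103": ["ctr"],
}

# group experiments by metric signature; each group shares one task
tasks = defaultdict(list)
for exp, metrics in experiments.items():
    tasks[tuple(sorted(metrics))].append(exp)

print(len(tasks))  # 2 merged tasks instead of 3 per-experiment tasks
for sig, exps in tasks.items():
    print(sig, "->", exps)
```

With thousands of daily experiments sharing a small vocabulary of standard metrics, this grouping is what collapses the task count from per-experiment to per-metric-set.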

Current Progress and Future Outlook

60% of metrics are now auto‑computed with second‑level latency.

Future plans focus on expanding metric coverage, improving timeliness and performance at scale, enhancing experiment flexibility, and simplifying troubleshooting.

Q&A – Addresses how data governance and task merging are achieved through semantic parsing and shared logical tables.

Tags: Data Engineering · AB testing · Big Data · automation · data-platform
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
