
Full‑Link Consistency Testing for Click‑Through Rate Models in Large‑Scale Machine Learning

The article describes a comprehensive full-link consistency testing framework for click-through-rate models. It defines consistency issues, outlines data and logic consistency goals, and presents a multi-stage technical solution (online data capture, offline data stitching, q-value comparison, and reporting) to ensure model stability and performance.

Baidu Intelligent Testing

This document presents a detailed approach to full‑link consistency testing for click‑through‑rate (CTR) models used in advertising retrieval, emphasizing consistency as the foundation for model stability.

Background and Overview: CTR models estimate the probability that an ad will be clicked, which influences ranking and truncation. The offline pipeline covers feature extraction, model training, evaluation, and online deployment; inconsistencies can arise in data, latency, policies, and performance, causing metric fluctuations.

Definition and Goals of Consistency: Consistency issues are defined as mismatches in data (sample and model) and processing logic between offline training and online inference. Goals include detecting inconsistencies, locating their sources, and assessing their impact on system performance.

Technical Solution – Full-Link Consistency Scheme: The solution breaks the CTR prediction pipeline into multiple stages (parameter parsing, feature extraction, embedding lookup, DNN computation, etc.) and replaces each stage with a controlled offline counterpart, comparing the resulting q-values (predicted click probabilities). Five parallel flows (q1–q5) are generated so that discrepancies in feature extraction, model conversion, and DNN computation can be isolated.
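The stage-replacement idea can be sketched as follows. This is an illustrative toy, not Baidu's actual code: the stage functions, the two-feature "model", and which stage each flow swaps are all assumptions made for demonstration; the point is that adjacent flows differ in exactly one stage.

```python
import math

def online_features(request):
    # Features as logged by the online service.
    return [request["x"] * 2.0, request["x"] + 1.0]

def offline_features(request):
    # Offline re-extraction; should reproduce the online features exactly.
    return [request["x"] * 2.0, request["x"] + 1.0]

def dnn_forward(features, weights):
    # Tiny stand-in for the DNN: weighted sum squashed to (0, 1).
    s = sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-s))

ONLINE_WEIGHTS = [0.3, -0.1]       # weights served online
CONVERTED_WEIGHTS = [0.3, -0.1]    # weights after offline model-table conversion

def run_flows(request):
    """Build parallel q-value flows; each swaps one stage for its offline twin."""
    return {
        "q1": dnn_forward(online_features(request), ONLINE_WEIGHTS),
        # q2 swaps in offline feature extraction:
        "q2": dnn_forward(offline_features(request), ONLINE_WEIGHTS),
        # q3 additionally swaps in the converted model table:
        "q3": dnn_forward(offline_features(request), CONVERTED_WEIGHTS),
    }

flows = run_flows({"x": 0.5})
# If q1 == q2, offline feature extraction is consistent with online.
assert abs(flows["q1"] - flows["q2"]) < 1e-9
```

Because any two adjacent flows differ in a single stage, a q-value gap between them attributes the inconsistency to that stage alone.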

Verification Flow: By comparing q-values across flows (e.g., q1 vs q2, q3 vs q4), the framework pinpoints whether inconsistencies stem from offline feature extraction, model table conversion, or DNN calculation.
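The pairwise comparison amounts to a small diagnosis table. The mapping of flow pairs to stages below is an assumed illustration of the idea, not the article's exact assignment, and the tolerance is arbitrary:

```python
TOL = 1e-6  # maximum q-value gap still considered "consistent" (assumed)

def diagnose(q):
    """Given a dict of q-values q1..q5, return the stages whose
    isolating flow pair disagrees beyond the tolerance."""
    pairs = {
        ("q1", "q2"): "offline feature extraction",
        ("q2", "q3"): "model table conversion",
        ("q3", "q4"): "DNN calculation",
        ("q4", "q5"): "parameter parsing",
    }
    return [stage for (a, b), stage in pairs.items() if abs(q[a] - q[b]) > TOL]

print(diagnose({"q1": 0.41, "q2": 0.41, "q3": 0.47, "q4": 0.47, "q5": 0.47}))
# -> ['model table conversion']
```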

Online Data Acquisition and Debug Logging: Online logs are captured with debug flags, parsed by thread, and formatted to align with offline samples. The logs contain multi-threaded entries, complete sample information, and may span multiple files.
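Because log lines from different request threads interleave, parsing groups lines by thread id before reassembling a sample. A minimal sketch, assuming a hypothetical `tid|key=value` line format and that `q` is the last field a thread logs per sample:

```python
from collections import defaultdict

def parse_debug_log(lines):
    """Group interleaved debug-log lines by thread id into per-sample dicts."""
    in_progress = defaultdict(dict)   # thread id -> fields accumulated so far
    finished = []
    for line in lines:
        tid, _, payload = line.partition("|")
        key, _, value = payload.partition("=")
        in_progress[tid][key] = value
        if key == "q":                # assumed sentinel: sample is complete
            finished.append(in_progress.pop(tid))
    return finished

log = [
    "t1|pk=1001", "t2|pk=1002",       # two threads interleaved
    "t1|feat=a:1,b:2", "t2|feat=a:3",
    "t1|q=0.41", "t2|q=0.37",
]
print(parse_debug_log(log))
```

The same grouping works across multiple log files as long as lines are fed in order per thread.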

Offline Data Stitching: Offline samples are matched to online logs using a primary key that appears in both datasets, ensuring a one-to-one correspondence for accurate q-value comparison.
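The stitching step is essentially an inner join on the primary key. A minimal sketch, with illustrative field names (`pk`, `q`) not taken from the article:

```python
def stitch(online_records, offline_samples):
    """Join online and offline records on the primary key.
    Returns (pk, q_online, q_offline) triples for keys present in both."""
    offline_by_pk = {s["pk"]: s for s in offline_samples}
    stitched = []
    for rec in online_records:
        sample = offline_by_pk.get(rec["pk"])
        if sample is not None:        # drop records without an offline match
            stitched.append((rec["pk"], rec["q"], sample["q"]))
    return stitched

online = [{"pk": 1001, "q": 0.41}, {"pk": 1002, "q": 0.37}]
offline = [{"pk": 1001, "q": 0.41}, {"pk": 1003, "q": 0.55}]
print(stitch(online, offline))        # only pk 1001 appears on both sides
```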

Q-Value Computation and Reporting: The framework produces statistical reports (distribution, diff ranges) and detailed per-sample diff information, including feature signatures and primary keys for manual investigation.
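A report of this shape can be sketched by bucketing per-sample absolute q-value diffs into ranges and keeping the worst offenders with their primary keys. The bucket boundaries below are an assumption, not the article's:

```python
def diff_report(pairs, buckets=(0.0, 1e-6, 1e-4, 1e-2, 1.0)):
    """pairs: iterable of (pk, q_online, q_offline).
    Returns a histogram over diff ranges plus the largest per-sample diffs."""
    counts = [0] * (len(buckets) - 1)
    diffs = []
    for pk, q_on, q_off in pairs:
        d = abs(q_on - q_off)
        for i in range(len(buckets) - 1):
            if buckets[i] <= d < buckets[i + 1]:
                counts[i] += 1
                break
        diffs.append((d, pk))
    diffs.sort(reverse=True)          # largest diffs first, for manual triage
    return {"histogram": counts, "top_diffs": diffs[:3]}

pairs = [(1001, 0.41, 0.41), (1002, 0.37, 0.372), (1003, 0.55, 0.61)]
print(diff_report(pairs))
```

The `top_diffs` list carries the primary keys, so an engineer can pull the full sample (feature signatures included) for the worst cases.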

End-to-End Execution: The process is divided into six stages (traffic capture, log formatting, log stitching, online parsing, q-value replacement/computation, and report generation), allowing selective execution based on troubleshooting needs.
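Selective execution can be sketched as an ordered stage list where a troubleshooting run starts or stops at any stage, provided earlier artifacts already exist. The stage names mirror the article; the bodies are placeholders:

```python
STAGES = [
    "traffic_capture", "log_formatting", "log_stitching",
    "online_parsing", "q_replacement", "report_generation",
]

def run_pipeline(start="traffic_capture", stop="report_generation"):
    """Execute the contiguous slice of stages from `start` to `stop`."""
    i, j = STAGES.index(start), STAGES.index(stop)
    executed = []
    for stage in STAGES[i:j + 1]:
        executed.append(stage)        # real stage logic would run here
    return executed

# Re-run only from stitching onward when captured logs are already on disk:
print(run_pipeline(start="log_stitching"))
```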

Task Submission and Monitoring: Users can submit tasks via a platform, providing parameters for data processing and q-value calculation. The system supports incremental execution, email notifications on failures, and visual dashboards for task status.

Results and Future Work: The solution supports various CTR models, improves debugging efficiency, and has already identified inconsistencies such as feature mismatches and misaligned network structures, guiding subsequent fixes and enhancements.

Tags: Data Pipeline · Machine Learning · click-through rate · DNN · model consistency · online-offline testing