Artificial Intelligence

Adversarial Gradient Driven Exploration for Deep Click-Through Rate Prediction

The authors introduce AGE, an adversarial‑gradient‑driven exploration framework that injects uncertainty‑scaled perturbations into ad embeddings to approximate the downstream learning effect of exploration. Combining Monte‑Carlo dropout uncertainty estimation with a dynamic gating unit, AGE achieves up to a 15.3% relative offline gain and a 6.4% online CTR improvement over strong baselines.

Alimama Tech

This work tackles click‑through rate (CTR) prediction in large‑scale advertising and recommendation systems, where cold‑start and data‑loop issues hinder model performance. The authors propose a novel exploration‑exploitation strategy called AGE (Adversarial Gradient driven Exploration) that simulates the downstream impact of explored samples on the model by adding adversarial perturbations to input representations.

Background: In streaming CTR models, the data used for training is generated by the deployed model itself, creating a feedback loop. New or long‑tail ads suffer from insufficient exposure, leading to high uncertainty and poor estimates. Traditional E&E methods rely on uncertainty as a proxy for potential reward (e.g., UCB, Thompson Sampling) but do not model how explored data influences subsequent learning.
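To make the contrast concrete, here is a minimal sketch of the uncertainty-as-reward idea behind UCB-style exploration that the paper argues against: the exploration bonus depends only on how little data an ad has seen, not on what the model would learn from showing it. The function names and the specific bonus formula are illustrative, not from the paper.

```python
import numpy as np

def ucb_score(pred_ctr, impressions, total_impressions, alpha=1.0):
    """Classic UCB-style exploration score: predicted CTR plus an
    uncertainty bonus that shrinks as the ad accumulates impressions.
    The bonus is a pure data-count proxy; it says nothing about how
    training on the explored sample would change the model."""
    bonus = alpha * np.sqrt(np.log(total_impressions + 1.0) / (impressions + 1.0))
    return pred_ctr + bonus

# A long-tail ad with few impressions receives a much larger bonus
# than a mature ad, even if exploring it would teach the model little.
new_ad = ucb_score(pred_ctr=0.01, impressions=5, total_impressions=100_000)
old_ad = ucb_score(pred_ctr=0.02, impressions=50_000, total_impressions=100_000)
```

Thompson Sampling replaces the deterministic bonus with a posterior sample, but shares the same limitation the authors highlight: neither method models the closed-loop effect of the explored data on subsequent learning.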

Method: AGE introduces a Pseudo‑Exploration module that injects a fixed‑norm adversarial gradient—scaled by the model’s predicted uncertainty—into the input embedding, thereby approximating the change in the model’s score after training on the explored sample. A Dynamic Gating Unit (DGU) filters low‑value ads by comparing the predicted CTR with an estimated average CTR, preventing wasteful exploration. The overall architecture combines MC‑Dropout for uncertainty estimation, the fast gradient method (FGM) or projected gradient descent (PGD) for adversarial gradients, and the gating mechanism.
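The pseudo-exploration step can be sketched on a toy logistic scorer: perturb the ad embedding along the gradient direction that increases the click probability, with a fixed-norm FGM-style step scaled by the model's uncertainty, then re-score. This is a simplified illustration of the idea under stated assumptions, not the paper's exact implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pseudo_explore_score(w, e, uncertainty, eps=0.1):
    """FGM-style pseudo-exploration sketch (illustrative; the real model
    is a deep CTR network, and model parameters are left untouched).
    w: weights of a toy logistic scorer p = sigmoid(w . e)
    e: ad embedding; uncertainty: scalar from MC-Dropout."""
    p = sigmoid(w @ e)
    # Gradient of log p(click) w.r.t. the embedding for a positive label:
    # d/de [log sigmoid(w . e)] = (1 - p) * w
    g = (1.0 - p) * w
    # Fixed-norm adversarial step, scaled by the predicted uncertainty,
    # approximating the score shift after training on the explored sample.
    delta = eps * uncertainty * g / (np.linalg.norm(g) + 1e-12)
    return sigmoid(w @ (e + delta))
```

High-uncertainty ads receive a larger perturbation and thus a larger simulated score lift, which is what makes them more likely to be explored; when uncertainty is zero the perturbed score equals the original prediction.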

Implementation details: Uncertainty is obtained via Monte‑Carlo Dropout; adversarial gradients are computed without altering the original model parameters; the DGU uses only ad‑side features to predict average CTR and decides whether to explore.
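The two implementation pieces above can be sketched as follows: MC-Dropout repeats the forward pass with random dropout masks and uses the spread of the predictions as uncertainty, while the gating unit compares the predicted CTR against an average-CTR estimate. Both functions are toy illustrations with assumed names and an assumed threshold rule consistent with the description, not the production code.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_uncertainty(w, e, p_drop=0.5, T=50):
    """MC-Dropout sketch on a toy logistic scorer: keep dropout active at
    inference, run T stochastic forward passes over the embedding, and
    report (mean prediction, std as the uncertainty estimate)."""
    preds = []
    for _ in range(T):
        # Inverted dropout mask on the embedding, rescaled to keep the
        # expected activation unchanged.
        mask = (rng.random(e.shape) > p_drop) / (1.0 - p_drop)
        preds.append(1.0 / (1.0 + np.exp(-(w @ (e * mask)))))
    return float(np.mean(preds)), float(np.std(preds))

def dgu_gate(pred_ctr, avg_ctr_estimate):
    """Dynamic Gating Unit sketch: explore only when the predicted CTR is
    at least the estimated average CTR from ad-side features (a
    hypothesized decision rule matching the description above)."""
    return pred_ctr >= avg_ctr_estimate
```

An ad whose predicted CTR falls below the estimated average is filtered out before any exploration budget is spent on it.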

Experiments: Offline benchmarks on public datasets show AGE outperforms random, gradient‑based, UCB, and Thompson Sampling baselines, achieving up to a 15.3% relative gain over the best baseline and a 28% lift over a non‑exploration model. Ablation studies confirm the necessity of the pseudo‑exploration, adversarial‑gradient, and gating components. Online A/B tests on Alibaba’s display‑ad platform demonstrate significant improvements: +6.4% CTR, +3.0% page views (PV), a calibration ratio (PCOC) closer to 1, and a 5.5% increase in advertiser satisfaction (AFR).

Conclusion: By modeling the closed‑loop effect of exploration, AGE provides a more effective E&E solution for CTR prediction, validated both offline and in production, and is slated for broader deployment.

Deep Learning · CTR prediction · recommendation systems · online learning · adversarial gradient · exploration-exploitation
Written by

Alimama Tech

Official Alimama tech channel, showcasing all of Alimama's technical innovations.
