
Comprehensive Guide to pCTR Modeling, Optimization, and Online Learning in Real‑Time Advertising Systems

This article is a three‑part technical guide. It covers the fundamentals of computational advertising and real‑time bidding; offline pCTR training pipelines with feature engineering, calibration, and model‑structure improvements; and advanced online‑learning techniques such as parameter freezing, sample replay, and knowledge distillation, all aimed at boosting CTR performance and reducing bias in large‑scale ad platforms.

IEG Growth Platform Technology Team

Author: coreyzhong, Tencent IEG Growth Platform Application Researcher

This document is divided into three parts: Part 1 introduces the overall advertising system and why pCTR modeling matters; Part 2 describes the offline pCTR training pipeline and various model improvements; Part 3 discusses online learning and further optimization for production systems.

Part 1

Computational Advertising

Internet advertising is a key monetization and promotion channel. The industry has evolved from contract ads to precise targeting and finally to automated real‑time bidding, which is now the dominant delivery mode.

In the ad ecosystem, advertisers, media, SSP (supply‑side platform) and DSP (demand‑side platform) interact. A mature ad platform must support the full stack shown in the diagram below.

The core of the platform is the DSP’s ad‑bidding system, where the pCTR model plays a crucial role.

Ad Bidding System

Every ad request must be processed within roughly 100 ms and passes through four stages: recall, coarse‑ranking, fine‑ranking, and bidding.

Recall: Retrieve a few hundred candidate ads from a library that may contain millions of ads; full‑library scoring would be too costly.

Because computing pCTR/pCVR for every ad is prohibitive, the system first recalls a manageable subset.

Coarse‑ranking: Further reduces candidate size with a model of intermediate complexity, balancing speed and accuracy.

Fine‑ranking: Computes exact pCTR and pCVR for the remaining candidates, calculates ECPM, and sorts.

Bidding: Adjusts ECPM with business‑level strategies (budget smoothing, ROI optimization, etc.) before returning the winning ad.

ECPM and Fine‑ranking

ECPM (Effective Cost Per Mille) measures the monetary value of an impression. The basic formula is:

ECPM = 1000 × pCTR × pCVR × price

price is the advertiser‑set bid, while pCTR and pCVR are predicted by the respective models. Business‑specific modifiers (e.g., ROI, LTV) can be incorporated.

During fine‑ranking, the system computes pCTR and pCVR for ≤ 300 recalled ads, calculates ECPM, and selects the highest‑value ad for display.
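As a minimal sketch of this step (function and field names are illustrative, not from the article's codebase), fine‑ranking reduces to scoring each recalled ad with ECPM = 1000 × pCTR × pCVR × price and sorting:

```python
def ecpm(pctr: float, pcvr: float, price: float) -> float:
    """Effective cost per mille: expected value of 1000 impressions."""
    return 1000.0 * pctr * pcvr * price

def fine_rank(candidates):
    """Score each recalled ad and return (ad, ecpm) pairs, highest value first."""
    scored = [(ad, ecpm(ad["pctr"], ad["pcvr"], ad["price"])) for ad in candidates]
    return sorted(scored, key=lambda x: x[1], reverse=True)

ads = [
    {"id": "a", "pctr": 0.02, "pcvr": 0.10, "price": 5.0},
    {"id": "b", "pctr": 0.05, "pcvr": 0.03, "price": 4.0},
]
ranked = fine_rank(ads)  # ad "a" wins: 1000 * 0.02 * 0.10 * 5.0 = 10.0 ECPM
```

In production the business‑level modifiers from the bidding stage would adjust these scores before the winner is returned.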

Part 2

The pCTR model estimates the pCTR term in the ECPM formula to enable accurate ranking.

Model Training Pipeline

The orange part is the online serving flow; the blue part is the offline model update flow.

Online Prediction Service

Ad SDK sends a request to the SSP.

SSP filters the request (resource slot, anti‑fraud) and forwards it to the DSP.

DSP fetches features from Redis and calls TensorFlow‑Serving to obtain model predictions.

DSP computes final ECPM and returns it via SSP to the SDK for display.

Offline Model Update

Feature calculation using Spark/Flink, stored in Redis and TDW.

Sample stitching: request‑level features are joined with click logs to create training samples stored in HDFS.

Model training: a daily job reads the stitched data, trains a new model, and saves it with a timestamp version.

Model deployment: Docker + tf.Serving watches a shared CFS path; copying a new model triggers automatic loading of the latest version.

Model Optimizations

Feature Embedding

Features are categorized as continuous, discrete (ID), or sequence. They are embedded, multiplied by their value/weight, and then concatenated or fed into deeper networks.

Embedding + value/weight multiplication preserves feature importance.
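A minimal NumPy sketch of this encoding, with illustrative table sizes and feature names (not the production schema): each feature's embedding is scaled by its value/weight before concatenation, so the magnitude of a continuous feature survives the lookup.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM = 4
game_emb = rng.normal(size=(100, EMB_DIM))  # discrete (ID) feature: per-ID vectors
pay_emb = rng.normal(size=(1, EMB_DIM))     # continuous feature: one shared vector

def encode(game_id: int, pay_amount: float) -> np.ndarray:
    """Embed each feature, multiply by its value/weight, then concatenate."""
    v_game = game_emb[game_id] * 1.0   # ID feature: weight 1
    v_pay = pay_emb[0] * pay_amount    # continuous feature: scaled by its value
    return np.concatenate([v_game, v_pay])

x = encode(game_id=7, pay_amount=2.5)  # an 8-dim input for the deeper network
```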

Feature Pre‑processing

Highly skewed features (e.g., 90‑day game payment) can be handled with normalization, standardization, or truncation, each of which has drawbacks. Instead, a custom transformation is used that compresses extreme values while preserving discriminative power.
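The article's exact transform is not reproduced here; as a hedged stand‑in, a log‑style compression illustrates the intended effect on a heavy‑tailed feature:

```python
import math

def log_compress(x: float) -> float:
    """Compress heavy-tailed values (e.g., 90-day payment) while keeping order.
    Stand-in only: the article's actual custom transform is not shown."""
    return math.log1p(x)

# Extreme values are pulled in sharply; ordering and small differences survive.
values = [0.0, 10.0, 100.0, 100000.0]
compressed = [log_compress(v) for v in values]
```

Truncation would destroy the ordering among extreme payers; plain standardization would leave the tail dominating the scale. A monotone compression avoids both problems.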

Feature Crossing

Manual crossing: Hand‑crafted cross features (e.g., user‑game click rate) reduce model learning difficulty.

Adding >30 user‑cross features and >80 user‑attribute cross features increased AUC from 0.8032 to 0.8105 and CTR by 5.6%.

Model‑level crossing: DeepFM → AutoInt → DCN variants were iteratively adopted, each yielding incremental AUC gains.

Feature Redundancy & Collinearity

Highly correlated features (e.g., 24 h vs. 72 h click counts) cause over‑fitting and unnecessary parameters. The team removes collinear features after extensive offline experiments, balancing AUC stability and model capacity.
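A simple sketch of this kind of filter (the 0.95 threshold and the greedy keep‑first strategy are illustrative; the team's actual selection relied on extensive offline experiments):

```python
import numpy as np

def drop_collinear(features: dict, threshold: float = 0.95) -> list:
    """Return feature names to keep, greedily dropping any feature whose
    absolute Pearson correlation with an already-kept feature exceeds
    the threshold."""
    keep = []
    for name in features:
        col = np.asarray(features[name], dtype=float)
        if all(abs(np.corrcoef(col, np.asarray(features[k], float))[0, 1]) < threshold
               for k in keep):
            keep.append(name)
    return keep

data = {
    "clicks_24h": [1, 2, 3, 4, 5],
    "clicks_72h": [3, 6, 9, 12, 15],   # perfectly correlated with clicks_24h
    "age":        [18, 35, 22, 41, 29],
}
kept = drop_collinear(data)  # clicks_72h is dropped as redundant
```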

Sequence Modeling

Six sequence features (e.g., user‑game click sequence) are embedded, max‑pooled, and passed through DIN‑Attention before concatenation.

DIN‑Attention yields a stable AUC lift of 0.0012.
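A NumPy sketch of the attention‑pooling step: each behavior embedding is scored against the candidate‑ad embedding, and the sequence is pooled by the resulting weights. Real DIN scores each (query, key) pair with a small MLP over [q, k, q − k, q ⊙ k]; a dot product stands in here for brevity.

```python
import numpy as np

def din_attention(query: np.ndarray, keys: np.ndarray) -> np.ndarray:
    """DIN-style attention sketch: weight each behavior embedding by its
    relevance to the candidate ad, then return the weighted sum."""
    scores = keys @ query                 # (seq_len,) relevance scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()              # softmax over the sequence
    return weights @ keys                 # (emb_dim,) pooled vector

rng = np.random.default_rng(0)
candidate = rng.normal(size=4)            # candidate-ad embedding
behaviors = rng.normal(size=(6, 4))       # user's clicked-game sequence
pooled = din_attention(candidate, behaviors)
```

The pooled vector is then concatenated with the other feature embeddings before the deeper layers.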

Multi‑Task Modeling

Auxiliary tasks (e.g., video play completion) are added via an MMoE layer to improve the primary CTR objective.
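A NumPy sketch of an MMoE layer under illustrative sizes: a set of shared experts plus one softmax gate per task, so each task (e.g., CTR, video completion) mixes the experts differently.

```python
import numpy as np

rng = np.random.default_rng(0)
IN, HID, N_EXPERTS, N_TASKS = 8, 4, 3, 2   # sizes are illustrative

W_experts = rng.normal(size=(N_EXPERTS, IN, HID))   # one layer per expert
W_gates = rng.normal(size=(N_TASKS, IN, N_EXPERTS)) # one gate per task

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mmoe(x: np.ndarray):
    """Shared experts, task-specific softmax gates mixing expert outputs."""
    expert_out = np.stack([np.tanh(x @ W) for W in W_experts])  # (E, HID)
    outputs = []
    for g in W_gates:
        gate = softmax(x @ g)              # (E,) mixture weights for this task
        outputs.append(gate @ expert_out)  # (HID,) task-specific hidden vector
    return outputs

towers = mmoe(rng.normal(size=IN))  # one hidden vector per task tower
```

Each output feeds its own task tower; only the CTR tower's prediction is used for serving, while the auxiliary tower regularizes the shared experts.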

Material Features

Embedding recent ad material IDs using a Word2Vec‑style co‑occurrence model adds ~0.0003 AUC.

Video‑level tags/embeddings further improve AUC by 0.0015 (details omitted).

Final Model Architecture

The final production diagram combines Seq‑Tower, MMoE‑Tower, Wide components, and the calibrated pCTR model.

Model Calibration

Why Calibrate?

Bias (systematic over‑ or under‑prediction) harms both revenue and CTR. Accurate ECPM calculation requires unbiased pCTR estimates.

Bias sources include sample sampling, cold‑start embeddings, data distribution shift, and local minima.

Scaling Calibration Formula

The simple scaling formula from Facebook's "Practical Lessons from Predicting Clicks on Ads at Facebook" corrects for negative downsampling:

q = p / (p + (1 − p) / w)

where p is the model's raw prediction and w is the negative‑class downsampling rate. The mapping is unbiased, bijective, and continuous, offering advantages over isotonic regression and Platt scaling.
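A minimal implementation of this scaling correction, where w is the fraction of negatives kept during downsampling:

```python
def calibrate(p: float, w: float) -> float:
    """Undo negative downsampling via the Facebook scaling formula
    q = p / (p + (1 - p) / w), where w is the negative sampling rate."""
    return p / (p + (1.0 - p) / w)

# If only 10% of negatives were kept, a raw prediction of 0.5 maps back
# down to roughly 0.09 -- the unbiased estimate on full traffic.
q = calibrate(0.5, w=0.1)
```

Note the endpoints are fixed (0 maps to 0, 1 maps to 1) and the mapping is strictly increasing, which is what makes it bijective and continuous.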

Improved Calibration Methods

Field‑aware Calibration (TEG) augments isotonic regression with a DNN to handle bucket‑wise bias while preserving continuity.

Part 3

Online Learning

Online learning updates the model in minutes or seconds using real‑time samples, addressing cold‑start and distribution drift.

Overall Mechanism

The loop consists of DSP prediction, real‑time feature/sample joining, online model training, and model deployment/roll‑back.

Challenges

Real‑time samples cannot be globally shuffled, leading to distribution gaps (e.g., morning vs. afternoon traffic) and catastrophic forgetting when naively fine‑tuning.

Parameter Freezing

Only embedding layers are updated online; FC and Cross layers are frozen. This prevents forgetting and yields a modest AUC gain (~0.0007) and ~2.8% CTR lift in production.
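A toy sketch of the idea (layer names and learning rate are illustrative): gradients are applied only to layers whitelisted for online updates, so the offline‑trained FC/Cross weights cannot drift.

```python
import numpy as np

params = {
    "embedding": np.ones((5, 4)),   # updated online
    "fc":        np.ones((4, 1)),   # frozen after offline training
}
TRAINABLE_ONLINE = {"embedding"}

def sgd_step(grads: dict, lr: float = 0.1) -> None:
    """Apply gradients only to layers allowed to train online; frozen
    layers keep their offline weights, preventing catastrophic forgetting."""
    for name, g in grads.items():
        if name in TRAINABLE_ONLINE:
            params[name] -= lr * g

grads = {"embedding": np.ones((5, 4)), "fc": np.ones((4, 1))}
sgd_step(grads)  # embedding moves; fc is untouched
```

In a TensorFlow setup the same effect comes from marking the frozen variables non‑trainable (or filtering them out of the optimizer's variable list) in the online job.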

Sample Replay

To mitigate intra‑day performance drops, a portion of yesterday’s samples (later‑hour data) is mixed with current real‑time data, stabilizing AUC and improving CTR by 3‑6%.
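A sketch of the mixing step; the 25% replay ratio and the pool contents here are illustrative, not the article's production values:

```python
import random

def replay_mix(realtime_batch, replay_pool, replay_ratio=0.25, seed=0):
    """Mix a fraction of yesterday's later-hour samples into each real-time
    batch, so early-morning training still sees afternoon/evening patterns."""
    rng = random.Random(seed)
    n_replay = int(len(realtime_batch) * replay_ratio)
    mixed = realtime_batch + rng.sample(replay_pool, n_replay)
    rng.shuffle(mixed)
    return mixed

today = [("rt", i) for i in range(8)]
yesterday_evening = [("replay", i) for i in range(100)]
batch = replay_mix(today, yesterday_evening)  # 8 real-time + 2 replayed samples
```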

Distillation Learning

A teacher‑student setup uses the offline model as teacher and the online model as student, adding a cross‑entropy loss (weight 0.2). This consistently raises AUC by ~0.0025 and further boosts CTR.
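A sketch of the combined loss using the article's 0.2 distillation weight; the binary cross‑entropy helper is standard, and the probabilities below are illustrative:

```python
import math

def bce(target: float, pred: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy; target may be a soft label in [0, 1]."""
    pred = min(max(pred, eps), 1 - eps)
    return -(target * math.log(pred) + (1 - target) * math.log(1 - pred))

def distill_loss(label: float, student_p: float, teacher_p: float,
                 alpha: float = 0.2) -> float:
    """Online-student loss: hard-label cross-entropy plus an alpha-weighted
    cross-entropy against the offline teacher's soft prediction."""
    return bce(label, student_p) + alpha * bce(teacher_p, student_p)

loss = distill_loss(label=1.0, student_p=0.6, teacher_p=0.7)
```

The teacher term pulls the online model toward the stable offline distribution, which is another guard against forgetting on non‑shuffled streaming data.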

Soul‑Searching Questions

AUC vs. CTR

High AUC does not always translate into higher CTR. Early AUC gains often come from user‑side features ("base" AUC) that improve request‑level ranking but not the ranking of ads within a single request. Once that baseline is reached, further AUC improvements from cross features have a strong positive impact on CTR.

Bias vs. CTR

Bias affects the quality of won impressions. Over‑estimation leads to buying low‑quality traffic (higher volume, lower CTR); under‑estimation discards marginally profitable impressions (lower volume, higher CTR). The ideal state is zero bias.

AB Testing Pitfalls

Device‑level split requires AA testing to avoid inherent group differences, while request‑level split eliminates such bias but may not suit long‑term user studies. Simpson’s paradox can mislead conclusions if one group disproportionately serves higher‑quality items.

Overall, the article provides a thorough roadmap from foundational ad‑system concepts to sophisticated model training, calibration, and online learning techniques that together drive reliable CTR prediction and revenue optimization.
