Practical Exploration of OCPC Advertising Algorithm at Phoenix New Media
This article presents a comprehensive overview of the OCPC (Optimized Cost Per Click) advertising algorithm deployed by Phoenix New Media, detailing its background, problem definition, two‑price mechanism, smart bidding, CVR estimation techniques, online learning architecture, challenges such as data sparsity and conversion delay, and future research directions.
Guest Speaker: Phoenix New Media Advertising Algorithm Team
Editor: Chen Daochang
Source: 2019 DataFun Live 01
Community: DataFun
Note: Feel free to repost with a comment in the discussion area.
Introduction: Advertising algorithms aim to improve platform revenue by optimizing click‑through rates and bidding mechanisms, but they must also protect the quality and quantity of advertiser conversions to keep the ecosystem stable. Building on CPC, the team explored the OCPC model and applied it successfully in production.
Agenda:
Background Introduction
CVR Estimation
Two‑Price Mechanism and Smart Bidding
OCPC Algorithm
Technical Architecture
Background – Phoenix New Media Platform (Fengyu): Fengyu is a high‑quality programmatic advertising platform of Phoenix TV, aggregating traffic from Phoenix.com, mobile sites, news and video apps, delivering about 2 billion premium impressions daily.
What is OCPC? OCPC is a recent performance‑advertising model that still charges per click (CPC) but optimizes for a target cost per acquisition (CPA). It consists of a cold‑start phase and a data‑accumulation phase, allowing advertisers to focus on ROI while maintaining volume.
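As a rough illustration of "charge per click, optimize for CPA": a common way to derive a click bid from a CPA target is to multiply the target by the predicted conversion rate, so that the expected cost per conversion matches the target. The talk summary does not give Fengyu's exact formula, so the function names and the numbers below are assumptions for illustration only.

```python
def ocpc_click_bid(target_cpa: float, pcvr: float) -> float:
    """Click bid whose expected cost per conversion equals target_cpa.

    Assumption: pcvr is the predicted probability that a click converts.
    """
    if not 0.0 <= pcvr <= 1.0:
        raise ValueError("pcvr must be a probability in [0, 1]")
    return target_cpa * pcvr


def ecpm(pctr: float, click_bid: float) -> float:
    """Expected revenue per 1000 impressions for an ad charged per click."""
    return pctr * click_bid * 1000.0


# Hypothetical numbers: the advertiser accepts 50 per conversion,
# and the model predicts that 2% of clicks convert.
bid = ocpc_click_bid(target_cpa=50.0, pcvr=0.02)  # about 1.0 per click
rank_score = ecpm(pctr=0.01, click_bid=bid)       # about 10.0 per mille
```

Under this sketch, a higher predicted CVR raises both the click bid and the eCPM ranking score, which is why CVR accuracy matters so much in the rest of the article.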
Problem Definition: The goal is to increase conversion rate while reducing cost, thereby raising eCPM, and to keep cost and consumption (advertiser spend) stable under the bidding mechanism.
Three optimization levers are proposed:
Two‑price mechanism (second‑price auction) – the winning ad pays the second‑highest price rather than its own bid.
Smart bidding – bid more aggressively on traffic that is likely to meet the advertiser's conversion target.
CVR (conversion‑rate) estimation – address extreme data sparsity and improve accuracy.
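The first lever can be sketched as a second-price auction. The version below ranks ads by pCTR times click bid and charges the winner the lowest click price that would still keep it in first place. Ranking by pCTR-weighted bid, the `Bid` type, and the `reserve` floor are assumptions for illustration; the summary only states that the winner pays the second-highest price.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Bid(NamedTuple):
    ad_id: str
    click_bid: float  # maximum price per click the advertiser accepts
    pctr: float       # predicted click-through rate (assumed model output)


def run_second_price(
    bids: Sequence[Bid], reserve: float = 0.01
) -> Optional[Tuple[str, float]]:
    """Return (winner_id, price_per_click), or None if nobody is eligible."""
    eligible = [b for b in bids if b.pctr > 0 and b.click_bid >= reserve]
    if not eligible:
        return None
    # Rank by expected revenue per impression.
    ranked = sorted(eligible, key=lambda b: b.pctr * b.click_bid, reverse=True)
    winner = ranked[0]
    if len(ranked) == 1:
        price = reserve
    else:
        runner_up = ranked[1]
        # Lowest click price at which the winner still matches the runner-up.
        price = runner_up.pctr * runner_up.click_bid / winner.pctr
    # Never charge below the reserve or above the winner's own bid.
    price = min(max(price, reserve), winner.click_bid)
    return winner.ad_id, price


example_bids = [
    Bid("A", click_bid=2.0, pctr=0.010),
    Bid("B", click_bid=1.5, pctr=0.012),
    Bid("C", click_bid=1.0, pctr=0.008),
]
result = run_second_price(example_bids)
```

In this example ad A wins and is charged roughly 1.8 per click rather than its own bid of 2.0, because that is the price at which it would tie with ad B.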
CVR Estimation: A Bayesian smoothing algorithm based on feature sub‑spaces generated by GBDT trees is employed. The workflow includes:
Train GBDT to obtain multiple feature sub‑spaces (trees).
For each ad, compute CVR in each sub‑space using a beta distribution and assess confidence.
During online learning, fall back to CPC bidding when confidence is low.
Split and probe sub‑spaces during online learning.
Apply Bayesian smoothing across similar ads to improve generalization.
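The Beta-distribution step in the workflow above can be sketched as follows. A Beta prior is built from the pooled statistics of similar ads, then updated with one ad's own clicks and conversions inside a sub-space; the posterior mean is the smoothed CVR and the posterior standard deviation is one possible confidence signal. The pooling rule and the `prior_strength` pseudo-click count are assumptions, not the team's documented choices.

```python
import math
from typing import Sequence, Tuple


def pooled_beta_prior(
    similar_ads: Sequence[Tuple[int, int]], prior_strength: float = 200.0
) -> Tuple[float, float]:
    """Build Beta(alpha, beta) from (clicks, conversions) of similar ads.

    prior_strength is a hypothetical tuning knob: how many pseudo-clicks
    the prior is worth relative to the ad's own data.
    """
    total_clicks = sum(clicks for clicks, _ in similar_ads)
    total_conversions = sum(conv for _, conv in similar_ads)
    if total_clicks == 0:
        raise ValueError("need at least one click to build a prior")
    mean = total_conversions / total_clicks
    return mean * prior_strength, (1.0 - mean) * prior_strength


def posterior_cvr(
    clicks: int, conversions: int, alpha: float, beta: float
) -> Tuple[float, float]:
    """Posterior mean and standard deviation of CVR after observing the ad."""
    a = alpha + conversions
    b = beta + (clicks - conversions)
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, math.sqrt(variance)


# Similar ads convert at about 2% overall; the new ad has only 50 clicks.
alpha, beta = pooled_beta_prior([(1000, 20), (5000, 90), (4000, 90)])
smoothed, spread = posterior_cvr(clicks=50, conversions=3, alpha=alpha, beta=beta)
```

With these numbers the raw rate of 3/50 = 6% is pulled toward the pooled 2%, giving a smoothed estimate of about 2.8%. In a per-sub-space setup the same update would be keyed by ad and sub-space.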
Key challenges include:
Very sparse conversion samples (often < 100 conversions) make reliable CVR estimation difficult.
Mismatch between click‑based and exposure‑based conversion rates.
Sample delay: different conversion actions have varying latency, requiring hour‑level calibration and multi‑day smoothing.
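One way to picture the hour-level calibration mentioned for sample delay: if historical data says what fraction of conversions has normally been reported h hours after the click, recent cohorts can be scaled up by that fraction before CVR is computed, and several days can then be pooled. The arrival curve, the `min_fraction` floor, and all numbers below are hypothetical; the summary does not describe the actual calibration procedure.

```python
from typing import Sequence, Tuple

# Hypothetical cumulative arrival curve: (hours since click, fraction of
# conversions reported by then). It must be sorted by hours.
ARRIVAL_CURVE = [(0, 0.2), (1, 0.4), (6, 0.7), (24, 0.9), (72, 1.0)]


def arrival_fraction(hours_since_click: float,
                     curve: Sequence[Tuple[float, float]]) -> float:
    """Step-function lookup of the expected reported fraction."""
    fraction = 0.0
    for hours, cumulative in curve:
        if hours_since_click >= hours:
            fraction = cumulative
        else:
            break
    return fraction


def delay_corrected_conversions(cohorts: Sequence[Tuple[float, float]],
                                curve: Sequence[Tuple[float, float]],
                                min_fraction: float = 0.1) -> float:
    """Scale each hourly cohort's observed conversions to its expected final count.

    cohorts holds (hours_since_click, observed_conversions) pairs.
    min_fraction caps the scaling factor so very fresh cohorts do not explode.
    """
    total = 0.0
    for hours, observed in cohorts:
        fraction = max(arrival_fraction(hours, curve), min_fraction)
        total += observed / fraction
    return total


def pooled_multi_day_cvr(days: Sequence[Tuple[int, float]]) -> float:
    """Pool (clicks, corrected_conversions) over several days into one CVR."""
    clicks = sum(c for c, _ in days)
    conversions = sum(v for _, v in days)
    return conversions / clicks if clicks else 0.0


corrected = delay_corrected_conversions(
    [(0.5, 2), (8, 7), (80, 10)], ARRIVAL_CURVE
)
multi_day = pooled_multi_day_cvr([(1000, 18.0), (1200, 30.0)])
```

Here each of the three cohorts is scaled to roughly 10 expected conversions, so 19 observed conversions become about 30 after correction.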
Two‑Price Mechanism & Smart Bidding: Smart bidding builds on the two‑price foundation to allocate volume to high‑CVR sub‑spaces while reducing volatility in cost and consumption, effectively balancing platform and advertiser interests.
Algorithm Details: The system uses online learning with probing and splitting strategies, confidence‑controlled exploration (UCB with Wilson score interval), and periodic offline recalibration. When data is scarce, the model falls back to CPC; as confidence grows, it transitions to OCPC.
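The confidence-controlled switch between CPC and OCPC can be sketched with the Wilson score interval. The interval formula is standard; the decision rule (fall back to CPC when the interval is wide relative to the observed rate), the `max_rel_width` threshold, and the use of the upper bound as an optimistic UCB-style estimate are assumptions made for illustration.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int,
                    z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials == 0:
        return 0.0, 1.0
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials
                         + z * z / (4.0 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def choose_bid(cpc_bid: float, target_cpa: float, clicks: int,
               conversions: int, max_rel_width: float = 1.0,
               explore: bool = False) -> Tuple[str, float]:
    """Return ("CPC", bid) when CVR is uncertain, else ("OCPC", bid)."""
    if clicks == 0 or conversions == 0:
        return "CPC", cpc_bid
    low, high = wilson_interval(conversions, clicks)
    observed = conversions / clicks
    if (high - low) > max_rel_width * observed:
        return "CPC", cpc_bid  # interval too wide: not enough evidence yet
    # Optimistic (upper-bound) estimate when probing, observed rate otherwise.
    cvr = high if explore else observed
    return "OCPC", target_cpa * cvr


sparse = choose_bid(cpc_bid=0.8, target_cpa=50.0, clicks=20, conversions=1)
dense = choose_bid(cpc_bid=0.8, target_cpa=50.0, clicks=20000, conversions=400)
```

With 1 conversion in 20 clicks the interval spans roughly 1% to 24%, so the sketch keeps the manual CPC bid. With 400 conversions in 20,000 clicks the interval is narrow and the bid is derived from the CPA target.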
Practical Results: In a week‑long A/B test, OCPC reduced CPC cost by ~30% and significantly lowered cost volatility compared to pure CPC.
Future Work:
Mitigate data sparsity by leveraging dwell time on landing pages.
Enhance model accuracy and generalization by incorporating GBDT predictions and richer feature sets.
Explore reinforcement‑learning based dynamic pricing for low‑traffic scenarios.