
Deep Learning Practices for Click‑Through‑Rate Prediction and Ranking at 58.com

This article describes how 58.com applied deep‑learning techniques—including feature engineering, sample construction, model evolution from Wide&Deep to DIN/DIEN and multi‑task learning—and system‑level optimizations to improve CTR/CPM performance in its large‑scale commercial ranking platform.

DataFunTalk

Advertising is the dominant revenue source for internet commerce, and since deep learning's AlexNet breakthrough in 2012, neural networks have become the mainstream approach for search, recommendation, and ad ranking.

58.com, China's largest life-services information platform, operates a commercial middle-platform serving real‑estate, recruitment, classifieds, and used‑car services. To balance user experience, advertiser ROI, and platform profit, the company built a deep‑learning‑driven ranking system for ad placement.

The ranking pipeline consists of offline and online components. Offline stages include feature computation, feature‑center synchronization, raw sample stitching, feature engineering, training‑sample generation, model training/evaluation, and model deployment. Online stages perform real‑time feature lookup, feature engineering, model scoring, smoothing calibration, rule‑based re‑ranking, and logging.
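The "smoothing calibration" stage mentioned above is often implemented as Bayesian smoothing of observed click rates, so that low-traffic ads are pulled toward a prior instead of reporting noisy raw CTRs. A minimal sketch, assuming a Beta prior whose parameters (`alpha`, `beta`) are illustrative and not 58.com's actual values:

```python
def beta_smooth_ctr(clicks, impressions, alpha=3.0, beta=100.0):
    """Bayesian-smoothed CTR with a Beta(alpha, beta) prior.

    With zero traffic the estimate falls back to the prior mean
    alpha / (alpha + beta); as impressions grow, the observed
    click rate dominates.
    """
    return (clicks + alpha) / (impressions + alpha + beta)
```

For example, an ad with 0 impressions gets the prior CTR rather than 0/0, while an ad with 50 clicks over 1,000 impressions is nudged only slightly below its raw 5% rate.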

Feature engineering is divided into basic features (user, client, ad, context dimensions), high‑order features (embedding‑based representations inspired by Airbnb and word2vec), and bias features (position bias, freshness, and age). Techniques such as standardization, discretization, bucketization, and non‑linear transforms are applied.
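The numeric transforms named above (standardization, bucketization, non-linear compression) can be sketched in a few lines; the bucket boundaries and the log transform here are illustrative choices, not the article's actual configuration:

```python
import math

def standardize(values):
    """Z-score standardization: (x - mean) / std over a feature column."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard against zero variance
    return [(v - mean) / std for v in values]

def bucketize(value, boundaries):
    """Map a continuous value to a bucket index via sorted boundaries,
    turning it into a categorical feature suitable for embedding."""
    for i, b in enumerate(boundaries):
        if value < b:
            return i
    return len(boundaries)

def log_transform(value):
    """Non-linear transform to compress heavy-tailed features (e.g. price)."""
    return math.log1p(value)
```

Bucketization in particular lets a continuous signal like user age feed an embedding table, the same effect TensorFlow's `bucketized_column` provides in the production pipeline.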

Sample construction emphasizes consistency between offline training and online inference. A Kappa‑style Flink pipeline generates near‑real‑time samples, handling stateful and stateless features, incremental updates, and multi‑source joins, while supporting various sampling strategies (uniform, user‑PV, negative‑sampling, candidate‑sampling, etc.).
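Of the sampling strategies listed, negative downsampling is the one that most directly affects CTR calibration: if only a fraction of negatives is kept, the model's predicted probabilities must be corrected back at serving time. A minimal sketch (function names are mine, not 58.com's):

```python
import random

def negative_downsample(samples, keep_rate, seed=42):
    """Keep every positive; keep each negative with probability keep_rate.

    `samples` is a list of (features, label) pairs with label in {0, 1}.
    """
    rng = random.Random(seed)
    return [(f, y) for f, y in samples
            if y == 1 or rng.random() < keep_rate]

def calibrate(p, keep_rate):
    """Undo negative downsampling in a predicted CTR.

    Standard correction: p / (p + (1 - p) / keep_rate).
    """
    return p / (p + (1 - p) / keep_rate)
```

With `keep_rate=1.0` the calibration is a no-op; with aggressive downsampling, an inflated training-time score of 0.5 maps back to a much smaller serving-time CTR.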

Model evolution progressed from Wide&Deep to DeepFM, DIN, DIEN, and multi‑task ESMM. Wide&Deep combines memorization (the wide side) with generalization (the deep side); DeepFM replaces the wide part with a factorization machine to capture second‑order feature interactions; DIN uses attention to weight user‑history items by their relevance to the candidate ad; DIEN adds a GRU‑based interest‑evolution layer; and ESMM jointly models CTR and CVR over the full impression space to mitigate sample‑selection bias and CVR data sparsity.
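The core of DIN, attention over user history conditioned on the candidate, can be sketched with NumPy. This is a simplified stand-in: the paper scores relevance with a small MLP over the item, candidate, and their interaction, whereas here a plain dot product plus softmax is used for brevity:

```python
import numpy as np

def din_attention(candidate, history):
    """DIN-style attention sketch.

    candidate: (d,) embedding of the candidate ad.
    history:   (T, d) embeddings of the user's historical behaviors.
    Returns a (d,) user-interest vector: history items weighted by
    their relevance to the candidate.
    """
    scores = history @ candidate            # (T,) relevance scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                # softmax over history items
    return weights @ history                # weighted sum of history
```

The effect is that behaviors similar to the candidate dominate the interest vector, instead of every historical click contributing equally as in a plain average pooling.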

System engineering optimizations targeted three layers: application‑level pipeline refactoring (loop‑friendly code, cache consolidation, feature‑column usage), unified feature preprocessing (TensorFlow feature columns, bucketization, embedding handling), and model‑serving improvements (raw serving input, FP16 inference, MKL‑DNN, XLA, and custom compilation). Together these changes yielded roughly an 80% online‑inference speedup and 70% faster training.
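The FP16 inference idea can be illustrated outside TensorFlow with NumPy: store weights and activations in half precision to cut memory traffic, and accumulate the matrix product in float32 to limit error. This is a generic sketch of the pattern, not 58.com's serving code:

```python
import numpy as np

def serve_fp16(weights, activations):
    """Half-precision serving sketch for one dense layer.

    weights:     (in_dim, out_dim) float32 parameters.
    activations: (batch, in_dim) float32 inputs.
    Values are stored as float16 (halving memory bandwidth), then
    upcast so the matmul accumulates in float32.
    """
    w16 = weights.astype(np.float16)
    a16 = activations.astype(np.float16)
    return a16.astype(np.float32) @ w16.astype(np.float32)
```

In practice the float16 round-trip introduces only ~1e-3 relative error per value, which is negligible for ranking scores while roughly halving weight-memory footprint.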

Experimental results show up to +10.8% CPM and +9.9% CTR improvements over baseline models, demonstrating that data quality, feature consistency, and efficient system design are as crucial as model architecture.

The authors conclude that continuous exploration in data, algorithms, and system engineering is essential for sustaining growth in ad ranking, and they outline future directions such as richer multimodal features, coarse‑to‑fine ranking integration, and support for larger models on CPU‑centric infrastructures.

Tags: feature engineering, deep learning, CTR prediction, system optimization, online advertising, ranking systems
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
