
Advances in Recommendation Models: CTR Prediction, Continuous Feature Embedding, Interaction Modeling, and Distributed Training

This article traces the evolution of recommendation models from early collaborative filtering to modern deep learning approaches. It discusses CTR prediction as the core challenge, outlines user‑behavior and combination‑feature modeling techniques, introduces large‑embedding training and continuous‑feature embedding methods such as AutoDis, presents distributed training frameworks such as ScaleFreeCTR, and concludes with future research directions.

DataFunTalk

The presentation opens with a historical overview of recommendation models: beginning around 2006 with collaborative filtering, matrix factorization, and topic models; moving through generalized linear models and factorization machines to deep learning architectures such as FNN, PNN, DIN, Wide&Deep, and DeepFM; and ending with the recent shift toward reinforcement‑learning‑based recommenders.

CTR prediction is identified as the central problem in recommendation systems; accurate click‑through‑rate estimation drives revenue and user experience. A 2021 IJCAI survey categorizes deep CTR models into three groups: combination‑feature mining, user‑behavior modeling, and automated architecture search.

User‑behavior modeling advances include Alibaba's DIN, DIEN, and BST models, which incorporate pooling, RNNs, and Transformers to capture sequential user interests, as well as SIM and UBR approaches that retrieve relevant behavior embeddings based on the prediction target.
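The core idea behind DIN-style behavior modeling is target-aware attention: each historical behavior is weighted by its relevance to the candidate item before pooling. The sketch below illustrates that idea with a plain dot-product relevance score; the actual DIN paper scores each pair with a small MLP over the concatenated (and element-wise-multiplied) behavior and target embeddings, so treat this as a simplified illustration, not the published architecture.

```python
import math

def din_attention_pool(behavior_embs, target_emb):
    """Target-aware pooling in the spirit of DIN: score each historical
    behavior embedding against the candidate item, softmax the scores,
    and return the weighted sum. Relevance here is a raw dot product
    (a simplification of DIN's attention MLP)."""
    # Relevance of each behavior to the prediction target.
    scores = [sum(b * t for b, t in zip(beh, target_emb)) for beh in behavior_embs]
    # Numerically stable softmax over the behavior sequence.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Weighted sum-pooling of the behavior embeddings.
    dim = len(target_emb)
    pooled = [sum(w * beh[d] for w, beh in zip(weights, behavior_embs))
              for d in range(dim)]
    return pooled, weights
```

The same pooled vector is what retrieval-style approaches such as SIM and UBR aim to compute more scalably: instead of attending over the full history, they first retrieve the behaviors most relevant to the target and attend only over that subset.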

Combination‑feature modeling is divided into three families: naive (e.g., FNN), memorized (e.g., Wide&Deep with explicit cross features), and factorized (e.g., IPNN, DCN, xDeepFM). The OptInter framework introduces a learnable selector that chooses among naive, memorized, and factorized interaction representations for each feature pair.
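A per-pair learnable selector can be sketched as a softmax mixture over the three candidate interaction representations, which stays differentiable during training; at inference, the argmax branch can be kept and the others pruned. The details below (zero vector for "naive", element-wise product for "factorized", plain softmax relaxation) are illustrative assumptions, not the exact OptInter formulation.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]

def select_interaction(emb_i, emb_j, cross_emb, selector_logits):
    """OptInter-flavoured selector sketch for one feature pair: blend
    three candidate interaction representations with softmax weights
    over learnable selector logits (one logit per family)."""
    candidates = [
        [0.0] * len(emb_i),                     # naive: no explicit interaction
        cross_emb,                              # memorized: dedicated cross-feature embedding
        [a * b for a, b in zip(emb_i, emb_j)],  # factorized: element-wise product
    ]
    w = softmax(selector_logits)
    return [sum(wk * cand[d] for wk, cand in zip(w, candidates))
            for d in range(len(emb_i))]
```

Because the blend is differentiable, the selector logits are trained jointly with the embeddings, and each feature pair can converge to a different interaction family.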

Handling large embeddings is addressed from two angles: compression (Double‑Hash, int16 training, DHE) and distributed training. Huawei's ScaleFreeCTR architecture separates embedding storage (CPU‑side with host manager and cache) from MLP computation (GPU‑side), employing hybrid data‑parallel and model‑parallel strategies to efficiently train models with billions of parameters.
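The intuition behind double-hash compression is that two independent hash functions into two small tables rarely collide on both coordinates at once, so combining the two looked-up rows gives most ids a distinct representation while memory drops from one row per raw id to two small tables. The sketch below uses two arbitrary deterministic integer hashes of my own choosing and element-wise summation to combine rows; the specific hash constants and combiner are assumptions, not the published scheme.

```python
def double_hash_lookup(feature_id, table_a, table_b):
    """Double-hash embedding compression sketch: map a raw feature id
    into two small embedding tables with independent hash functions and
    combine the rows by element-wise sum. A collision in one table is
    usually disambiguated by the other table's row."""
    ia = (feature_id * 2654435761 + 1) % len(table_a)  # first hash (Knuth-style multiplier)
    ib = (feature_id * 40503 + 7) % len(table_b)       # second, independent hash
    return [x + y for x, y in zip(table_a[ia], table_b[ib])]
```

Both tables are trained end to end with the rest of the model, so the gradient flows into both rows that represent a given id.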

Continuous‑feature embedding is explored through the AutoDis method, which combines meta‑embedding, automatic discretization, and learned bucket probabilities to generate expressive embeddings for numeric features, showing consistent gains on public and private CTR datasets.
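The AutoDis pipeline described above can be sketched end to end: a tiny scoring network maps the scalar feature value to one logit per bucket, a softmax turns the logits into soft bucket probabilities, and the output embedding is the probability-weighted sum of learnable meta-embeddings. The sketch below simplifies the scoring network to an affine map (`logit_k = w[k] * x + b[k]`); the published method uses a small nonlinear network with a skip connection, so the parameter shapes here are assumptions.

```python
import math

def autodis_embed(x, meta_embeddings, w, b, temperature=1.0):
    """AutoDis-style embedding sketch for one numeric feature value x:
    score each meta-embedding bucket, softmax into soft probabilities,
    and aggregate. Everything is differentiable, so the effective
    bucket boundaries are learned end to end."""
    # One logit per meta-embedding bucket (simplified affine scorer).
    logits = [wk * x + bk for wk, bk in zip(w, b)]
    # Temperature softmax -> soft discretization probabilities.
    m = max(logits)
    exps = [math.exp((l - m) / temperature) for l in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Probability-weighted aggregation of the meta-embeddings.
    dim = len(meta_embeddings[0])
    return [sum(p * emb[d] for p, emb in zip(probs, meta_embeddings))
            for d in range(dim)]
```

Compared with hard binning, the soft assignment keeps nearby feature values close in embedding space and avoids the gradient discontinuities of fixed bucket edges.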

The talk concludes with three research directions: data‑aware model design, improving training efficiency and data utilization, and increasing automation in data processing, feature selection, and hyper‑parameter tuning to free practitioners for higher‑level innovation.

Tags: Deep Learning, CTR prediction, embedding, recommendation systems, distributed training, feature interaction
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
