Optimizing Sparse Feature Embedding for Large‑Scale Recommendation and CTR Prediction
The article reviews recent research on representing massive sparse features in click‑through‑rate (CTR) models, introducing Alibaba's Res‑embedding method and Google's Neural Input Search (NIS) approach, and discusses how these techniques improve embedding efficiency and model generalization in large‑scale recommendation systems.
CTR prediction and recommendation tasks involve massive sparse features, where many item IDs appear infrequently despite a huge overall feature set. Effective embedding of these sparse features is crucial for model performance.
1. Item Embedding in User Behavior Sequences – The Res‑embedding work (DLP‑KDD 2019) proves that the generalization error of a DNN CTR model correlates with the distribution of item embeddings: tighter clusters of interest‑related items lead to lower error. It proposes representing each item embedding as the sum of a shared Central Embedding for a user‑interest cluster and a Residual Embedding specific to the item:
Item Embedding = Central Embedding + Residual Embedding

Constraining the residual's magnitude keeps interest clusters compact, which improves generalization. The paper also outlines three graph-based methods (including a GNN variant) for assigning items to interest clusters.
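The central-plus-residual composition can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the sizes, the random cluster assignment, and the L2 penalty weight are all illustrative assumptions.

```python
import numpy as np

# Illustrative sizes (assumptions, not from the paper).
NUM_ITEMS, NUM_CLUSTERS, DIM = 1000, 20, 8

rng = np.random.default_rng(0)
central = rng.normal(0.0, 1.0, size=(NUM_CLUSTERS, DIM))   # one shared vector per interest cluster
residual = rng.normal(0.0, 0.05, size=(NUM_ITEMS, DIM))    # small per-item offsets
item_to_cluster = rng.integers(0, NUM_CLUSTERS, size=NUM_ITEMS)  # hypothetical assignment

def item_embedding(item_id: int) -> np.ndarray:
    """Res-embedding composition: shared central vector plus item-specific residual."""
    return central[item_to_cluster[item_id]] + residual[item_id]

def residual_penalty(lam: float = 0.1) -> float:
    """L2 regularizer on residual magnitudes; keeping residuals small
    keeps items of the same interest cluster close together."""
    return lam * float(np.sum(residual ** 2))
```

Adding `residual_penalty` to the training loss is what enforces the compactness of interest clusters; the cluster assignment itself would come from one of the graph-based methods described in the paper.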
2. Feature Embedding for Non‑Behavioral Recommendation Tasks – Google’s Neural Input Search (NIS) tackles the allocation of embedding dimensions to features of varying frequencies. It partitions the two‑dimensional space of feature count vs. embedding size into blocks, forming a search space of possible allocation schemes. Using ENAS (Efficient Neural Architecture Search), NIS searches for policies that assign longer embeddings to high‑frequency, informative features while giving shorter or shared embeddings to low‑frequency ones. The reinforcement‑learning reward balances validation AUC improvement against total embedding memory usage.
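The memory-aware reward at the heart of the NIS search can be sketched as follows. The function name, the penalty shape, and the weight `beta` are assumptions for illustration; the idea from the paper is simply that the controller's reward rises with validation-AUC gains and falls when embedding memory exceeds a budget.

```python
def nis_reward(val_auc: float, baseline_auc: float,
               memory_used: float, memory_budget: float,
               beta: float = 0.5) -> float:
    """Hypothetical NIS-style RL reward: quality gain over a baseline,
    minus a penalty proportional to how far memory exceeds the budget."""
    quality_gain = val_auc - baseline_auc
    # No penalty while within budget; linear penalty beyond it.
    over_budget = max(0.0, memory_used / memory_budget - 1.0)
    return quality_gain - beta * over_budget
```

An allocation that improves AUC while staying within budget scores positively, while one that blows the memory budget is penalized even if its AUC is higher, which is what steers the search toward short or shared embeddings for low-frequency features.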
Both Res‑embedding and NIS aim to reduce over‑parameterization and enhance model generalization; they can be combined for even better sparse feature representation in large‑scale DNN recommendation systems.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.