Real-time Attention-based Look-alike Model (RALM) for Recommender Systems
The Real‑time Attention‑based Look‑alike Model (RALM) recasts recommendation as a user‑user problem by representing each item with an aggregation of its seed‑user embeddings. Built from a shared projection layer plus local‑ and global‑attention units, it delivers instant, diverse, high‑CTR recommendations for new items without retraining, as demonstrated by its deployment in WeChat "Look‑at".
Look‑alike is a classic recommendation algorithm in the advertising domain, known for strong targeting and precise user expansion. This article presents a reformulated version of look‑alike, called Real-time Attention‑based Look-alike Model (RALM), designed for the high‑timeliness news recommendation scenario of WeChat "Look‑at" and accepted at KDD 2019.
In the "Look‑at" feed, millions of news, video and article items are distributed daily via personalized recommendation. Long‑tail items (e.g., niche articles, operational topics) suffer from low exposure because traditional models rely heavily on item historical behavior, leading to a Matthew effect where popular items dominate.
Key requirements for a suitable model are: (1) real‑time user expansion without model retraining, (2) accurate and diverse recommendations, and (3) support for online inference.
RALM addresses these needs by replacing item historical features with a set of seed‑user embeddings, effectively converting the problem from a user‑item model to a user‑user model. An item is represented as the aggregated representation of its seed users: I = f({E(u1), E(u2), ..., E(un)}) , where E(ui) is the embedding of seed user ui .
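The aggregation function f is learned in RALM (via the attention units described below); as a minimal sketch of the idea, the stand-in below uses plain mean pooling over seed-user embeddings, with hypothetical shapes:

```python
import numpy as np

def item_representation(seed_embeddings: np.ndarray) -> np.ndarray:
    """Aggregate seed-user embeddings E(u1)..E(un) into one item vector.

    The paper's f() is a learned, attention-based pooling; mean pooling
    is used here only as the simplest possible stand-in.
    """
    # seed_embeddings: (n_seeds, dim)
    return seed_embeddings.mean(axis=0)

rng = np.random.default_rng(0)
seeds = rng.random((100, 16))       # 100 hypothetical seed users, dim 16
item_vec = item_representation(seeds)
```

Any pooling that maps a variable-size seed set to a fixed-size vector fits this slot; the attention units below replace the uniform weights with learned ones.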
The model consists of two towers:
Seeds tower : takes seed‑user embeddings, applies a fully‑connected projection (shared with the target tower), followed by a self‑attention unit (global attention) and a productive‑attention unit (local attention), then pools them into a seed representation.
Target tower : processes the target user embedding with the same projection and directly computes cosine similarity with the seed representation.
Local attention activates the most relevant seed users for a given target user, mitigating the averaging effect of simple pooling. The attention formula is:
E_local = E_s^T · softmax(E_s W_l E_u)

where E_s (n×d) stacks the projected seed embeddings, W_l (d×d) is the learned local‑attention matrix, and E_u (d) is the target‑user embedding; the softmax weights emphasize the seeds most relevant to the target.
To reduce computation when the number of seeds n is large, seeds are clustered with K‑means (k≈20) and attention is computed on cluster centroids.
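The clustered local attention above can be sketched as follows; W_l is random here rather than learned, and the centroids stand in for real K-means cluster centers:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def local_attention(E_s, E_u, W_l):
    """E_local = E_s^T softmax(E_s W_l E_u): pool seeds by relevance to E_u.

    E_s: (k, d) K-means centroids of the seed embeddings
    E_u: (d,)   target-user embedding
    W_l: (d, d) learned local-attention matrix (random stand-in here)
    """
    alpha = softmax(E_s @ W_l @ E_u)   # (k,) relevance weight per centroid
    return alpha @ E_s                 # attention-weighted pooling -> (d,)

d, k = 16, 20                          # k ~ 20 centroids, as in the article
centroids = rng.normal(size=(k, d))    # stand-ins for K-means cluster centers
target = rng.normal(size=d)
W_l = rng.normal(size=(d, d))
e_local = local_attention(centroids, target, W_l)
```

Attending over k≈20 centroids instead of all n seeds cuts the per-request cost from O(n·d) to O(k·d), which is what makes the unit viable online.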
Global attention captures common group‑level information via a self‑attention mechanism:
E_global = E_s^T · softmax(tanh(E_s W_g) n_g)

a self‑attentive pooling with learned matrix W_g and context vector n_g; unlike local attention, it depends only on the seeds and extracts the interests shared across the whole group.
The final similarity is a weighted sum of local and global cosine scores.
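Putting the two units together, a hedged sketch of the global unit and the final weighted cosine score (weights a, b and all parameters are illustrative, not the paper's values):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def global_attention(E_s, W_g, n_g):
    # Self-attentive pooling: alpha = softmax(tanh(E_s W_g) n_g),
    # E_global = alpha @ E_s.  W_g and n_g are learned; random here.
    alpha = softmax(np.tanh(E_s @ W_g) @ n_g)
    return alpha @ E_s

def ralm_score(E_u, E_local, E_global, a=0.3, b=0.7):
    # Final similarity: weighted sum of global and local cosine scores
    # (a + b = 1, so the result stays within [-1, 1]).
    return a * cosine(E_u, E_global) + b * cosine(E_u, E_local)

d, n = 16, 50
seeds = rng.normal(size=(n, d))        # projected seed embeddings (toy data)
target = rng.normal(size=d)
W_g, n_g = rng.normal(size=(d, d)), rng.normal(size=d)
e_global = global_attention(seeds, W_g, n_g)
e_local = seeds.mean(axis=0)           # stand-in for the local-attention vector
s = ralm_score(target, e_local, e_global)
```

The global term rewards targets that match the group's common interests, while the local term rewards matches to specific relevant seeds; the weighting trades off the two.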
Loss : RALM is trained as a multi‑class classification problem (predicting which seed‑user group the target belongs to). Because the label space is large, negative sampling pairs each positive example with several sampled negatives, and the loss is the summed cross‑entropy over these terms.
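A hedged sketch of one such cross-entropy term under negative sampling (the function name and scalar-similarity interface are assumptions for illustration):

```python
import numpy as np

def lookalike_loss(sim_pos, sim_negs):
    """Sampled-softmax cross-entropy for one training example (sketch).

    sim_pos:  model similarity to the true seed group (scalar logit)
    sim_negs: similarities to negatively sampled groups (1-D array)
    Returns -log softmax-probability of the positive class.
    """
    logits = np.concatenate(([sim_pos], np.asarray(sim_negs, dtype=float)))
    m = logits.max()                               # for numerical stability
    log_z = m + np.log(np.exp(logits - m).sum())   # log partition function
    return float(log_z - sim_pos)                  # -log p(positive)

loss = lookalike_loss(2.0, [0.5, -1.0, 0.1])
```

The total training loss is the sum of such terms over all sampled (target, seed-group) pairs in a batch.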
The RALM framework has been deployed in WeChat "Look‑at". The offline stage trains the user‑representation model (a modified YouTube‑DNN) and the look‑alike model, producing user embeddings E(u) and model parameters that are stored in a KV cache. Online asynchronous processing updates seed‑user sets and pre‑computes local attention vectors for newly arriving items, using periodic K‑means updates.
During online serving, the system fetches the requesting user’s embedding, iterates over candidate items, retrieves their seed‑user embeddings, applies the pre‑computed local/global attention, and produces a similarity score used as a ranking feature or threshold for exposure control.
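The serving loop above can be sketched as follows; the KV-cache layout, key names, and the default weights/threshold are all illustrative assumptions, not the production schema:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def serve(user_id, candidates, kv, w_local=0.7, w_global=0.3, threshold=0.0):
    """Score candidate items for one request (hypothetical kv layout).

    kv["user"]:   user_id -> user embedding
    kv["local"]:  item_id -> pre-computed local-attention vector
    kv["global"]: item_id -> pre-computed global-attention vector
    Returns (item, score) pairs above the exposure threshold, ranked.
    """
    e_u = kv["user"][user_id]                            # requester's embedding
    scored = []
    for item in candidates:
        s = (w_global * cosine(e_u, kv["global"][item])  # precomputed vectors
             + w_local * cosine(e_u, kv["local"][item]))
        if s >= threshold:                               # exposure control
            scored.append((item, s))
    return sorted(scored, key=lambda p: p[1], reverse=True)

kv = {
    "user":   {"u1": np.array([1.0, 0.0])},
    "local":  {"A": np.array([1.0, 0.0]), "B": np.array([0.0, 1.0])},
    "global": {"A": np.array([1.0, 0.0]), "B": np.array([0.0, 1.0])},
}
ranked = serve("u1", ["A", "B"], kv)
```

Because the attention vectors are pre-computed asynchronously, the request path reduces to cache lookups and cosine products, which is what keeps inference real-time.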
Additional details include:
Feature design: items are represented solely by seed‑user embeddings, avoiding sparse item‑level features.
Model tuning: shared fully‑connected layers and dropout reduce over‑fitting from the simple look‑alike architecture.
Cold‑start handling: for brand‑new items with no seeds, a lightweight MLP using semantic and user‑profile features predicts CTR to gather initial seeds before switching to RALM.
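The article gives no architecture for the cold-start model beyond "lightweight MLP"; the sketch below assumes one ReLU hidden layer and a sigmoid CTR head over concatenated semantic and profile features, with random stand-in weights:

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp_ctr(features, W1, b1, W2, b2):
    """Tiny one-hidden-layer MLP predicting CTR (hypothetical architecture)."""
    h = np.maximum(0.0, features @ W1 + b1)    # ReLU hidden layer
    logit = h @ W2 + b2
    return 1.0 / (1.0 + np.exp(-logit))        # predicted CTR in (0, 1)

x = rng.normal(size=32)                        # item-semantic + user-profile vec
W1, b1 = rng.normal(size=(32, 8)), np.zeros(8)
W2, b2 = rng.normal(size=8), 0.0
p = mlp_ctr(x, W1, b1, W2, b2)
```

Once enough clicks accumulate, the clicking users become the item's seed set and scoring switches over to RALM proper.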
In offline evaluation and online A/B tests, RALM improves exposure, click‑through rate, and diversity compared with baseline look‑alike and traditional recommendation models.
Tencent Cloud Developer