An Overview of Twitter’s Open‑Source Recommendation System Architecture
This article dissects Twitter’s recently open‑sourced recommendation system: its overall architecture, graph‑based data and feature engineering, recall pipelines (in‑network and out‑of‑network), coarse and fine ranking models, mixing and re‑ranking stages, and the supporting infrastructure, with code excerpts along the way.
Twitter has recently open‑sourced the majority of its recommendation system code, which has attracted over 36k stars on GitHub. This article provides a systematic translation and explanation of the publicly available material, focusing on the system’s architecture, data processing, and ranking pipeline.
Problem Definition: The input is the massive Twitter social graph composed of tweets, users, and interaction edges; the output is the probability that a user will interact with a tweet or another user, which drives both tweet and user recommendation.
Data: The core asset is a heterogeneous social graph that includes user‑tweet interactions, user profiles, and other signals.
Feature Engineering: Twitter emphasizes graph pre‑training, clustering, and community detection. Graph‑based embeddings (e.g., TwHIN) are used for vector‑based recall and ranking, alongside a small set of safety‑related features.
Core Recommendation Service: Home Mixer, a service built on Twitter’s custom Scala framework Product Mixer, orchestrates the main stages: recall, ranking (coarse ranking with the light ranker, then fine ranking with the heavy ranker), and mixing/re‑ranking.
Recall
Twitter employs multiple recall sources that narrow a pool of hundreds of millions of tweets down to roughly 1,500 candidates per request. Two primary recall channels are:
In‑Network Recall: Uses the Earlybird search engine (a large inverted index) to retrieve recent tweets from accounts a user follows, contributing roughly 50% of the final recommendations.
Out‑of‑Network Recall: Includes collaborative filtering via the UserTweetEntityGraph (UTEG) powered by the GraphJet engine, and embedding‑based retrieval using sparse (SimClusters) and dense (TwHIN) representations.
Both channels pass their candidates to the light ranker for coarse ranking; the survivors then proceed to fine ranking.
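The embedding‑based retrieval mentioned above can be illustrated with a minimal sketch. It assumes sparse vectors are stored as dictionaries keyed by a cluster id (in the spirit of SimClusters) and dense vectors as plain lists (in the spirit of TwHIN). The function names, the toy data, and the brute‑force scan are all hypothetical; a production system would use an approximate nearest‑neighbor index instead.

```python
import math

def sparse_dot(a: dict[str, float], b: dict[str, float]) -> float:
    """Dot product of two sparse vectors stored as {dimension: weight}."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(w * b.get(dim, 0.0) for dim, w in a.items())

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two dense vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(user_vec, tweet_vecs, similarity, k):
    """Brute-force retrieval: score every tweet, return the k best ids."""
    scored = [(similarity(user_vec, vec), tweet_id)
              for tweet_id, vec in tweet_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tweet_id for _, tweet_id in scored[:k]]

# Toy data (hypothetical): cluster ids "c1", "c3", "c7" and tweet ids "t1".."t6".
user_sparse = {"c1": 0.8, "c7": 0.2}
tweets_sparse = {
    "t1": {"c1": 0.9},
    "t2": {"c3": 1.0},
    "t3": {"c1": 0.1, "c7": 0.9},
}
sparse_hits = top_k(user_sparse, tweets_sparse, sparse_dot, k=2)

user_dense = [1.0, 0.0]
tweets_dense = {"t4": [0.9, 0.1], "t5": [0.0, 1.0], "t6": [-1.0, 0.0]}
dense_hits = top_k(user_dense, tweets_dense, cosine, k=2)
```

Both channels return tweet ids only; merging and deduplicating the two lists would happen before the candidates reach the light ranker.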
Coarse Ranking (Light Ranker)
The light ranker is a logistic‑regression model that scores candidates using user‑side features (e.g., pagerank‑based reputation, follower count), tweet‑side features (text quality, real‑time engagement metrics), and contextual features (language). Training employs weighted loss where different interaction types receive different weights.
# Index assigned to each engagement label used by the light ranker.
INDEX_BY_LABEL = {
    "is_clicked": 1,
    "is_favorited": 2,
    "is_open_linked": 3,
    "is_photo_expanded": 4,
    "is_profile_clicked": 5,
    "is_replied": 6,
    "is_retweeted": 7,
    "is_video_playback_50": 8
}
The model also incorporates RealGraph, a directed weighted graph of user‑user interactions, to predict link‑level interaction probabilities.
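The weighted training loss described for the light ranker can be sketched as follows. This is an illustration under stated assumptions, not Twitter's training code: the per‑label weights in `LABEL_WEIGHTS` are invented for the example, and collapsing the labels into one binary target whose example weight is the largest weight among its positive labels is only one plausible reading of "different interaction types receive different weights".

```python
import math

# Hypothetical weights; the article does not state the real values.
LABEL_WEIGHTS = {
    "is_clicked": 1.0,
    "is_favorited": 2.0,
    "is_replied": 3.0,
    "is_retweeted": 3.0,
}

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights: list[float], bias: float, features: list[float]) -> float:
    """Logistic regression: probability that the user engages with the tweet."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def example_weight(labels: dict[str, bool]) -> float:
    """Largest weight among the positive labels; 1.0 for a pure negative."""
    positive = [LABEL_WEIGHTS[name] for name, hit in labels.items() if hit]
    return max(positive, default=1.0)

def weighted_log_loss(weights, bias, examples) -> float:
    """Weighted binary cross-entropy, normalized by the total example weight."""
    total, norm = 0.0, 0.0
    for features, labels in examples:
        y = 1.0 if any(labels.values()) else 0.0
        p = min(max(predict(weights, bias, features), 1e-12), 1.0 - 1e-12)
        w = example_weight(labels)
        total += -w * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
        norm += w
    return total / norm

# Toy batch: (feature vector, labels observed for that impression).
examples = [
    ([1.0, 0.0], {"is_clicked": True, "is_replied": True}),
    ([0.0, 1.0], {"is_clicked": False, "is_replied": False}),
]
untrained_loss = weighted_log_loss([0.0, 0.0], 0.0, examples)
```

With this scheme a replied impression pulls on the model three times as hard as a plain negative, which is how rarer but more valuable interactions can be emphasized during training.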
Fine Ranking (Heavy Ranker)
The heavy ranker is a multi‑objective neural network (≈48 M parameters) trained on millions of interaction signals (likes, retweets, replies, video playback, etc.). It outputs ten engagement probabilities, which are linearly combined for final scoring using the following weights:
"recap.engagement.is_favorited": 0.5
"recap.engagement.is_good_clicked_convo_desc_favorited_or_replied": 11* (max of two good‑click features)
"recap.engagement.is_good_clicked_convo_desc_v2": 11*
"recap.engagement.is_negative_feedback_v2": -74
"recap.engagement.is_profile_clicked_and_profile_engaged": 12
"recap.engagement.is_replied": 27
"recap.engagement.is_replied_reply_engaged_by_author": 75
"recap.engagement.is_report_tweet_clicked": -369
"recap.engagement.is_retweeted": 1
"recap.engagement.is_video_playback_50": 0.005
Each predicted probability is multiplied by its weight, and the products are summed to produce the final ranking score.
Mixing and Re‑ranking
The mixing and re‑ranking stage runs in Home Mixer. It merges the ranked recall streams, applies heuristic rules such as visibility filters for content safety, author diversity constraints, fatigue mitigation, and content freshness, and interleaves organic tweets with ads and onboarding prompts. CR‑Mixer, by contrast, is the coordination layer that fetches out‑of‑network candidates from the underlying services.
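One of the heuristics named above, author diversity, can be sketched as a score discount for repeated authors. This is a hypothetical illustration rather than the rule Twitter ships: the `decay` and `floor` values and the function name are invented, and scores are assumed to be positive.

```python
from collections import defaultdict

def apply_author_diversity(candidates, decay=0.5, floor=0.25):
    """Discount each additional tweet from the same author, then re-sort.

    candidates: list of (tweet_id, author_id, score) with positive scores.
    The n-th tweet (0-based) from one author is multiplied by
    max(floor, decay ** n), visiting tweets from highest score to lowest.
    """
    seen = defaultdict(int)  # author_id -> tweets already kept from that author
    rescored = []
    for tweet_id, author_id, score in sorted(
        candidates, key=lambda c: c[2], reverse=True
    ):
        multiplier = max(floor, decay ** seen[author_id])
        rescored.append((tweet_id, author_id, score * multiplier))
        seen[author_id] += 1
    rescored.sort(key=lambda c: c[2], reverse=True)
    return rescored

# Hypothetical ranked candidates: author "a" dominates the raw ranking.
candidates = [
    ("t1", "a", 1.0),
    ("t2", "a", 0.9),
    ("t3", "b", 0.6),
    ("t4", "a", 0.8),
]
diversified = apply_author_diversity(candidates)
diversified_order = [tweet_id for tweet_id, _, _ in diversified]
```

In this toy input the single tweet from author "b" moves up to second place, because the second and third tweets from author "a" are discounted to 0.45 and 0.2.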
Infrastructure
Key infrastructure components include:
Navi: A high‑performance model‑serving system written in Rust.
Product Mixer: The feed‑generation framework.
twml: The legacy TensorFlow‑1 based training platform used for the light ranker.
These services enable the end‑to‑end pipeline to process roughly 5 billion requests per day with an average latency under 1.5 seconds.
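To put those figures in perspective, a quick back‑of‑the‑envelope calculation converts the daily volume into a per‑second rate and, via Little's law, into an estimate of requests in flight. It assumes load is spread evenly over the day (real traffic has peaks) and uses 1.5 seconds as an upper bound on average latency, so the concurrency figure is an upper estimate.

```python
REQUESTS_PER_DAY = 5_000_000_000   # from the figure quoted above
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400
AVG_LATENCY_SECONDS = 1.5          # stated upper bound on average latency

# Average arrival rate, assuming perfectly uniform traffic.
requests_per_second = REQUESTS_PER_DAY / SECONDS_PER_DAY

# Little's law: average requests in flight = arrival rate * average latency.
max_avg_in_flight = requests_per_second * AVG_LATENCY_SECONDS

print(f"{requests_per_second:,.0f} requests/s on average")
print(f"up to {max_avg_in_flight:,.0f} requests in flight on average")
```

That works out to roughly 58,000 requests per second and at most about 87,000 concurrent requests on average.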
Transparency Initiative
Twitter’s open‑source release aims to increase transparency while removing code that could be abused for security or privacy attacks. The community is invited to submit issues and pull requests, and security bugs can be reported via HackerOne.
From a technical perspective, the open‑sourced code provides a concrete example of a large‑scale recommendation system that can be studied by researchers and engineers alike.