
An Overview of Twitter’s Open‑Source Recommendation System Architecture

This article dissects Twitter’s recently open‑sourced recommendation system, covering its overall architecture, graph‑based data and feature engineering, recall pipelines (in‑network and out‑of‑network), coarse and fine ranking models, mixing and re‑ranking stages, and the supporting infrastructure, with code examples.

Architect

Jun 10, 2023

Twitter has recently open‑sourced the majority of its recommendation system code, which has attracted over 36k stars on GitHub. This article provides a systematic translation and explanation of the publicly available material, focusing on the system’s architecture, data processing, and ranking pipeline.

Problem Definition : The input is the massive Twitter social graph composed of tweets, users, and interaction edges; the output is the probability that a user will interact with a tweet or another user, enabling tweet and user recommendation.

Data : The core asset is a heterogeneous social graph that includes user‑tweet interactions, user profiles, and other signals.

Feature Engineering : Twitter emphasizes graph pre‑training, clustering, and community detection. Graph‑based embeddings (e.g., TwHIN) are used for vector‑based recall and ranking, alongside a small set of safety‑related features.

Core Recommendation Service : Home Mixer, built on Twitter's custom Scala framework Product Mixer, orchestrates three main stages: recall, ranking (a coarse pass by the light ranker followed by a fine pass by the heavy ranker), and mixing/re‑ranking.
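
The stages can be pictured as a chain of functions. The sketch below is illustrative Python, not Home Mixer's Scala code: `Candidate`, `recall`, `rank`, `mix`, and `build_timeline` are hypothetical names, and the scorers and cut-off sizes are made up purely to show how candidates flow from one stage to the next.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Candidate:
    tweet_id: int
    author_id: int
    score: float = 0.0


Source = Callable[[int], List[Candidate]]
Scorer = Callable[[Candidate], float]


def recall(sources: Iterable[Source], user_id: int) -> List[Candidate]:
    """Union the candidates from every recall source, dropping duplicate tweets."""
    seen, merged = set(), []
    for source in sources:
        for cand in source(user_id):
            if cand.tweet_id not in seen:
                seen.add(cand.tweet_id)
                merged.append(cand)
    return merged


def rank(candidates: List[Candidate], scorer: Scorer, keep: int) -> List[Candidate]:
    """Score every candidate and keep the top `keep` by descending score."""
    for cand in candidates:
        cand.score = scorer(cand)
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:keep]


def mix(candidates: List[Candidate], max_per_author: int) -> List[Candidate]:
    """Toy heuristic (an assumption): cap how many tweets one author contributes."""
    per_author: Dict[int, int] = {}
    out = []
    for cand in candidates:
        count = per_author.get(cand.author_id, 0)
        if count < max_per_author:
            per_author[cand.author_id] = count + 1
            out.append(cand)
    return out


def build_timeline(user_id: int, sources: List[Source],
                   light_scorer: Scorer, heavy_scorer: Scorer) -> List[Candidate]:
    pool = recall(sources, user_id)
    shortlist = rank(pool, light_scorer, keep=3)     # coarse ranking
    ordered = rank(shortlist, heavy_scorer, keep=3)  # fine ranking
    return mix(ordered, max_per_author=1)


# Made-up sources and scorers, for illustration only.
in_network = lambda uid: [Candidate(1, 10), Candidate(2, 10)]
out_of_network = lambda uid: [Candidate(2, 10), Candidate(3, 11), Candidate(4, 12)]
timeline = build_timeline(
    7, [in_network, out_of_network],
    light_scorer=lambda c: float(c.tweet_id),
    heavy_scorer=lambda c: -float(c.tweet_id),
)
print([c.tweet_id for c in timeline])
```

Keeping each stage a plain function over a candidate list mirrors why the cheap scorer runs first: it shrinks the pool before the expensive scorer sees it.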

Recall

Twitter employs multiple recall sources that together narrow a pool of hundreds of millions of tweets down to roughly 1,500 candidates per request. The two primary recall channels are:

In‑Network Recall : Uses the Earlybird search engine (a large inverted index) to retrieve recent tweets from accounts a user follows, contributing roughly 50% of the final recommendations.

Out‑of‑Network Recall : Includes collaborative‑filtering via the UserTweetEntityGraph (UTEG) powered by the GraphJet engine, and embedding‑based retrieval using sparse (SimClusters) and dense (TwHIN) representations.

Both channels feed their results into the light ranker, which performs the coarse‑ranking pass that trims the candidate set before fine ranking.
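
To make the sparse/dense distinction concrete, here is a minimal brute-force sketch of embedding-based retrieval. It assumes a sparse embedding is a mapping from community id to weight (in the spirit of SimClusters) and a dense embedding is a fixed-length vector (in the spirit of TwHIN). All ids and values are invented, and a production system would use approximate nearest-neighbor search rather than scoring every tweet.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sparse_dot(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Dot product of two sparse vectors keyed by community id."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(weight * b.get(key, 0.0) for key, weight in a.items())


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(scores: Dict[int, float], k: int) -> List[Tuple[int, float]]:
    """Return the k highest-scoring (tweet_id, score) pairs."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical sparse embeddings: community id -> affinity.
user_sparse = {3: 0.8, 17: 0.2}
tweet_sparse = {
    101: {3: 0.5, 42: 0.5},
    102: {17: 1.0},
    103: {42: 1.0},
}
sparse_hits = top_k(
    {tid: sparse_dot(user_sparse, vec) for tid, vec in tweet_sparse.items()}, k=2
)

# Hypothetical dense embeddings.
user_dense = [1.0, 0.0]
tweet_dense = {201: [0.9, 0.1], 202: [0.0, 1.0]}
dense_hits = top_k(
    {tid: cosine(user_dense, vec) for tid, vec in tweet_dense.items()}, k=1
)

print(sparse_hits, dense_hits)
```

Sparse vectors only pay for the communities a user or tweet actually belongs to, while dense vectors compare every dimension, which is why the two are typically served by different indexes.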

Coarse Ranking (Light Ranker)

The light ranker is a logistic‑regression model that scores candidates using user‑side features (e.g., PageRank‑based reputation, follower count), tweet‑side features (text quality, real‑time engagement metrics), and contextual features (language). Training employs a weighted loss in which different interaction types receive different weights. The interaction labels are indexed as follows:

INDEX_BY_LABEL = {
  "is_clicked": 1,
  "is_favorited": 2,
  "is_open_linked": 3,
  "is_photo_expanded": 4,
  "is_profile_clicked": 5,
  "is_replied": 6,
  "is_retweeted": 7,
  "is_video_playback_50": 8
}
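
The article does not give the actual per-label weights or the exact loss, so the following is only a sketch of what a weighted logistic loss could look like. `LABEL_WEIGHTS`, `example_weight`, and the rule "use the largest weight among the positive labels" are assumptions made for illustration, not Twitter's training code.

```python
import math
from typing import Dict, List

# Hypothetical per-label weights; the real values are not stated in the article.
LABEL_WEIGHTS: Dict[str, float] = {
    "is_clicked": 1.0,
    "is_favorited": 2.0,
    "is_replied": 4.0,
}


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: List[float], bias: float, features: List[float]) -> float:
    """Logistic-regression probability that the user engages with the tweet."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def example_weight(labels: Dict[str, bool]) -> float:
    """Assumed rule: largest weight among positive labels, 1.0 for a pure negative."""
    positives = [LABEL_WEIGHTS[name] for name, on in labels.items()
                 if on and name in LABEL_WEIGHTS]
    return max(positives, default=1.0)


def weighted_log_loss(p: float, labels: Dict[str, bool]) -> float:
    """Binary cross-entropy scaled by the example's weight."""
    y = 1.0 if any(labels.values()) else 0.0
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
    return -example_weight(labels) * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


p = predict([0.0, 0.0], 0.0, [1.0, 1.0])  # untrained model -> 0.5
reply_loss = weighted_log_loss(p, {"is_replied": True})
negative_loss = weighted_log_loss(p, {"is_clicked": False})
print(reply_loss, negative_loss)
```

With these made-up weights, a missed reply costs four times as much as a missed negative at the same predicted probability, which pushes the model to get the rarer, more valuable interactions right.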

The model also incorporates RealGraph, a directed weighted graph of user‑user interactions, to predict link‑level interaction probabilities.

Fine Ranking (Heavy Ranker)

The heavy ranker is a multi‑objective neural network (≈48 M parameters) trained on millions of interaction signals (likes, retweets, replies, video playback, etc.). It outputs ten engagement probabilities, which are linearly combined for final scoring.

"recap.engagement.is_favorited": 0.5
"recap.engagement.is_good_clicked_convo_desc_favorited_or_replied": 11* (* only the larger of the two good‑click predictions is used and weighted by 11)
"recap.engagement.is_good_clicked_convo_desc_v2": 11*
"recap.engagement.is_negative_feedback_v2": -74
"recap.engagement.is_profile_clicked_and_profile_engaged": 12
"recap.engagement.is_replied": 27
"recap.engagement.is_replied_reply_engaged_by_author": 75
"recap.engagement.is_report_tweet_clicked": -369
"recap.engagement.is_retweeted": 1
"recap.engagement.is_video_playback_50": 0.005

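The final ranking score is the weighted sum of these predicted probabilities. The large negative weights mean that even a small predicted chance of a report or of negative feedback can outweigh the positive signals.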
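
The combination can be sketched in a few lines of Python using the weights listed above. The function name `final_score` and the example probabilities are hypothetical; only the weights and the "larger of the two good-click predictions" rule come from the list.

```python
from typing import Dict

P = "recap.engagement."
GOOD_CLICK = P + "is_good_clicked_convo_desc_favorited_or_replied"
GOOD_CLICK_V2 = P + "is_good_clicked_convo_desc_v2"
GOOD_CLICK_WEIGHT = 11.0

# Weights for every head except the two good-click heads, which share one weight.
WEIGHTS: Dict[str, float] = {
    P + "is_favorited": 0.5,
    P + "is_negative_feedback_v2": -74.0,
    P + "is_profile_clicked_and_profile_engaged": 12.0,
    P + "is_replied": 27.0,
    P + "is_replied_reply_engaged_by_author": 75.0,
    P + "is_report_tweet_clicked": -369.0,
    P + "is_retweeted": 1.0,
    P + "is_video_playback_50": 0.005,
}


def final_score(probs: Dict[str, float]) -> float:
    """Weighted sum of predicted engagement probabilities (missing heads count as 0)."""
    score = sum(weight * probs.get(name, 0.0) for name, weight in WEIGHTS.items())
    # Only the larger good-click prediction contributes.
    best_good_click = max(probs.get(GOOD_CLICK, 0.0), probs.get(GOOD_CLICK_V2, 0.0))
    return score + GOOD_CLICK_WEIGHT * best_good_click


# Made-up predictions for a single tweet.
example = {
    P + "is_favorited": 0.2,
    P + "is_replied": 0.1,
    GOOD_CLICK: 0.05,
    GOOD_CLICK_V2: 0.03,
    P + "is_report_tweet_clicked": 0.001,
}
print(final_score(example))
```

In this example a 0.1% predicted chance of a report subtracts 0.369, more than three times what the 20% predicted chance of a like adds (0.1).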

Mixing and Re‑ranking

The mixing layer merges multiple recall streams (out‑of‑network candidates are gathered by the CR‑Mixer service), applies heuristic rules for content safety, author diversity, and fatigue mitigation, and interleaves organic tweets with ads and onboarding prompts. The re‑ranking stage runs on the Home Mixer, handling visibility filters, diversity constraints, and content freshness.
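
As one example of such a heuristic, author diversity can be approximated by discounting each additional tweet from an author who already appears higher in the list. The sketch below is an assumption about how such a rule might look: the function name, the decay factor of 0.5, and the non-negative scores are all illustrative rather than taken from Twitter's code.

```python
from typing import Dict, List, Tuple

Ranked = Tuple[int, int, float]  # (tweet_id, author_id, score)


def apply_author_diversity(ranked: List[Ranked], decay: float = 0.5) -> List[Ranked]:
    """Discount repeated authors, then re-sort.

    `ranked` must be sorted by descending score, with non-negative scores.
    The n-th additional tweet from the same author is multiplied by decay**n.
    """
    seen: Dict[int, int] = {}
    rescored = []
    for tweet_id, author_id, score in ranked:
        repeats = seen.get(author_id, 0)
        rescored.append((tweet_id, author_id, score * decay ** repeats))
        seen[author_id] = repeats + 1
    return sorted(rescored, key=lambda item: item[2], reverse=True)


ranked = [(1, 10, 0.9), (2, 10, 0.8), (3, 11, 0.5)]
diversified = apply_author_diversity(ranked)
print([tweet_id for tweet_id, _, _ in diversified])
```

Here author 10's second tweet drops from 0.8 to 0.4, so the tweet from author 11 moves ahead of it. A soft discount like this keeps a prolific author's best tweet in place while preventing a run of their tweets from dominating the feed.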

Infrastructure

Key infrastructure components include:

Navi : A high‑performance model‑serving system written in Rust.

Product‑Mixer : The feed‑generation framework.

twml : The legacy TensorFlow‑1 based training platform used for the light ranker.

These services enable the end‑to‑end pipeline to process roughly 5 billion requests per day with an average latency under 1.5 seconds.
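
To put those figures in perspective, a quick back-of-the-envelope calculation converts them to a per-second rate. It assumes traffic is spread evenly over the day (real traffic peaks well above the average) and treats 1.5 seconds as an upper bound on average latency, so the in-flight figure from Little's law is an upper bound too.

```python
REQUESTS_PER_DAY = 5_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60
AVG_LATENCY_S = 1.5  # the article says "under 1.5 seconds", so this is a ceiling

# Average request rate if load were perfectly uniform.
avg_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY

# Little's law: average requests in flight = arrival rate * average time in system.
in_flight = avg_rps * AVG_LATENCY_S

print(f"{avg_rps:,.0f} requests/s, at most about {in_flight:,.0f} in flight")
```

That works out to roughly 58,000 requests per second on average, with up to about 87,000 timeline requests being processed at any instant.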

Transparency Initiative

Twitter’s open‑source release aims to increase transparency while removing code that could be abused for security or privacy attacks. The community is invited to submit issues and pull requests, and security bugs can be reported via HackerOne.

From a technical perspective, the open‑sourced code provides a concrete example of a large‑scale recommendation system that can be studied by researchers and engineers alike.
