
Exploring QQ Music Recall Algorithms: Knowledge‑Graph Fusion, Sequence & Multi‑Interest Modeling, Audio Recall, and Federated Learning

This article presents a comprehensive overview of QQ Music's recall pipeline, detailing business characteristics, challenges such as noisy user behavior and cold‑start, and four major solutions—including knowledge‑graph‑enhanced recall, sequence‑based and multi‑interest modeling, audio‑based recall, and federated learning—along with practical insights and Q&A.

DataFunSummit

QQ Music offers a rich set of recommendation products on its homepage, such as personalized radio, daily 30‑song playlists, single‑song recommendations, user‑generated playlists, and AI‑generated playlists, each posing distinct challenges for the recall stage.

Business Overview: The platform serves a broad user base across all age groups, but explicit user attributes are limited beyond basic demographics. User actions are dominated by full plays and skips, while other interactions (favoriting, blacklisting, following, playlist additions) are far less frequent. Music consumption is highly repetitive, and the variety of product formats (audio, lyrics, artist info, playlist metadata) creates diverse optimization goals.

Key Challenges: High noise in listening behavior makes raw training samples unreliable; heavy head‑item popularity reduces recommendation surprise; and sparse user attributes make cold‑start recommendation difficult.

Solution 1 – Knowledge‑Graph‑Enhanced Recall: By integrating music knowledge‑graph triples (e.g., song–artist, song–genre) into a Song2Vec‑style model with a gamma weighting factor, the system improves recall accuracy and reduces bad‑case rates. Graph‑based side information (EGES/GraphSAGE) is combined with efficient training to handle QQ Music's massive catalog.
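The fusion idea above can be sketched as follows. This is a minimal illustration, not QQ Music's implementation: the song names, side‑information dictionary, and the interpretation of gamma as a scalar blending weight between a song's own embedding and the mean of its knowledge‑graph side‑information embeddings are all assumptions for the sake of example (EGES‑style systems typically learn per‑item attention weights instead).

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8

# Hypothetical catalog: song IDs plus side info drawn from
# knowledge-graph triples such as (song, performed_by, artist).
songs = ["song_a", "song_b"]
side_info = {"song_a": ["artist_x", "genre_pop"],
             "song_b": ["artist_y", "genre_rock"]}

# One embedding per song and per side-information entity.
emb = {name: rng.normal(size=DIM)
       for name in songs + ["artist_x", "artist_y", "genre_pop", "genre_rock"]}

def fused_embedding(song, gamma=0.5):
    """Blend the song's own vector with the mean of its
    knowledge-graph side-information vectors, weighted by gamma."""
    side = np.mean([emb[s] for s in side_info[song]], axis=0)
    return gamma * emb[song] + (1.0 - gamma) * side

vec = fused_embedding("song_a")
assert vec.shape == (DIM,)
```

With gamma = 1 the model falls back to plain Song2Vec; lower gamma values lean more on the graph side information, which is what helps long‑tail and cold‑start songs.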

Solution 2 – Sequence & Multi‑Interest Recall: Sequence modeling uses SASRec with shared item/user embeddings and absolute/relative positional encoding, achieving a 2.5% accuracy lift over a YouTube‑style baseline. Multi‑interest extraction adopts the MIND architecture (context, a capsule‑network multi‑interest extractor, and online serving with multiple interest vectors) and a self‑attention variant, improving hit rate and diversity while addressing the multi‑interest nature of users.
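The self‑attention variant mentioned above can be sketched in a few lines. This is a hedged illustration in the style of ComiRec‑SA rather than QQ Music's actual model: the weight matrices `W1`/`W2` and all dimensions are made‑up assumptions. K attention distributions over the behavior sequence pool it into K interest vectors, each of which can then query the index independently at serving time.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_interest(H, W1, W2):
    """Self-attentive multi-interest extraction: derive K attention
    distributions over the behavior sequence H and pool it into
    K interest vectors (one recall query per interest)."""
    A = softmax(W2 @ np.tanh(W1 @ H.T), axis=-1)   # (K, T) attention
    return A @ H                                   # (K, d) interests

rng = np.random.default_rng(1)
T, d, d_a, K = 20, 16, 32, 4          # sequence length, dims, #interests
H = rng.normal(size=(T, d))           # behavior-sequence item embeddings
W1 = rng.normal(size=(d_a, d))
W2 = rng.normal(size=(K, d_a))
interests = multi_interest(H, W1, W2)
assert interests.shape == (K, d)
```

At serving time each of the K vectors retrieves its own candidate quota, which is how the diversity/accuracy trade‑off mentioned in the Q&A is controlled.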

Solution 3 – Audio Recall: Audio features are extracted from 3‑second segments across 14 categories (voice, instrument, genre, etc.) and aggregated (max, min, mean, variance, kurtosis, skewness) to form audio embeddings. These embeddings are used for cold‑start song recall, similarity‑based single‑point recall, and multi‑modal user‑audio embedding, yielding higher surprise and click‑through rates.
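The segment aggregation step can be sketched directly from the six statistics named above. This is a minimal sketch, assuming each track is a matrix with one row per 3‑second segment and one column per feature category; the moment formulas (excess kurtosis, standardized skewness) are standard choices, not confirmed details of the pipeline.

```python
import numpy as np

def aggregate_segments(seg_feats):
    """Pool per-segment audio features (rows = 3-second segments)
    into a fixed track-level embedding via the six statistics:
    max, min, mean, variance, kurtosis, skewness."""
    mu = seg_feats.mean(axis=0)
    sd = seg_feats.std(axis=0) + 1e-8      # avoid division by zero
    z = (seg_feats - mu) / sd              # standardize per feature
    skew = (z ** 3).mean(axis=0)           # third standardized moment
    kurt = (z ** 4).mean(axis=0) - 3.0     # excess kurtosis
    return np.concatenate([seg_feats.max(axis=0), seg_feats.min(axis=0),
                           mu, seg_feats.var(axis=0), kurt, skew])

rng = np.random.default_rng(2)
segments = rng.normal(size=(60, 14))   # 60 segments x 14 feature categories
track_emb = aggregate_segments(segments)
assert track_emb.shape == (6 * 14,)    # six statistics per category
```

Because the embedding depends only on audio content, it is available for brand‑new songs with no interaction history, which is what makes it useful for cold‑start recall.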

Solution 4 – Federated Learning Recall: Vertical federated learning combines QQ Music's item tower (song attributes) with other business towers (user demographics, interest tags) in a dual‑tower DSSM model. The approach improves cold‑start performance while preserving user privacy, and an MMoE extension enables multi‑task learning across scenarios.
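The dual‑tower scoring step can be sketched as below. This is a simplified single‑process illustration with made‑up linear towers and dimensions; in actual vertical federated training each party keeps its raw features local and exchanges only tower outputs and gradients, typically with encryption, none of which is shown here.

```python
import numpy as np

rng = np.random.default_rng(3)
D_ITEM, D_USER, DIM = 10, 6, 8

# Hypothetical tower weights: in vertical federated learning these
# would live on different parties' infrastructure.
W_item = rng.normal(size=(DIM, D_ITEM))   # QQ Music side: song attributes
W_user = rng.normal(size=(DIM, D_USER))   # partner side: demographics/tags

def tower(W, x):
    h = W @ x
    return h / np.linalg.norm(h)          # unit-normalize for cosine scoring

def score(item_feats, user_feats):
    """DSSM-style match score: cosine similarity of the two tower
    outputs, each computable locally by its owning party."""
    return float(tower(W_item, item_feats) @ tower(W_user, user_feats))

s = score(rng.normal(size=D_ITEM), rng.normal(size=D_USER))
assert -1.0 <= s <= 1.0
```

Because only the DIM‑dimensional tower outputs cross the party boundary, the partner's raw user attributes never reach QQ Music, which is the privacy property the approach relies on.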

Q&A Highlights: Recall samples are drawn from global QQ Music data, while ranking samples are tailored per entry point. Long‑term interests are captured by deep sequence models; short‑term interests by recent‑interaction pointwise recall. Multi‑interest quota allocation balances diversity and accuracy. Audio features are heavily used in ranking models to boost relevance. The technical stack includes ClickHouse, Superset, Hive, TensorFlow, C++, and Go.

The presentation concludes with thanks and a call for audience engagement.

Tags: Recommendation Systems · Multi‑Interest · Knowledge Graph · Federated Learning · Audio Embedding · QQ Music · Sequence Modeling
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
