Search Term Recommendation: Scenarios, Algorithm Design, Challenges and Future Directions

This article presents an in‑depth overview of search term recommendation in QQ Browser, covering the various recommendation scenarios, the composition of recommendation items, the multi‑stage algorithm architecture, key technical challenges, evaluation metrics, and future research directions such as multi‑task and session‑aware modeling.

DataFunTalk

In this talk, the speaker from Tencent explains the thinking behind search term recommendation in the QQ Browser, aiming to provide insights for practitioners.

Recommendation Scenarios : The talk covers four scenarios: (1) personalized term recommendation based on user interests, (2) query auto‑completion, (3) recommendations derived from the short video or article the user is currently viewing, and (4) related terms displayed on the search results page.

Three‑Layer Matching Model : Unlike traditional news or short‑video recommendation, search term recommendation involves three layers of connections – (1) context to recommended query, (2) query to search results, and (3) results page to final content consumption – which increases technical difficulty.

Key Challenges :

Evaluability – search provides strong expert knowledge for assessing result relevance, whereas traditional recommendation is personalized per user and has no single correct answer to judge against.

Content Ecosystem – search terms are billions of user‑generated short texts without a stable author ecosystem, requiring extensive query understanding.

Item Attribute Volatility – query relevance changes rapidly over time, demanding models that handle fast‑changing signals.

Query Library Architecture : The query pool consists of four categories – (1) active search terms (billions of items), (2) generative queries (template‑based and extraction‑based), (3) knowledge‑graph generated queries, and (4) manually operated hot‑search terms. Various ML operators evaluate query quality and integrate safety and human‑review mechanisms.
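
The pool construction described above can be sketched as a merge, deduplicate, and filter step. This is a minimal illustration, not the production design: the `Candidate` type, the `build_pool` function, and the two toy checks (`long_enough`, `is_safe`) are hypothetical stand-ins for the learned quality operators and safety mechanisms mentioned in the talk.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    text: str
    source: str  # "active", "generative", "knowledge_graph", or "manual_hot"


# A quality operator returns True when the candidate passes its check.
QualityOp = Callable[[Candidate], bool]


def build_pool(candidates: Iterable[Candidate], ops: List[QualityOp]) -> List[Candidate]:
    """Keep the first copy of each normalized query that passes every operator."""
    seen = set()
    pool = []
    for cand in candidates:
        key = " ".join(cand.text.lower().split())  # normalize case and whitespace
        if key in seen:
            continue
        if all(op(cand) for op in ops):
            seen.add(key)
            pool.append(cand)
    return pool


# Toy rules (assumptions): real operators would be ML models plus human review.
BLOCKLIST = {"spam"}


def long_enough(c: Candidate) -> bool:
    return len(c.text.split()) >= 2


def is_safe(c: Candidate) -> bool:
    return not (set(c.text.lower().split()) & BLOCKLIST)


raw = [
    Candidate("NBA finals schedule", "active"),
    Candidate("nba  finals schedule", "generative"),  # duplicate after normalization
    Candidate("weather", "active"),                   # fails the toy length rule
    Candidate("free spam offer", "generative"),       # fails the toy safety rule
    Candidate("Tencent founder", "knowledge_graph"),
]
pool = build_pool(raw, [long_enough, is_safe])
print([c.text for c in pool])
```

Treating each check as an independent predicate keeps the sketch close to the "operators" framing: new quality or safety checks can be appended to the list without touching the merge logic.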

Result‑Page Satisfaction Metrics : Four dimensions are used to evaluate a query’s result page – relevance, richness (multiple media types), timeliness, and content quality.
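
One simple way to combine the four dimensions into a single number is a weighted sum. The sketch below is only an illustration: the talk does not say how the dimensions are aggregated, so the `PageScores` type, the `satisfaction` function, the [0, 1] score range, and the weights are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PageScores:
    """Per-dimension scores for one query's result page, each assumed in [0, 1]."""
    relevance: float
    richness: float
    timeliness: float
    quality: float


# Hypothetical weights; they sum to 1 so the result also stays in [0, 1].
WEIGHTS = {"relevance": 0.4, "richness": 0.2, "timeliness": 0.2, "quality": 0.2}


def satisfaction(scores: PageScores) -> float:
    """Weighted sum of the four result-page dimensions."""
    return sum(weight * getattr(scores, name) for name, weight in WEIGHTS.items())


page = PageScores(relevance=1.0, richness=0.5, timeliness=0.5, quality=1.0)
print(round(satisfaction(page), 3))
```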

Algorithm Stack : The recommendation pipeline follows a classic stack – query library → index → recall → coarse ranking → fine ranking → mixed ranking → business integration. Specific challenges and solutions are discussed for the coarse‑ranking stage (data sparsity, teacher‑student learning, embedding table compression) and the fine‑ranking stage (multi‑task learning, ESMM for CTR‑CVR dependency, MMoE for correlated objectives, lambda‑loss for position‑aware pairwise ranking).
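
The funnel shape of this stack (a cheap scorer over many candidates, then a costlier scorer over a shortlist) can be sketched as follows. This is a toy under stated assumptions: the inverted `INDEX`, the `overlap` and `jaccard` scorers, and the cutoffs are hypothetical placeholders for the real recall channels and learned ranking models, and the mixed-ranking and business-integration stages are omitted.

```python
from typing import Callable, Dict, List

Scorer = Callable[[str, str], float]  # (context, candidate query) -> score


def recall(context: str, index: Dict[str, List[str]]) -> List[str]:
    """Collect candidate queries that share a token with the context."""
    found: List[str] = []
    for token in context.split():
        for query in index.get(token, []):
            if query not in found:
                found.append(query)
    return found


def top_k(context: str, candidates: List[str], scorer: Scorer, k: int) -> List[str]:
    """Keep the k best candidates; ties keep their original order."""
    return sorted(candidates, key=lambda q: scorer(context, q), reverse=True)[:k]


def overlap(context: str, query: str) -> float:
    """Cheap coarse-ranking stand-in: number of shared tokens."""
    return float(len(set(context.split()) & set(query.split())))


def jaccard(context: str, query: str) -> float:
    """Fine-ranking stand-in: shared tokens over all distinct tokens."""
    a, b = set(context.split()), set(query.split())
    return len(a & b) / len(a | b)


def recommend(context: str, index: Dict[str, List[str]],
              k_coarse: int = 3, k_fine: int = 2) -> List[str]:
    candidates = recall(context, index)
    shortlist = top_k(context, candidates, overlap, k_coarse)
    return top_k(context, shortlist, jaccard, k_fine)


INDEX = {
    "nba": ["nba finals schedule", "nba trade rumors", "nba"],
    "finals": ["nba finals schedule", "finals mvp odds"],
}
result = recommend("nba finals", INDEX)
print(result)
```

The point of the split is cost: the coarse scorer runs on every recalled candidate, so it must be cheap, while the fine scorer only sees the shortlist and can afford a heavier model.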

Multi‑Task Learning Design : The fine‑ranking model jointly predicts click‑through rate, result‑page click‑through, consumption count, total consumption, and query‑article relevance, using architectures such as ESMM, MMoE, and PLE.
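
Two of the building blocks named here can be illustrated in a few lines. The sketch below shows the ESMM idea (model the click-then-convert probability as the product pCTR × pCVR and supervise both CTR and that product over all impressions) and the MMoE idea (each task mixes shared expert outputs with its own softmax gate). The mapping of "click" to a click on the recommended term and "conversion" to a result‑page click is an assumption for illustration, and the function names are hypothetical; no learned parameters are involved.

```python
import math
from typing import List


def bce(p: float, y: int) -> float:
    """Binary cross-entropy for one example, with clipping for stability."""
    eps = 1e-7
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def esmm_loss(p_ctr: float, p_cvr: float, click: int, convert: int) -> float:
    """ESMM-style loss for one impression.

    pCTCVR = pCTR * pCVR, so the CVR head is trained through the product
    on every impression instead of on clicked samples only.
    """
    p_ctcvr = p_ctr * p_cvr
    return bce(p_ctr, click) + bce(p_ctcvr, click * convert)


def softmax(xs: List[float]) -> List[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mmoe_mix(expert_outputs: List[List[float]], gate_logits: List[float]) -> List[float]:
    """MMoE-style mixing: one task's gate weights the shared expert outputs."""
    weights = softmax(gate_logits)
    dim = len(expert_outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, expert_outputs)) for i in range(dim)]


# One clicked-and-converted impression.
print(esmm_loss(p_ctr=0.5, p_cvr=0.5, click=1, convert=1))

# Two experts, one task gate with equal logits: the mix is the plain average.
print(mmoe_mix([[1.0, 0.0], [0.0, 1.0]], gate_logits=[0.0, 0.0]))
```

In a full model each task (for example term CTR or result‑page CTR) would have its own gate and tower on top of the mixed representation; PLE additionally separates task‑specific experts from shared ones.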

Future Outlook :

Incorporate multi‑task and session/sequence modeling (e.g., transformer‑based approaches) to capture diverse user behaviors across the QQ Browser ecosystem.

Improve user satisfaction by enhancing query understanding, real‑time signal integration, and continuous quality monitoring.

Q&A Highlights :

ID embeddings are learned end‑to‑end; experiments with pre‑training are ongoing.

Timeliness is addressed by dedicated teams that detect bursts, update signals, and feed them into ranking models.

Active search queries dominate the pool; generative queries are filtered for syntactic and semantic completeness.
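
As a rough illustration of the burst detection mentioned in the timeliness answer, a query's recent search volume can be compared with its own historical baseline. The talk gives no details of the actual detector, so the `burst_score` function, the hourly granularity, and the additive smoothing below are assumptions.

```python
from typing import Sequence


def burst_score(hourly_counts: Sequence[int], recent_hours: int = 1,
                smoothing: float = 1.0) -> float:
    """Ratio of recent average volume to the historical average.

    Smoothing is added to both sides so rare queries with tiny
    baselines do not produce extreme ratios.
    """
    if len(hourly_counts) <= recent_hours:
        raise ValueError("need history beyond the recent window")
    history = hourly_counts[:-recent_hours]
    recent = hourly_counts[-recent_hours:]
    baseline = sum(history) / len(history)
    current = sum(recent) / len(recent)
    return (current + smoothing) / (baseline + smoothing)


print(burst_score([9, 9, 9, 99]))     # sudden spike in the last hour
print(burst_score([10, 10, 10, 10]))  # steady traffic
```

A score like this could be fed to the ranking models as a real‑time feature, or thresholded to flag queries for the hot‑search pool.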
