
User Interest Segmentation and Clustering: Data Science Practices at iQIYI

The article presents iQIYI's data‑science‑driven approach to user interest segmentation, covering the design of weighted interest tags, their validation through blind surveys and AB‑tests, the creation of factual behavior tags, and advanced content‑based clustering methods for more precise audience targeting.

DataFunTalk

Introduction

Lu Qi, Business Intelligence Director at iQIYI and a data‑science expert, shares how the company explores and practices user interest segmentation. The talk covers generating interest tags from factual data, verifying and iterating on the algorithms, and clustering methods.

1. Data Science vs. User Interest Segmentation

Data scientists combine strong mathematical foundations, problem‑solving skills, and business communication to extract user interests from large‑scale data. Interest segmentation reflects users' varying affinity for topics. It applies across e‑commerce, services, and content platforms, where it improves recommendation, advertising, and operational strategies.

2. Generating User‑Interest Weight Tags

The process includes:
• Defining business‑driven topics (e.g., video genre, stars, channels).
• Mapping user behaviors (views, comments, follows) to these topics.
• Quantifying behaviors with metrics such as view count and duration.
• Applying weighting, decay, and normalization to produce a 0‑1 score that reflects preference intensity.
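The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not iQIYI's actual formula: the behavior weights, the 30‑day half‑life, the log damping of raw amounts, and the per‑user max normalization are all assumptions chosen only to make the weighting, decay, and normalization steps concrete.

```python
import math
from collections import defaultdict

# Hypothetical parameters, chosen for illustration only.
HALF_LIFE_DAYS = 30.0
BEHAVIOR_WEIGHTS = {"view": 1.0, "comment": 2.0, "follow": 3.0}


def interest_scores(events, today):
    """Turn one user's behavior events into 0-1 topic scores.

    events: iterable of (topic, behavior, day, amount) tuples, where
    day is an integer day index and amount is a count or a duration.
    """
    raw = defaultdict(float)
    for topic, behavior, day, amount in events:
        # Exponential time decay: an event loses half its weight
        # every HALF_LIFE_DAYS days.
        decay = 0.5 ** ((today - day) / HALF_LIFE_DAYS)
        # log1p damps very large counts or durations.
        raw[topic] += BEHAVIOR_WEIGHTS[behavior] * math.log1p(amount) * decay
    top = max(raw.values(), default=0.0)
    if top == 0.0:
        return {}
    # Max normalization: the user's strongest topic scores 1.0.
    return {topic: value / top for topic, value in raw.items()}


events = [
    ("drama", "view", 100, 60.0),    # watched today
    ("drama", "view", 40, 60.0),     # watched 60 days ago
    ("variety", "view", 100, 60.0),  # watched today
]
scores = interest_scores(events, today=100)
```

With these assumed settings the 60‑day‑old drama view counts a quarter as much as a fresh one, so "drama" ends up as the top topic and "variety" lands at 0.8 of its score.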

Tag validation uses two main methods:
• Blind‑survey questionnaires to calibrate weights against user feedback.
• Online AB‑tests in content filtering or ranking models to observe performance gains and adjust weights accordingly.
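For the AB‑test side of validation, one common way to judge whether an observed gain is more than noise is a two‑proportion z‑test on a rate metric such as click‑through. The talk does not say which statistic iQIYI uses, so the test choice, the function name, and the example counts below are assumptions.

```python
import math


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Compare the click rate of group B (new weights) with group A (control).

    Returns (z, two_sided_p_value) using the pooled standard error.
    """
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: control vs. a ranking model using adjusted tag weights.
z, p = two_proportion_z_test(clicks_a=100, n_a=1000, clicks_b=150, n_b=1000)
```

A positive z with a small p‑value suggests the adjusted weights helped; a result near zero suggests the change made no measurable difference at this sample size.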

Weight tags excel at modeling long‑term preferences but have drawbacks: limited interpretability, the added complexity of applying decay daily, and a poor fit for real‑time scenarios. These drawbacks prompt the use of factual tags for immediate interests.

3. Factual Interest Tags

Factual tags capture explicit (e.g., likes, follows) and implicit (e.g., view duration) behaviors, providing a granular view of user preferences.
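A factual tag can be as simple as a statement that a behavior happened. The sketch below assumes a hypothetical event format (dicts with "action", "item", and "seconds" keys) and an arbitrary 10‑minute threshold for counting a view as an implicit signal; neither comes from the talk.

```python
def factual_tags(events, min_watch_seconds=600):
    """Derive factual tags from one user's raw events.

    Explicit actions (like, follow) always produce a tag. A view produces
    a tag only when its duration reaches min_watch_seconds.
    """
    tags = set()
    for event in events:
        action = event["action"]
        if action in ("like", "follow"):
            tags.add(f"{action}:{event['item']}")
        elif action == "view" and event.get("seconds", 0) >= min_watch_seconds:
            tags.add(f"watched:{event['item']}")
    return tags


user_events = [
    {"action": "follow", "item": "star_x"},
    {"action": "view", "item": "drama_a", "seconds": 1500},
    {"action": "view", "item": "variety_b", "seconds": 45},
]
user_tags = factual_tags(user_events)
```

Unlike a weight tag, each result here reads as a plain fact ("followed star_x", "watched drama_a"), and it can be emitted as soon as the event arrives.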

4. Content‑Based Clustering for Interest Segmentation

Instead of demographic grouping, iQIYI clusters users based on content similarity. This approach offers advantages such as discovering hidden associations, supporting data‑driven content planning, and enabling precise audience targeting.

The clustering workflow includes:
• Content clustering to identify overlapping audiences.
• Hierarchical or similarity‑based clustering with corrections for content volume, channel size, and release timing.
• Pruning and granularity control to produce fine‑tuned interest circles.
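The workflow above can be illustrated with a small stand‑in. The sketch measures audience overlap between titles with cosine similarity on viewer sets, which corrects for audience size only (the talk's corrections for channel size and release timing are not modeled). It then groups titles by single‑linkage clustering at a fixed threshold, which is equivalent to taking connected components. The threshold plays the role of granularity control. All names and numbers are hypothetical.

```python
import math
from itertools import combinations


def cosine_overlap(a, b):
    """Shared viewers divided by the geometric mean of the audience sizes."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def interest_circles(audiences, threshold=0.5):
    """Group titles whose size-corrected audience overlap meets the threshold.

    audiences: dict mapping title -> set of user ids.
    Returns a sorted list of sorted title lists.
    """
    parent = {title: title for title in audiences}

    def find(title):
        # Union-find lookup with path halving.
        while parent[title] != title:
            parent[title] = parent[parent[title]]
            title = parent[title]
        return title

    for x, y in combinations(audiences, 2):
        if cosine_overlap(audiences[x], audiences[y]) >= threshold:
            parent[find(x)] = find(y)

    groups = {}
    for title in audiences:
        groups.setdefault(find(title), set()).add(title)
    return sorted(sorted(group) for group in groups.values())


audiences = {
    "drama_a": {1, 2, 3, 4},
    "drama_b": {2, 3, 4, 5},
    "hit_show": set(range(1, 101)),  # high-traffic title watched by everyone
    "anime_c": {200, 201},
}
circles = interest_circles(audiences, threshold=0.5)
```

In raw counts, "hit_show" shares 4 viewers with "drama_a", more than the 3 that "drama_a" shares with "drama_b". After the size correction its similarity drops to 0.2 against 0.75 for the two dramas, so the high‑traffic title no longer pulls unrelated titles into one circle. Raising the threshold yields finer circles; lowering it merges them.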

Q&A Highlights

• Business impact of interest tags is measured via AB‑tests and operational outcomes.
• User clusters feed directly into recommendation algorithms as signals.
• Weight tags indicate preference strength, while content clusters help identify core versus peripheral users.
• High‑traffic programs require similarity‑score corrections to avoid bias.

Overall, the talk demonstrates how systematic data‑science methods (weight tagging, factual tagging, and content clustering) enable robust user interest segmentation for improved recommendation, operation, and product strategy.

Tags: AB testing, user segmentation, Data Science, content recommendation, tag generation, interest clustering