Video Recommendation System: Framework, Topic Clustering, and Related Video Retrieval

The paper proposes a video recommendation framework built from recall and ranking modules. Its multi-modal topic clustering approach integrates audio, visual, and textual features via NeXtVLAD, PCA, and K-Means to generate unified video representations. The aim is to improve candidate selection and raise click-through rate and viewing time while addressing cold-start and semantic relevance challenges.

NetEase Media Technology Team

Apr 4, 2019

This paper presents a video recommendation system framework, focusing on topic clustering and related video retrieval. The system architecture consists of two core modules: recall and ranking. The recall module filters a pool of hundreds of millions of items down to a set of relevant candidates, while the ranking module sorts those candidates precisely using user profiles and contextual features.
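The two-stage split can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Video` fields, the topic-match recall rule, and the affinity-times-popularity ranking score are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Video:
    video_id: str
    topic: str          # hypothetical topic label
    popularity: float   # hypothetical prior score in [0, 1]


def recall(pool, user_topics, limit=100):
    """Cheap filter: keep videos whose topic the user has shown interest in."""
    candidates = [v for v in pool if v.topic in user_topics]
    return candidates[:limit]


def rank(candidates, user_topics):
    """Finer scoring: weight the user's topic affinity by video popularity."""
    def score(v):
        return user_topics[v.topic] * v.popularity
    return sorted(candidates, key=score, reverse=True)


pool = [
    Video("v1", "sports", 0.9),
    Video("v2", "cooking", 0.8),
    Video("v3", "sports", 0.4),
    Video("v4", "music", 0.7),
]
user_topics = {"sports": 0.5, "music": 0.9}   # hypothetical user profile

ranked = rank(recall(pool, user_topics), user_topics)
ranked_ids = [v.video_id for v in ranked]
print(ranked_ids)   # ['v4', 'v1', 'v3']
```

The recall step is deliberately cheap so that it can run over the whole pool, while the ranking step can afford a more expensive score because it only sees the surviving candidates.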

The paper discusses traditional recall methods including behavior recall (collaborative filtering), semantic recall (using metadata like titles, tags, and categories), and visual recall (based on video content similarity). Each method has limitations, particularly in handling cold-start problems where new users or content lack sufficient interaction data.
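Behavior recall and its cold-start weakness can be sketched with item-based collaborative filtering over co-watch counts. This is a generic textbook formulation rather than the team's implementation, and the `watch_history` data and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(watch_history):
    """Cosine similarity between videos from binary co-watch counts."""
    watch_count = defaultdict(int)   # video -> number of users who watched it
    co_count = defaultdict(int)      # (video_a, video_b) -> users who watched both
    for videos in watch_history.values():
        unique = sorted(set(videos))
        for v in unique:
            watch_count[v] += 1
        for a, b in combinations(unique, 2):
            co_count[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), n in co_count.items():
        s = n / sqrt(watch_count[a] * watch_count[b])
        sims[a][b] = s
        sims[b][a] = s
    return sims


def behavior_recall(video_id, sims, k=3):
    """Return up to k videos most often co-watched with video_id."""
    neighbors = sims.get(video_id, {})
    return sorted(neighbors, key=neighbors.get, reverse=True)[:k]


# Hypothetical interaction log: user -> watched videos.
watch_history = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["b", "c"],
    "u4": ["a", "b"],
}
sims = item_similarities(watch_history)

print(behavior_recall("a", sims))          # ['b', 'c']
print(behavior_recall("new_video", sims))  # []
```

The empty result for `new_video` is the cold-start problem in miniature: a video with no interactions has no co-watch neighbors, so a content-based signal is needed to recall it.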

The proposed solution introduces a multi-modal topic clustering approach that combines audio features, video frame features, and text information. The system uses the NeXtVLAD algorithm to aggregate frame-level features from the different modalities into a unified video-level representation. The model also incorporates video titles through word vectors trained on NetEase news data, which enhances semantic understanding.
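The aggregation idea can be sketched with a simplified NetVLAD-style pooling in NumPy. This is not the full NeXtVLAD, which additionally expands the features, splits them into groups, and applies attention. The random weights stand in for learned parameters, and the feature dimensions, the averaging of title word vectors, and the final concatenation are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # stabilise the exponent
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def vlad_aggregate(frames, centers, assign_w):
    """Aggregate (n_frames, d) features into one (k * d,) vector.

    centers:  (k, d) cluster centres (learned in a real model).
    assign_w: (d, k) soft-assignment weights (learned in a real model).
    """
    assign = softmax(frames @ assign_w, axis=1)            # (n, k)
    residuals = frames[:, None, :] - centers[None, :, :]   # (n, k, d)
    vlad = (assign[:, :, None] * residuals).sum(axis=0)    # (k, d)
    # Normalise each cluster's residual sum, then the flattened vector.
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12
    flat = vlad.ravel()
    return flat / (np.linalg.norm(flat) + 1e-12)


rng = np.random.default_rng(0)
k = 4                                      # hypothetical number of VLAD clusters
visual = rng.normal(size=(30, 16))         # 30 frames, 16-dim visual features
audio = rng.normal(size=(30, 8))           # 30 audio segments, 8-dim features
title_words = rng.normal(size=(5, 12))     # 5 title word vectors, 12-dim

visual_vec = vlad_aggregate(visual, rng.normal(size=(k, 16)), rng.normal(size=(16, k)))
audio_vec = vlad_aggregate(audio, rng.normal(size=(k, 8)), rng.normal(size=(8, k)))
title_vec = title_words.mean(axis=0)       # assumed: simple average of word vectors

video_vec = np.concatenate([visual_vec, audio_vec, title_vec])
print(video_vec.shape)   # (108,)
```

Whatever the number of frames, the output has a fixed length (here 4 × 16 + 4 × 8 + 12 = 108), which is what makes the video-level vectors comparable and clusterable.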

The clustering process applies PCA dimensionality reduction followed by K-Means clustering, yielding more than 3,600 topic categories that cover 96% of the videos in the recommendation pool. Related video retrieval then draws candidates from videos that share a topic cluster. A/B testing showed improvements in click-through rate and viewing duration.
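The PCA, K-Means, and same-cluster retrieval steps can be sketched on synthetic data. The embeddings, the 8 retained components, and `k=3` are toy assumptions (the paper reports 3,600+ clusters). Sorting the cluster members by distance to the query is also an assumption, since the summary only says that related videos come from the same cluster.

```python
import numpy as np


def pca_reduce(x, n_components):
    """Project rows of x onto the top principal components via SVD."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centre; keep the old one if a cluster went empty.
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def related_videos(query_idx, reduced, labels, top_n=5):
    """Other videos in the query's cluster, nearest first (assumed ordering)."""
    same = np.flatnonzero(labels == labels[query_idx])
    same = same[same != query_idx]
    dists = np.linalg.norm(reduced[same] - reduced[query_idx], axis=1)
    return same[np.argsort(dists)][:top_n].tolist()


# Synthetic stand-in for video embeddings: three blobs of 40 videos each.
rng = np.random.default_rng(1)
blob_means = rng.normal(scale=10.0, size=(3, 32))
embeddings = np.vstack([m + rng.normal(size=(40, 32)) for m in blob_means])

reduced = pca_reduce(embeddings, n_components=8)
labels, _ = kmeans(reduced, k=3)
related = related_videos(0, reduced, labels)
print(related)
```

Restricting retrieval to one cluster keeps the lookup cheap, because the distance computation touches only the videos in the query's topic rather than the whole pool.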

The paper identifies two current challenges: videos that look similar without being semantically related, and topics that are too broad. Future work includes incorporating more semantic information and implementing hierarchical clustering for finer topic granularity.



