NetEase Media Technology Team
Apr 11, 2022 · Artificial Intelligence
Multimodal Video Tagging: Challenges and a Two‑Stage Recall‑Ranking Solution
Short-video platforms face a massive multimodal tagging problem: a huge long-tail tag set, sparse annotations, and uneven contributions from each modality. The authors propose a two-stage recall-ranking system that first retrieves candidate tags via text, visual, audio, and classification cues, then refines them using contrastive learning and extensive hard-negative sampling. The system achieves 0.884 tag accuracy in a real-world news video recommender.
Deep Learning · Recommendation Systems · Embedding
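To make the two-stage idea in the summary concrete, here is a minimal Python sketch of the inference path only: candidate tags are pooled from several recall channels, then ranked against a video embedding. Everything in it is an assumption for illustration. The function names (`recall`, `rank`), the toy 2-D vectors, the `top_k` and `threshold` parameters, and the use of plain cosine similarity as the ranking score are hypothetical and are not taken from the authors' system. The embeddings are assumed to come from a model already trained with contrastive learning and hard negatives; that training step is not shown.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(channels):
    """Stage 1: pool candidate tags from every channel.

    Returns a dict mapping each candidate tag to the set of channels
    that proposed it.
    """
    candidates = {}
    for channel_name, tags in channels.items():
        for tag in tags:
            candidates.setdefault(tag, set()).add(channel_name)
    return candidates


def rank(video_vec, candidates, tag_vecs, top_k=3, threshold=0.5):
    """Stage 2: score each candidate against the video embedding.

    Keeps tags whose score reaches `threshold`, sorted by descending
    score, and returns at most `top_k` (tag, score) pairs.
    """
    scored = [
        (tag, cosine(video_vec, tag_vecs[tag]))
        for tag in candidates
        if tag in tag_vecs
    ]
    scored = [pair for pair in scored if pair[1] >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (hypothetical): four recall channels and 2-D embeddings.
channels = {
    "text": ["basketball", "NBA"],
    "visual": ["basketball", "stadium"],
    "audio": ["commentary"],
    "classification": ["sports"],
}
tag_vecs = {
    "basketball": [0.9, 0.1],
    "NBA": [0.8, 0.3],
    "stadium": [0.2, 0.9],
    "commentary": [0.0, 1.0],
    "sports": [0.7, 0.7],
}
video_vec = [1.0, 0.0]

candidates = recall(channels)
final_tags = rank(video_vec, candidates, tag_vecs)
print([tag for tag, _ in final_tags])
```

Splitting the work this way keeps the expensive scoring step confined to a small candidate set, which is the usual motivation for a recall-then-rank design over a very large long-tail tag vocabulary.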