Tag

speaker diarization

iQIYI Technical Product Team
Nov 7, 2024 · Artificial Intelligence

Multimodal Speaker Diarization for Long-Form Video Scripts

iQIYI’s multimodal speaker diarization system splits long-form video into segments using subtitle timestamps and scene detection, extracts voiceprints with a custom model, and clusters them hierarchically. It then combines an Active Speaker Detection algorithm with face recognition to assign speakers, achieving around 90% precision and recall and improving downstream tasks such as summarization, translation, and dubbing.
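
The hierarchical clustering step in that summary can be illustrated with a minimal sketch. This is not iQIYI's implementation: the average-linkage rule, the cosine distance, the `threshold` value, and the function name `cluster_voiceprints` are all assumptions chosen for illustration, and the two-dimensional vectors stand in for real voiceprint embeddings, which are assumed to be non-zero.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def cluster_voiceprints(embeddings, threshold=0.3):
    """Average-linkage agglomerative clustering (illustrative only).

    Starts with one cluster per segment and repeatedly merges the two
    closest clusters until the closest pair is farther apart than
    `threshold` (a hypothetical value). Returns one label per segment.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        total = sum(
            cosine_distance(embeddings[i], embeddings[j])
            for i in c1
            for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break  # remaining clusters are treated as distinct speakers
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy embeddings for four segments: two similar pairs.
segments = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(cluster_voiceprints(segments))  # [0, 0, 1, 1]
```

In a full pipeline the resulting cluster labels would only be anonymous speaker IDs; mapping them to named characters is the part the summary attributes to Active Speaker Detection and face recognition.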

dialogue recognition · iQIYI · multimodal AI
8 min read
58 Tech
Aug 7, 2020 · Artificial Intelligence

Technical Overview of 58.com Intelligent Voice Analysis Platform

The article gives a technical overview of 58.com’s intelligent voice analysis platform, covering its business background, system architecture, speech and NLP technologies, speaker diarization methods, model performance, data labeling workflow, and practical applications in call-center quality inspection and user profiling.

AI Platform · Natural Language Processing · Speech Recognition
11 min read