Tencent Music Multimedia R&D Center Announces Acceptance of Papers on Large-Scale Singer Recognition and Audio Embeddings at IJCNN and ICASSP 2021
Tencent Music’s Multimedia R&D Center made its first appearances at IJCNN and ICASSP 2021 with two accepted papers: one on large‑scale singer recognition via deep metric learning, the other on user‑driven audio embeddings for content‑based music recommendation. This post also summarizes the team’s work across music‑recognition technologies and its future research directions.
This year’s conference paper acceptance results have been announced. Tencent Music Multimedia R&D Center’s paper “Large‑scale singer recognition using deep metric learning: an experimental study” was accepted by the International Joint Conference on Neural Networks (IJCNN) 2021, and the paper “Learning Audio Embeddings with User Listening Data for Content‑based Music Recommendation” was accepted by ICASSP 2021. This is the first time Tencent Music Entertainment (TME) has had papers at either IJCNN or ICASSP, and the acceptances reflect recognition of the work by international experts in music recognition.
Music Recognition Types
Song Identification (Shazam‑like): Based on audio fingerprinting (the Landmark algorithm) and hash‑based retrieval for large‑scale, real‑time matching.
Hum Recognition: Uses MIDI extraction and Dynamic Time Warping (DTW) to match user‑hummed melodies against reference melodies; deep‑learning‑based solutions are also being explored.
Cover Song Identification: Employs deep neural networks to extract melody features that are robust to variations in singer, pitch, and instrumentation; the technology has been patented, and a paper is being prepared for submission.
Singer Voice Timbre Recognition: Uses an end‑to‑end deep neural network to model singer timbre; the resulting paper has been accepted at IJCNN.
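To make the DTW step in hum recognition concrete, here is a minimal sketch of classic Dynamic Time Warping over pitch sequences. It is an illustration only, not the team's implementation: the function names (`dtw_distance`, `remove_key_offset`, `best_match`), the use of MIDI note numbers as input, the absolute-difference cost, and the mean-subtraction step for key invariance are all assumptions made for this example.

```python
from math import inf


def dtw_distance(query, reference):
    """Accumulated DTW cost between two pitch sequences (illustrative sketch)."""
    n, m = len(query), len(reference)
    # cost[i][j] = minimal accumulated cost of aligning query[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - reference[j - 1])
            # A step may advance the query, the reference, or both,
            # which is what lets DTW absorb tempo differences.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def remove_key_offset(notes):
    """Subtract the mean pitch so a melody hummed in another key still matches.

    This normalization is an assumption of the sketch, not a detail from the post.
    """
    mean = sum(notes) / len(notes)
    return [x - mean for x in notes]


def best_match(hummed, catalog):
    """Return the catalog title whose melody has the lowest DTW cost to the hum."""
    q = remove_key_offset(hummed)
    return min(
        catalog,
        key=lambda title: dtw_distance(q, remove_key_offset(catalog[title])),
    )


# Hypothetical toy catalog of melodies as MIDI note numbers.
catalog = {
    "song_a": [60, 62, 64],
    "song_b": [60, 55, 60, 55],
}
# The hum is song_a sung two semitones higher and more slowly.
hummed = [62, 62, 64, 66, 66]
result = best_match(hummed, catalog)
```

The full table costs O(n·m) per comparison, so a production system searching a large catalog would typically need pruning or an indexing stage first; the post does not say how the team handles this.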
The team also developed a personalized recommendation algorithm that combines audio content features with user information (User Audio Embedding, UAE), which was accepted at ICASSP.
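As a rough illustration of how audio embeddings can drive content‑based recommendation, the sketch below ranks candidate tracks by cosine similarity to a user profile vector. It does not reproduce the UAE method from the paper, which this post does not describe in detail: averaging listened‑track embeddings into a profile, the function names, and the toy two‑dimensional vectors are all assumptions for this example.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def user_profile(listened_embeddings):
    """Average the embeddings of tracks a user listened to.

    Assumption of this sketch: a plain mean stands in for whatever
    user representation the real system learns.
    """
    count = len(listened_embeddings)
    dim = len(listened_embeddings[0])
    return [sum(e[k] for e in listened_embeddings) / count for k in range(dim)]


def recommend(profile, candidates, top_k=2):
    """Return the top_k candidate titles most similar to the profile."""
    ranked = sorted(
        candidates,
        key=lambda title: cosine_similarity(profile, candidates[title]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical embeddings; real audio embeddings would be far higher-dimensional.
profile = user_profile([[1.0, 0.0], [0.8, 0.2]])
candidates = {
    "track_a": [1.0, 0.0],
    "track_b": [0.0, 1.0],
    "track_c": [0.7, 0.7],
}
top_tracks = recommend(profile, candidates)
```

Because the ranking depends only on audio‑derived vectors, a newly released track with no listening history can still be scored, which is the usual motivation for content‑based approaches.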
Paper Publications
“Large‑scale singer recognition using deep metric learning: an experimental study”, IJCNN 2021
“Learning Audio Embeddings with User Listening Data for Content‑based Music Recommendation”, ICASSP 2021
“Phase‑aware music super‑resolution using generative adversarial networks”, INTERSPEECH 2020
Future work will continue to improve traditional audio‑recognition scenarios such as song identification and hum recognition, while also exploring new scenarios like cover song identification and timbre recognition, and integrating audio recognition with lyric recognition and music information retrieval.
Tencent Music Tech Team
Public account of Tencent Music's development team, focusing on technology sharing and communication.