
Tencent Music Multimedia R&D Center Announces Acceptance of Papers on Large-Scale Singer Recognition and Audio Embeddings at IJCNN and ICASSP 2021

Tencent Music’s Multimedia R&D Center has had two papers accepted in its first appearances at IJCNN and ICASSP 2021: one on large-scale singer recognition via deep metric learning, the other on user-driven audio embeddings for content-based music recommendation. The announcement also outlines the team’s range of music-recognition technologies and its future research directions.

Tencent Music Tech Team

Paper acceptance results for this year’s international conferences have been announced. The Tencent Music Multimedia R&D Center’s paper “Large-scale singer recognition using deep metric learning: an experimental study” was accepted by the International Joint Conference on Neural Networks (IJCNN), and its paper “Learning Audio Embeddings with User Listening Data for Content-based Music Recommendation” was accepted by ICASSP 2021. This marks the first participation of Tencent Music Entertainment (TME) in both IJCNN and ICASSP, and reflects recognition of the team’s music-recognition work by international reviewers.

Music Recognition Types

Song Identification (Shazam-like): Based on audio fingerprinting (the Landmark algorithm) and hash-based retrieval for large-scale, real-time matching.

Hum Recognition: Uses MIDI extraction and Dynamic Time Warping (DTW) to match user-hummed melodies; deep-learning-based solutions are also being explored.

Cover Song Identification: Employs deep neural networks to extract melody features that are robust to variations in singer, pitch, and instrument; the technology has been patented, and a paper is being prepared for submission.

Singer Voice Timbre Recognition: Models singer timbre with an end-to-end deep neural network; this work is the subject of the paper accepted at IJCNN.
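
The hum recognition entry above names Dynamic Time Warping as its matching step. As a rough illustration only, the following sketch shows textbook DTW applied to sequences of MIDI note numbers. It is not the team's implementation: the function names, the toy catalog, and the use of raw absolute pitch differences as the local cost are all assumptions made for this example.

```python
from math import inf


def dtw_distance(query, reference):
    """Textbook dynamic time warping distance between two numeric sequences.

    Runs in O(len(query) * len(reference)) time. The local cost is the
    absolute pitch difference (an assumption made for this sketch).
    """
    n, m = len(query), len(reference)
    # cost[i][j] = minimal accumulated cost of aligning query[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - reference[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # query note repeats against the same reference note
                cost[i][j - 1],      # reference note repeats against the same query note
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def best_match(hummed, catalog):
    """Return the catalog key whose melody has the smallest DTW distance."""
    return min(catalog, key=lambda name: dtw_distance(hummed, catalog[name]))


# Hypothetical toy catalog: melodies as MIDI note numbers.
catalog = {
    "song_a": [60, 62, 64, 65, 67],
    "song_b": [67, 65, 64, 62, 60],
}
# A slower rendition of song_a: some notes are held for two frames.
hummed = [60, 60, 62, 64, 64, 65, 67]

print(best_match(hummed, catalog))  # song_a
```

Because DTW lets one note align with several frames, the slower hummed version still matches song_a at zero cost. A practical system would also need to handle key transposition and pitch-extraction errors, which this sketch leaves out.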

The team also developed a personalized recommendation algorithm, User Audio Embedding (UAE), that combines audio content features with user information; the corresponding paper was accepted at ICASSP.
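
To make the idea of content-based recommendation with embeddings more concrete, here is a minimal sketch that ranks candidate tracks by cosine similarity to a user profile. It does not reproduce the UAE method from the paper. The averaging of listened-track embeddings, the two-dimensional vectors, the track IDs, and every function name are hypothetical choices for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def user_profile(listened):
    """Assumed profile: the element-wise mean of the listened tracks' audio embeddings."""
    dim = len(listened[0])
    return [sum(vec[k] for vec in listened) / len(listened) for k in range(dim)]


def recommend(listened, candidates, top_k=2):
    """Return the top_k candidate track IDs most similar to the user's profile."""
    profile = user_profile(listened)
    ranked = sorted(
        candidates,
        key=lambda track_id: cosine(profile, candidates[track_id]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical 2-dimensional audio embeddings; real embeddings would be far larger
# and produced by a trained audio model.
listened = [[1.0, 0.0], [0.8, 0.2]]
candidates = {
    "t1": [1.0, 0.1],
    "t2": [0.0, 1.0],
    "t3": [0.5, 0.5],
}

print(recommend(listened, candidates))  # ['t1', 't3']
```

Since the ranking depends only on audio embeddings, a newly released track with no listening history can still be scored, which is the usual motivation for content-based approaches.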

Paper Publications

“Large‑scale singer recognition using deep metric learning: an experimental study”, IJCNN 2021

“Learning Audio Embeddings with User Listening Data for Content‑based Music Recommendation”, ICASSP 2021

“Phase‑aware music super‑resolution using generative adversarial networks”, INTERSPEECH 2020

Future work will continue to improve established audio-recognition scenarios such as song identification and hum recognition, explore newer scenarios such as cover song identification and timbre recognition, and integrate audio recognition with lyric recognition and music information retrieval.
