KeSpeech: A Large-Scale Chinese Mandarin Dialect Speech Benchmark Presented at NeurIPS 2021
KeSpeech, a benchmark jointly released by Beike AI and Tsinghua University at NeurIPS 2021, provides a large-scale Chinese Mandarin dialect dataset covering nearly 30,000 speakers from 34 cities. It supports speech recognition, speaker verification, dialect identification, and voice conversion, and includes multi-scenario recordings and parallel corpora for advanced research.
At the recent NeurIPS 2021 conference, Beike AI partnered with Tsinghua University to launch KeSpeech, a large-scale benchmark for Chinese Mandarin dialect speech. The project aims to preserve dialect culture and advance speech technology research.
The benchmark comprises nearly 30,000 speakers from 34 cities and covers eight major Mandarin sub-dialects (Beijing, Jianghuai, Jiaoliao, Jilu, Lanyin, Northeastern, Southwestern, Central Plains). It supports multiple tasks, including automatic speech recognition, speaker verification, dialect identification, and voice conversion.
Key characteristics include multi-label annotations (text, speaker information, dialect), diverse recording scenarios and channels, and parallel corpora in which the same speakers record both standard Mandarin and their dialect. The speaker pool is large enough for industrial-grade speaker verification, and each speaker was recorded in two sessions at least two weeks apart, which makes it possible to study how a voice varies over time.
KeSpeech defines standard training, development, and test splits for three core tasks—speech recognition, speaker verification, and dialect identification—providing a common reference for evaluating new methods. The benchmark also includes extended experiments on dialect‑style voice conversion.
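To make the role of a shared test split concrete, here is a minimal sketch of how a speech recognition system might be scored on it. It assumes character error rate (CER), a metric commonly used for Mandarin ASR; the article does not state which metric KeSpeech uses. The function names and the sample sentences are illustrative and are not part of the KeSpeech release.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (here: strings of characters)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(pairs):
    """Corpus-level CER: total character edits divided by total reference characters.

    `pairs` is an iterable of (reference, hypothesis) transcript strings.
    """
    pairs = list(pairs)
    total_chars = sum(len(ref) for ref, _ in pairs)
    if total_chars == 0:
        raise ValueError("reference transcripts are empty")
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    return total_edits / total_chars


if __name__ == "__main__":
    # Hypothetical transcripts, not taken from KeSpeech.
    sample = [("今天天气很好", "今天天气好"), ("我们去北京", "我们去北京")]
    print(f"CER = {character_error_rate(sample):.4f}")
```

Because every system is scored against the same reference transcripts, numbers computed this way are directly comparable across methods, which is the point of fixing the splits.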
Demo resources are available at https://wen2cheng.github.io, and the associated paper can be accessed via https://openreview.net/pdf?id=b3Zoeq2sCLq.
In summary, KeSpeech’s extensive coverage of regions, dialects, demographics, and recording conditions offers significant research value for robust speech recognition, speaker verification, multi‑task learning, and sociolinguistic studies, and is expected to stimulate further advances in AI‑driven speech technologies.