Tagged articles

6 articles

May 18, 2026 · Artificial Intelligence

How Volcano Engine and CAS Acoustic Institute Won Top Spots at the First Low‑Resource Audio Codec Challenge

Volcano Engine's audio team, together with the Chinese Academy of Sciences Acoustic Institute, secured first‑place, runner‑up, and third‑place finishes in the 2025 Low‑Resource Audio Codec Challenge at ICASSP 2026 by delivering AI‑driven codecs that balance ultra‑low bitrate, low complexity, and high audio quality for real‑time communication and streaming scenarios.

AI codec · ICASSP · Volcano Engine

12 min read

Amap Tech

Apr 21, 2025 · Artificial Intelligence

Lenna: Language‑Enhanced Reasoning Detection Assistant and a Chain‑of‑Thought Image Editing Framework Using Multimodal Large Language Models

At ICASSP 2025, Gaode’s two accepted papers present Lenna, a language‑enhanced reasoning detection assistant that adds a DET token to multimodal LLMs and achieves state‑of‑the‑art accuracy on RefCOCO benchmarks, and a chain‑of‑thought image‑editing framework that converts complex prompts into segmented masks and repair prompts for diffusion‑based inpainting, surpassing existing methods.

AI · ICASSP · chain-of-thought

10 min read

Kuaishou Audio & Video Technology

Jan 27, 2022 · Artificial Intelligence

How the L3DAS22 Challenge Advances Deep Learning for 3D Audio Signal Processing

The inaugural L3DAS22 competition, co‑hosted by Kuaishou Audio and Sapienza University, gathered nearly 50 academic and industry teams to benchmark deep‑learning‑based 3D audio signal processing, featuring tasks on multi‑channel speech enhancement and source detection, with results presented at ICASSP 2022.

3D audio · ICASSP · audio challenge

5 min read

Tencent Music Tech Team

Apr 26, 2021 · Artificial Intelligence

Tencent Music Multimedia R&D Center Announces Acceptance of Papers on Large-Scale Singer Recognition and Audio Embeddings at IJCNN and ICASSP 2021

Tencent Music’s Multimedia R&D Center had two papers accepted in its first appearances at IJCNN and ICASSP 2021: one presents large‑scale singer recognition via deep metric learning, and the other describes user‑driven audio embeddings for content‑based music recommendation. The article also covers the team’s work across music‑recognition technologies and its future research directions.

Audio Embedding · ICASSP · IJCNN

8 min read

iQIYI Technical Product Team

Nov 20, 2020 · Artificial Intelligence

iQIYI M2VoC Multi‑Speaker Multi‑Style Voice Cloning Challenge (ICASSP 2021) Overview

The iQIYI M2VoC Challenge at ICASSP 2021 invites researchers to tackle low‑resource multi‑speaker, multi‑style voice cloning by providing Mandarin datasets, few‑shot and extremely few‑shot tracks with strict data rules, MOS‑based subjective evaluation, and a $9,600 prize pool for top submissions.

AI · ICASSP · challenge

10 min read

Alibaba Cloud Developer

Jun 20, 2019 · Artificial Intelligence

Unlock Cutting-Edge Voice AI: Highlights from Alibaba’s Speech & Signal Processing eBook

This article introduces Alibaba's e‑book collecting five ICASSP‑accepted papers on speech recognition, synthesis, and emotion detection. It covers models such as DFSMN and A‑LSTM, along with speaker‑adaptation techniques, that improve speed and accuracy while reducing model size.

AI voice · Emotion Recognition · ICASSP

6 min read
