Tag

ICASSP


Amap Tech
Apr 21, 2025 · Artificial Intelligence

Lenna: Language‑Enhanced Reasoning Detection Assistant and a Chain‑of‑Thought Image Editing Framework Using Multimodal Large Language Models

Amap (Gaode) had two papers accepted at ICASSP 2025. The first presents Lenna, a language‑enhanced reasoning detection assistant that adds a DET token to multimodal LLMs and achieves state‑of‑the‑art accuracy on RefCOCO benchmarks. The second presents a chain‑of‑thought image‑editing framework that converts complex prompts into segmentation masks and inpainting prompts for diffusion‑based inpainting, surpassing existing methods.

AI · Chain-of-Thought · ICASSP
10 min read
Kuaishou Audio & Video Technology
Jan 27, 2022 · Artificial Intelligence

How the L3DAS22 Challenge Advances Deep Learning for 3D Audio Signal Processing

The inaugural L3DAS22 competition, co‑hosted by Kuaishou Audio and Sapienza University, gathered nearly 50 academic and industry teams to benchmark deep‑learning‑based 3D audio signal processing, featuring tasks on multi‑channel speech enhancement and source detection, with results presented at ICASSP 2022.

3D Audio · Deep Learning · ICASSP
5 min read
Tencent Music Tech Team
Apr 26, 2021 · Artificial Intelligence

Tencent Music Multimedia R&D Center Announces Acceptance of Papers on Large-Scale Singer Recognition and Audio Embeddings at IJCNN and ICASSP 2021

Tencent Music’s Multimedia R&D Center made its first appearances at IJCNN and ICASSP 2021 with two accepted papers: one on large‑scale singer recognition via deep metric learning, the other on user‑driven audio embeddings for content‑based music recommendation. The post also highlights the team’s expanding expertise across music‑recognition technologies and its future research directions.

Audio Embedding · Deep Learning · ICASSP
8 min read
iQIYI Technical Product Team
Nov 20, 2020 · Artificial Intelligence

iQIYI M2VoC Multi‑Speaker Multi‑Style Voice Cloning Challenge (ICASSP 2021) Overview

The iQIYI M2VoC Challenge at ICASSP 2021 invites researchers to tackle low‑resource multi‑speaker, multi‑style voice cloning. It provides Mandarin datasets and defines few‑shot and extremely few‑shot tracks with strict data rules, MOS‑based subjective evaluation, and a $9,600 prize pool for top submissions.

AI · Challenge · ICASSP
10 min read