Tagged articles
6 articles
NetEase Smart Enterprise Tech+
Aug 4, 2023 · Fundamentals

How to Objectively Quantify Acoustic Echo Cancellation Performance

This article introduces a data‑driven, objective evaluation method for Acoustic Echo Cancellation (AEC), detailing test environments, hardware setups, core metrics, single‑talk and double‑talk scenarios, scoring models, and result analysis to help developers assess and improve AEC algorithms across devices.

Acoustic Echo Cancellation · RTC · audio signal processing
0 likes · 9 min read
Volcano Engine Developer Services
Oct 12, 2021 · Artificial Intelligence

How ByteDance’s AI‑Powered Audio Signal Processing Elevates Voice, VR, and VoIP

This article reviews ByteDance’s intelligent audio signal processing technologies, covering foundational algorithms, multimodal audio scaling, sound‑field reconstruction, and high‑quality low‑latency VoIP, and explains how these advances improve audio capture, immersive media, and smart voice interaction across devices.

AR/VR audio · VoIP · audio signal processing
0 likes · 13 min read
Kuaishou Tech
Aug 31, 2021 · Artificial Intelligence

Machine Heart Column: Kuaishou MMU's New Dialect Identification Method

Researchers from Kuaishou MMU and Tsinghua University introduce a dynamic multi-scale convolution network for dialect identification that achieves significant performance improvements over state-of-the-art systems.

Dialect Identification · audio signal processing · dynamic convolution
0 likes · 10 min read
NetEase Smart Enterprise Tech+
Aug 5, 2021 · Artificial Intelligence

How Real-Time Voice Changing Works: From Tremolo to Gender‑Swap Algorithms

This article explains the demand for fun voice‑changing effects in live streaming and voice chat and introduces common audio effects such as tremolo, flanging, and distortion. It then details several real‑time pitch‑preserving algorithms (OLA, WSOLA, PSOLA, and the phase vocoder) that NetEase Cloud Communication uses to deliver high‑quality, privacy‑preserving voice transformations.

Real-time communication · audio signal processing · gender conversion
0 likes · 10 min read
Didi Tech
Nov 3, 2020 · Artificial Intelligence

Advances in Single‑Channel Speech Separation and Target Speaker Extraction with Iterative Refined Adaptation

The article surveys recent advances in single‑channel speech separation and target‑speaker extraction, explains the encoder‑separator‑decoder framework, and compares frequency‑domain and time‑domain methods. It highlights models such as SpEx+ and DPRNN‑Spe, and introduces Iterative Refined Adaptation, which iteratively refines speaker embeddings to improve SI‑SDR and enables effective speaker suppression for applications such as in‑vehicle voice interaction.

AI · audio signal processing · deep learning
0 likes · 13 min read