Tagged articles
4 articles
ByteDance SE Lab
May 18, 2026 · Artificial Intelligence

How Volcano Engine and CAS Acoustic Institute Won Top Spots at the First Low‑Resource Audio Codec Challenge

Volcano Engine's audio team, together with the Chinese Academy of Sciences Acoustic Institute, secured first‑place, runner‑up, and third‑place finishes in the 2025 Low‑Resource Audio Codec Challenge at ICASSP 2026 by delivering AI‑driven codecs that balance ultra‑low bitrate, low complexity, and high audio quality for real‑time communication and streaming scenarios.

AI codec · ICASSP · Volcano Engine
12 min read
Weekly Large Model Application
May 1, 2026 · Artificial Intelligence

How Speech Models Turn Waveforms into Computable Tokens

The article explains why speech tokenization is essential for large audio models, outlines three core challenges, and compares five major tokenization paradigms: neural codecs with vector quantization, self‑supervised learning with clustering, continuous embeddings, ASR‑derived text tokens, and hierarchical multi‑codebook tokens. It closes with practical guidance for selecting an approach based on task requirements and trade‑offs.

audio codec · hierarchical tokens · self-supervised learning
11 min read
58 Tech
May 28, 2019 · Artificial Intelligence

Implementation of Voice Call Functionality in an Intelligent Voice Robot

This article details the architecture and implementation of the voice call module of an intelligent voice robot, covering SIP signaling establishment, RTP session handling, audio encoding/decoding, sampling, and packetization to enable automated outbound calls and multi‑round voice interactions.

AI · SIP · Telephony
9 min read