Tagged articles
3 articles
Weekly Large Model Application
Mar 23, 2026 · Artificial Intelligence

Inside Step‑Audio2: End‑to‑End Multimodal Audio LLM Architecture and Deployment

This article dissects Step‑Audio2, an industrial‑grade multimodal large language model that unifies speech understanding, translation, dialogue and audio generation in a single causal LM, detailing its inference pipeline, key implementation tricks, deployment modes, strengths, limitations, and suitable application scenarios.

Python · Step-Audio2 · Token2Wav
10 min read
Xiaomi Tech
Mar 18, 2026 · Artificial Intelligence

Xiaomi’s MiMo‑V2‑Omni: A Full‑Modal Agent Base that Sees, Listens, and Acts

Xiaomi unveiled MiMo‑V2‑Omni, a full‑modal agent base that unifies text, image, video and audio perception with tool‑calling and GUI actions, outperforming leading models such as Gemini 3 Pro and Claude Opus 4.6 on benchmarks, and offering a 256K‑context API for diverse real‑world tasks.

API · Agent AI · MiMo-V2-Omni
8 min read
Xiaomi Tech
Jan 21, 2026 · Artificial Intelligence

Xiaomi’s AI Breakthroughs Earn Spot at ICASSP 2026

Xiaomi announced that a set of its AI research papers was accepted to ICASSP 2026. The papers include a large‑scale audio‑text dataset, a federated learning framework for domain and class generalization, a dual‑encoder music evaluation model, a cross‑domain audio‑text pre‑training system, a one‑step video‑to‑audio synthesis method, a training‑free frame‑selection technique for long‑video understanding, and a unified multimodal retrieval architecture. The article summarizes their methodologies, benchmark results, and potential impact across audio, vision, and multimodal AI applications.

AI · ICASSP 2026 · Multimodal
14 min read