Tagged articles
7 articles
SuanNi
May 28, 2026 · Artificial Intelligence

How a 3.8B Model Beats 6B+ Models Using Just 20% of the Compute – Inside Microsoft Lens

Microsoft’s Lens team shows that a 3.8B‑parameter image‑generation model can match or surpass 6B‑plus models while consuming only about 19% of the GPU compute, thanks to aggressive model compression, dense captioning, mixed‑resolution training, optimized VAE and language encoders, and targeted RL fine‑tuning.

Benchmarking · dense captioning · image generation
0 likes · 14 min read
Machine Heart
Apr 29, 2026 · Artificial Intelligence

Beyond VLA and World Models: Galaxy General Unveils LDA‑1B to Scale Embodied Data

LDA‑1B unifies world modeling and VLA in a latent dynamics action model, ingesting over 30,000 hours of heterogeneous embodied data via a five‑layer AstraData pipeline, employing a unified end‑effector space and quality‑based data allocation, and achieving state‑of‑the‑art success rates on RoboCasa‑GR1 while being fully open‑sourced.

Embodied AI · Scaling Law · data ingestion
0 likes · 13 min read
SuanNi
Feb 28, 2026 · Artificial Intelligence

How SkyReels V4 Achieves Synchronized Audio‑Video Generation at Film Quality

The article provides an in‑depth technical analysis of SkyReels V4, a multimodal diffusion model that generates ultra‑high‑definition, long‑duration videos with perfectly synchronized sound, detailing its dual‑stream architecture, channel‑concatenation strategy, efficient refinement pipeline, training methodology, and benchmark performance.

AI video generation · Benchmark · audio‑video synchronization
0 likes · 13 min read
AI Frontier Lectures
Jan 30, 2026 · Artificial Intelligence

Inside MOVA: Open-Source End-to-End Audio-Video Generation

OpenMOSS and MOSI unveiled MOVA, China’s first high‑performance open‑source audio‑video generation model, detailing its dual‑tower architecture, bridge module, aligned RoPE, multi‑stage data pipeline, training strategies, dual CFG guidance, and benchmark results that surpass leading closed‑source systems.

MOVA · audio-video generation · model architecture
0 likes · 20 min read
AI Algorithm Path
Aug 16, 2025 · Artificial Intelligence

Qwen-Image: The Best Open‑Source AI Image Generation Model Unveiled

Qwen-Image, an open‑source multimodal diffusion model, introduces a three‑component architecture, dual‑stream encoding, and a novel MSRoPE positional scheme to achieve superior text‑aligned image generation, with extensive benchmark results, detailed data engineering, progressive training strategies, and publicly released weights for easy access.

AI image generation · Benchmark · MSRoPE
0 likes · 9 min read
AI Algorithm Path
Jun 3, 2025 · Artificial Intelligence

Inside Tencent’s HunyuanVideo-Avatar: How Open‑Source AI Generates Digital Human Videos

Tencent’s HunyuanVideo-Avatar converts a static portrait and an audio clip into a lip‑synced, expressive video using a multimodal diffusion Transformer, offering open‑source weights, detailed module designs, hardware requirements, code examples, and a candid assessment of its strengths and current limitations.

AI video generation · CUDA · HunyuanVideo-Avatar
0 likes · 8 min read
Baidu Intelligent Cloud Tech Hub
Jul 31, 2023 · Artificial Intelligence

Boosting Large Model Inference: High‑Performance Optimization Techniques

This article explains the background, challenges, and high‑performance optimization methods for deploying large language and multimodal models, covering inference workflow analysis, distributed concurrency, latency reduction, quantization strategies, and service throughput improvements to achieve industry‑leading speed and memory efficiency.

distributed inference · multimodal diffusion · quantization
0 likes · 12 min read