Tagged articles
3 articles
SuanNi
May 31, 2026 · Artificial Intelligence

How NVIDIA’s Gamma‑World Turns Single‑Agent Models into Multiplayer Experiences

Gamma‑World introduces a multi‑agent world model that solves identity, interaction, and real‑time inference challenges with parameter‑free geometric encoding, sparse hub attention, and teacher‑student distillation, enabling zero‑shot generalization from two to four agents and achieving 24 FPS interactive video generation.

Gamma-World · Simplex Rotary Agent Encoding · Sparse Hub Attention
11 min read
AI Frontier Lectures
Dec 15, 2025 · Artificial Intelligence

How UnityVideo Unifies Multimodal Training to Boost Video Generation

UnityVideo, a new vision framework from HKUST, CUHK, Tsinghua and Kuaishou, unifies training across depth, flow, pose, segmentation and RGB modalities, achieving faster convergence, higher video quality, zero‑shot generalization and stronger physical reasoning compared with existing single‑modality video generators.

AI research · UnityVideo · Vision Models
15 min read
AI Frontier Lectures
Jul 18, 2025 · Artificial Intelligence

How Anchored Attributes Boost Prompt Learning for Vision‑Language Models

The paper introduces ATPrompt, a method that inserts fixed attribute tokens into learnable prompts for CLIP‑style vision‑language models, enabling the soft prompts to capture generic attribute representations and significantly improve base‑to‑novel generalization without extra regularization losses.

ATPrompt · Vision-Language Models · attribute anchoring
20 min read