How NVIDIA’s Gamma‑World Turns Single‑Agent Models into Multiplayer Experiences

Gamma‑World is a multi‑agent world model that tackles three challenges: agent identity, agent interaction, and real‑time inference. It combines parameter‑free geometric encoding, sparse hub attention, and teacher‑student distillation, which enables zero‑shot generalization from two to four agents and interactive video generation at 24 FPS.

Gamma-World · Simplex Rotary Agent Encoding · Sparse Hub Attention

11 min read

AI Frontier Lectures

Dec 15, 2025 · Artificial Intelligence

How UnityVideo Unifies Multimodal Training to Boost Video Generation

UnityVideo, a new vision framework from HKUST, CUHK, Tsinghua and Kuaishou, unifies training across depth, flow, pose, segmentation and RGB modalities, achieving faster convergence, higher video quality, zero‑shot generalization and stronger physical reasoning compared with existing single‑modality video generators.

AI research · UnityVideo · Vision Models

15 min read

AI Frontier Lectures

Jul 18, 2025 · Artificial Intelligence

How Anchored Attributes Boost Prompt Learning for Vision‑Language Models

The paper introduces ATPrompt, a method that inserts fixed attribute tokens into learnable prompts for CLIP‑style vision‑language models, enabling the soft prompts to capture generic attribute representations and significantly improve base‑to‑novel generalization without extra regularization losses.

ATPrompt · Vision-Language Models · attribute anchoring

20 min read
