Tagged articles
Old Zhang's AI Learning
May 30, 2026 · Artificial Intelligence

vLLM Introduces Native RL API for Seamless Weight Synchronization

vLLM’s new native RL API introduces a four-stage weight-transfer protocol, pluggable backends, and a pause/resume mechanism whose keep mode eliminates deadlocks in DP/EP (data-parallel plus expert-parallel) deployments. Large-scale validations on SkyRL and Prime-RL demonstrate its reliability and performance gains.

CUDA IPC · NCCL · RL API
14 min read
Machine Learning Algorithms & Natural Language Processing
Feb 26, 2026 · Artificial Intelligence

How MiniMax’s Forge Architecture Achieves 40× Faster Agent RL Training

The article details MiniMax’s Forge system, an asynchronous, native Agent-RL architecture that standardizes Agent-LLM interaction. It covers the system's engineering optimizations, novel scheduling, prefix-tree merging, and reward designs, which together enable million-sample daily throughput, stable reward growth, and up to 40-fold training acceleration for the MiniMax M2.5 model.

Agent Architecture · Mixed Scheduling · Scalable Systems
17 min read