May 28, 2026 · Artificial Intelligence

Large-Model RL Advances: Credit Allocation, Complex Reasoning, Agent Learning

HyperAI curates six recent reinforcement‑learning papers for large models: ECHO’s free world‑model learning, DelTA’s discriminative token credit, GoLongRL’s capability‑oriented long‑context RL, Anti‑SD’s reverse distillation, RubricEM’s rubric‑guided policy decomposition, and Poly‑EPO’s diversity‑driven exploration. The roundup summarizes each paper’s method, benchmarks, and performance gains.

Agent Learning · Complex Reasoning · Credit Assignment

0 likes · 10 min read

Machine Learning Algorithms & Natural Language Processing

Apr 29, 2026 · Artificial Intelligence

Dual Engine for Training and Inference: How Princeton’s SD‑ZERO and AggAgent Redefine Complex Reasoning

The article reviews two recent Princeton papers—SD‑ZERO, which introduces self‑revision training and on‑policy self‑distillation to turn a model’s own error traces into dense supervision, and AggAgent, which actively aggregates parallel long‑horizon trajectories—showing how internal trajectory mining can cut compute costs and boost accuracy on challenging math and code benchmarks.

AggAgent · Complex Reasoning · Large Language Models

0 likes · 10 min read
