Tagged articles
2 articles
Machine Heart
Apr 27, 2026 · Artificial Intelligence

ACL 2026: Unveiling a Predictive Scaling Law for Reinforcement Learning Fine‑Tuning of Large Models

The paper presents a systematic empirical study that derives a power‑law scaling formula for reinforcement learning post‑training of large language models. It reports accurate performance prediction both across models and within a single model, saturation of learning efficiency, benefits from data reuse, and validity across architectures.

Data Reuse · Llama 3 · Qwen2.5
0 likes · 11 min read