
Key Takeaways from Ant Group and Tsinghua’s Presentations on the AReaL Reinforcement Learning Framework and AWorld Multi‑Agent Framework at ICLR 2025

At ICLR 2025 in Singapore, Ant Group and Tsinghua University showcased the open‑source reinforcement‑learning platform AReaL and the multi‑agent system AWorld, highlighting their recent breakthroughs, system design challenges, performance results on the GAIA benchmark, and upcoming development plans.

AntTech
On the first day of ICLR 2025 in Singapore, Ant Group and researchers from Tsinghua University presented their open‑source AI frameworks: AReaL for reinforcement learning and AWorld for multi‑agent systems.

AReaL (Ant Reasoning RL) released version 0.2, named AReaL‑boba, which builds on Qwen‑R1‑Distill‑7B and reached state‑of‑the‑art performance after only two days of training; a lightweight Qwen‑32B‑Distill model reproduced near‑Qwen‑32B results using only 200 data points.

Highlights for AReaL include the importance of RL in model development, the complexity of RL systems compared with supervised learning, and the need for scalable, flexible infrastructure to support diverse algorithms and dynamic data generation.

The team introduced REAL‑HF, a system that dynamically adjusts GPU allocation and parallelism strategies for RL from human feedback (e.g., PPO training), significantly accelerating training.
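To make the idea of dynamic GPU allocation concrete, here is a minimal sketch, not REAL‑HF's actual scheduler: a PPO-style pipeline has several stages (generation, reward scoring, training), and a fixed GPU budget can be re-split in proportion to each stage's measured wall-clock time so the slowest stage receives the most resources. All names here (`allocate_gpus`, the stage labels) are hypothetical.

```python
# Illustrative sketch (not REAL-HF's real API): split a fixed GPU budget
# across RLHF pipeline stages in proportion to each stage's measured
# wall-clock time, so bottleneck stages get more GPUs on the next round.
def allocate_gpus(stage_times: dict[str, float], total_gpus: int) -> dict[str, int]:
    total_time = sum(stage_times.values())
    # Every stage starts with one GPU; the remainder is shared by load.
    alloc = {stage: 1 for stage in stage_times}
    remaining = total_gpus - len(stage_times)
    by_load = sorted(stage_times, key=stage_times.get, reverse=True)
    for stage in by_load:
        extra = min(round(remaining * stage_times[stage] / total_time), remaining)
        alloc[stage] += extra
        remaining -= extra
    # Any rounding leftovers go to the slowest stage.
    alloc[by_load[0]] += remaining
    return alloc

# Example: generation dominates, so it receives most of the 8 GPUs.
plan = allocate_gpus({"gen": 6.0, "reward": 1.0, "train": 3.0}, total_gpus=8)
```

A real system would also account for parallelism strategy (tensor/pipeline/data) per stage, but the proportional-reallocation idea is the core of the speedup claim.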

For RL training of reasoning models, AReaL addressed the challenge of extremely long outputs (16k–32k tokens) by offloading reward computation to CPU clusters, employing a dedicated long‑text generation engine, and intelligently batching variable‑length outputs to improve efficiency.
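The batching point is worth unpacking: when sequences in a batch range from a few hundred to 32k tokens, padding every sequence to the batch maximum wastes most of the compute. A minimal sketch of length-aware batching, under the assumption (not AReaL's actual code) that batches are formed greedily from length-sorted sequences under a padded-token budget:

```python
# Illustrative sketch (not AReaL's real batching code): sort rollouts by
# length, then greedily fill batches so that the padded cost
# (max length in batch * batch size) never exceeds a token budget.
def batch_by_length(lengths: list[int], token_budget: int) -> list[list[int]]:
    batches: list[list[int]] = []
    current: list[int] = []
    for idx in sorted(range(len(lengths)), key=lambda i: lengths[i]):
        candidate = current + [idx]
        max_len = lengths[candidate[-1]]  # ascending order: last is longest
        if max_len * len(candidate) > token_budget and current:
            batches.append(current)   # close the batch before it overflows
            current = [idx]
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches

# Short outputs pack together; 16k-32k outputs end up in small batches.
groups = batch_by_length([100, 16000, 200, 32000], token_budget=40_000)
```

Grouping similar lengths together means short outputs share large batches while very long outputs run nearly alone, which is exactly the efficiency gain the paragraph describes.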

AReaL is the first fully open‑source RL training framework in China, offering reproducible code, data, and scripts, with plans to release an even higher‑performance version within a month and a low‑cost training option to encourage community participation.

AWorld, the multi‑agent framework released earlier this year, achieved a GAIA benchmark score of 69.7, ranking third overall and first among fully open‑source projects, demonstrating strong performance on Level 1 & 2 tests.

Key observations for AWorld include the rapid growth of AI‑assistant demand, the importance of collaborative multi‑agent systems over single super‑agents, and the vision of AI as a practical problem‑solving tool.

AWorld follows a “simplicity” design principle, defining agents as model‑driven tools with three integration modes (direct tools, MCP server, agent tools) and two communication prototypes (Handoff and Swarm) to enable complex coordination.
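To illustrate the "Handoff" prototype, here is a minimal sketch with entirely hypothetical names (`Agent`, `run_handoff`), not AWorld's real API: each agent processes the task with its own model and tools, then either returns a final answer or hands its output off to a named peer.

```python
# Illustrative sketch of a handoff-style prototype (hypothetical names,
# not AWorld's actual API): agents pass work along a chain until one of
# them declines to hand off, making its output the final answer.
from typing import Callable, Optional

class Agent:
    def __init__(self, name: str, act: Callable[[str], tuple[str, Optional[str]]]):
        self.name = name
        self.act = act  # act(task) -> (output, name of next agent or None)

def run_handoff(agents: dict[str, Agent], start: str, task: str,
                max_steps: int = 10) -> str:
    current, payload = start, task
    for _ in range(max_steps):
        output, nxt = agents[current].act(payload)
        if nxt is None:           # no handoff: this output is final
            return output
        current, payload = nxt, output  # hand the output to the next agent
    raise RuntimeError("handoff chain exceeded max_steps")

# Usage: a planner decomposes the task, then hands off to a solver.
planner = Agent("planner", lambda t: (f"plan: {t}", "solver"))
solver = Agent("solver", lambda t: (f"solved [{t}]", None))
result = run_handoff({"planner": planner, "solver": solver},
                     start="planner", task="book a trip")
```

A Swarm-style prototype would generalize this from a linear chain to many agents coordinating concurrently, which is where the framework's tool-integration modes (direct tools, MCP server, agent tools) come into play.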

The GAIA benchmark results highlight the advantages of dynamic planning over static workflows and the efficiency gains from MCP server‑based tool usage.

Future work for AWorld focuses on deep integration with the AReaL RL framework, upcoming multi‑agent test suites in May, enhanced training features in June, and active collaboration with the open‑source community; the team is also recruiting talent.

Interested readers are invited to explore the open‑source repositories and consider applying to join the projects.

Tags: open-source AI frameworks, multi-agent systems, reinforcement learning, ICLR 2025
Written by AntTech
Technology is the core driver of Ant's future creation.