AutoResearch: 630‑Line AI Agent That Self‑Evolves in 72 Hours and Earns 12.7k Stars

AutoResearch is a 630‑line Python project that lets an AI agent run machine‑learning experiments autonomously on a single GPU. Each run gets a fixed five‑minute budget and is judged on a single metric, val_bpb; the agent edits the code automatically and makes git‑based decisions about each change. The project is a minimal yet complete training framework that also showcases the novel MuonAdamW optimizer.

AI Agent · AutoResearch · LLM research

17 min read

Data Thinking Notes

Jun 24, 2025 · Artificial Intelligence

Anthropic’s Multi‑Agent Research System: Architecture, Lessons & 90% Performance Boost

Anthropic’s detailed post explains how its new Research feature uses a multi‑agent architecture with a lead coordinator and parallel sub‑agents, covering design principles, prompt engineering tricks, evaluation methods, production reliability challenges, and the substantial performance gains achieved over single‑agent baselines.

AI Architecture · LLM research · Prompt Engineering

21 min read

Alimama Tech

Dec 25, 2024 · Artificial Intelligence

WiS Platform: Evaluating LLM Multi-Agent Systems via Game-Based Analysis

The WiS Platform provides a game‑based environment for benchmarking large language models in multi‑agent settings, using dynamic scenarios to measure reasoning, deception, and collaboration. It offers fair experimental design, real‑time competition, visualizations, detailed metrics, and open‑source tools. In the reported results, GPT‑4o outperforms other models such as Qwen2.5‑72B‑Instruct.

AI evaluation · Defense Strategies · Game-Based Testing

8 min read

NewBeeNLP

Aug 3, 2024 · Artificial Intelligence

Extending LLM Context to 1M Tokens: SAMBA, CoPE, RoPE, Retrieval Heads & Infini‑Attention

This article reviews recent research on extending large language model context windows to millions of tokens, covering SAMBA's hybrid architecture, Contextual Position Encoding (CoPE), RoPE base length theory, Retrieval Head analysis, and the memory‑efficient Infini‑Attention mechanism.

Efficient Attention · LLM research · Large Language Models

10 min read
