Author

PaperAgent

Daily updates, analyzing cutting-edge AI research papers

Latest from PaperAgent

100 recent articles max

PaperAgent

May 9, 2026 · Artificial Intelligence

How ActDistill Slashes Deployment Costs of VLA Large Models

ActDistill, proposed by Tongji University and collaborators, cuts the inference latency and compute cost of Vision‑Language‑Action (VLA) models and speeds up the action loop by selectively distilling action‑relevant knowledge, achieving up to a 1.67× speedup while preserving control quality on real robot hardware.

ActDistill · Dynamic Routing · Efficiency

0 likes · 13 min read

PaperAgent

May 8, 2026 · Artificial Intelligence

Jeff Dean’s Decoupled DiLoCo Shatters the Million‑Chip LLM Pre‑training Bottleneck

The article explains how Google’s Decoupled DiLoCo architecture breaks the scalability wall of million‑chip LLM pre‑training by partitioning the cluster into independent learners, using an asynchronous syncer, and achieving up to 88% effective compute while preserving model quality.

AI · Google · LLM

0 likes · 7 min read

PaperAgent

May 7, 2026 · Artificial Intelligence

190 Must-Read AI Agent Papers + 321 Google Implementation Cases – Free Resource Pack

The article provides a free compiled resource containing 190 essential AI Agent papers—from fundamentals to cutting‑edge topics—along with 321 Google‑released implementation cases and 500 open‑source agent applications, all with source code to help beginners and researchers quickly understand the field and reproduce results.

AI agent · LLM · Memory

0 likes · 6 min read

PaperAgent

May 6, 2026 · Artificial Intelligence

How to Detect Introspective Awareness in LLMs – Boosting Detection Rates by 53% and 75%

Anthropic and MIT researchers reveal that large language models can sense injected steering vectors, a capability that emerges during post‑training (especially DPO), and they present a two‑stage detection circuit whose performance improves by up to 75% when refusal directions are ablated or bias vectors are trained.

Circuit Analysis · DPO · Introspective Awareness

0 likes · 15 min read

PaperAgent

May 4, 2026 · Artificial Intelligence

A Comprehensive Survey of Self-Evolving Agents: From Model-Centric to Environment-Driven Co-Evolution

This survey systematically reviews self‑evolving agents, explains why autonomous agents are needed, proposes a unified taxonomy of three evolution paradigms, analyzes model‑centric, environment‑centric, and co‑evolution approaches, and outlines future challenges in designing adaptive environments.

AI Agent Taxonomy · Co-Evolution · Environment-Centric Evolution

0 likes · 14 min read

PaperAgent

May 4, 2026 · Artificial Intelligence

Why Claude 4.6 Scores Only 66%: Claw‑Eval‑Live Shows Terminal Skills Aren’t Enough

The article explains that modern AI agents must be judged on actual task execution and audit evidence, and Claw‑Eval‑Live reveals that while agents can use terminals, they still fail dramatically on cross‑system workflows such as HR, management, and operations, with no model surpassing a 70% pass rate.

AI Agents · Claw-Eval · LLM

0 likes · 7 min read

PaperAgent

May 3, 2026 · Artificial Intelligence

Skill Graphs Reveal Why Training Diversity Beats Quantity for Terminal Agents

The paper shows that, instead of increasing the number of training tasks, controlling the diversity of scene‑skill combinations via a large‑scale Skill Graph dramatically improves terminal‑agent performance, with Qwen3‑32B surpassing a 480B model on the Terminal‑Bench 2.0 benchmark.

LLM · Qwen3 · Skill Graphs

0 likes · 9 min read

PaperAgent

May 2, 2026 · Artificial Intelligence

Can Harnesses Self‑Evolve? Fudan & Peking University’s Agentic Harness Engineering Breakthrough

The paper introduces Agentic Harness Engineering (AHE), showing that a 10‑round evolution improves Coding Agent pass@1 from 69.7% to 77.0% on Terminal‑Bench 2—outperforming Codex‑CLI—and that the evolved harness transfers zero‑shot to SWE‑bench and multiple model families, thanks to three observability pillars.

Ablation Study · Coding Agent · Harness Engineering

0 likes · 11 min read

PaperAgent

Apr 30, 2026 · Artificial Intelligence

DeepSeek Unveils Open‑Source Multimodal Model: “Thinking with Visual Primitives”

DeepSeek releases an open‑source multimodal LLM that introduces a visual‑primitive framework—elevating bounding boxes and points to token level—to close the reference gap, achieve extreme KV‑cache compression, and outperform GPT‑5.4, Claude‑Sonnet‑4.6 and Gemini‑3‑Flash on counting, spatial reasoning, maze navigation and path‑tracing benchmarks.

DeepSeek · LLM · Multimodal

0 likes · 13 min read

PaperAgent

Apr 30, 2026 · Artificial Intelligence

Why Reinforcement Learning Is the Future: 2026 Top‑Conference RL Paper Collection

The article highlights the rapid rise of reinforcement learning across major 2026 conferences, curates 181 RL papers from eight top venues, and provides detailed summaries of innovative works such as MSRL and MedVR, offering free access to the papers and code.

Large Models · Reward Modeling · agentic RL

0 likes · 6 min read
