PaperAgent
Author

Daily updates, analyzing cutting-edge AI research papers

202 articles · 1 like · 170 views · 0 comments
Recent Articles

Latest from PaperAgent

PaperAgent
May 19, 2026 · Artificial Intelligence

Why Long-Term Memory Needs Vision: How MemEye Evaluates Multimodal Agent Recall

MemEye is a multimodal memory benchmark that tests agents across eight real‑world scenarios, measuring visual evidence granularity and reasoning depth. Its results show that captions fall short for fine‑grained visual recall, which points to the need for true visual memory in long‑term AI agents.

AI Agents · MemEye · benchmark
0 likes · 4 min read
PaperAgent
May 18, 2026 · Artificial Intelligence

How MemWeaver Combines Behavioral and Cognitive Memory to Rebuild LLM Personalization

MemWeaver introduces a hierarchical memory that fuses behavior‑level and cognition‑level user signals, enabling large language models to generate more personalized content across multiple tasks. Extensive experiments and ablations show that it outperforms strong baselines, and it includes an efficient incremental update mechanism.

LLM personalization · LaMP benchmark · behavioral memory
0 likes · 12 min read
PaperAgent
May 17, 2026 · Artificial Intelligence

Turning LLMs into CT Scans: How Alibaba’s Safe‑SAIL Makes AI Decision Black Boxes Transparent

The paper introduces Safe‑SAIL, a Sparse Autoencoder Interpretation Framework for LLMs that provides pre‑explanation metrics, a segment‑level simulation to cut evaluation cost, and a 1,758‑feature safety database, enabling transparent analysis and interactive debugging of large language model safety decisions.

Interpretability · LLM · Safety
0 likes · 12 min read
PaperAgent
May 16, 2026 · Artificial Intelligence

A First Systematic Survey of Agent Skills: Taxonomy, Techniques, and Applications

This survey analyzes the emerging field of Agent Skills, defining a formal skill model, categorizing acquisition pathways, detailing retrieval strategies, and outlining a five‑stage evolution process, while highlighting large‑scale skill repositories and their implications for AI product design.

AI Agents · Agent Skills · Skill Evolution
0 likes · 9 min read
PaperAgent
May 15, 2026 · Artificial Intelligence

How a 0.6B Model Beats GPT‑5.2 at Agent Privacy – Introducing MemPrivacy

The article analyzes the long‑standing privacy dilemma of cloud‑based agents, presents MemPrivacy’s three‑stage de‑identification framework and four‑level privacy taxonomy, details its two‑phase training with the MemPrivacy‑Bench dataset, and shows benchmark results where a 0.6B model outperforms GPT‑5.2 while keeping latency under 0.5 seconds.

Agent · MemPrivacy · benchmark
0 likes · 11 min read
PaperAgent
May 14, 2026 · Artificial Intelligence

New Paradigm for LLM Alignment: Insights from Two Recent Anthropic Papers

Anthropic's two May papers reveal that simple SFT/RLHF is insufficient for safe LLMs; inserting a model‑spec mid‑training stage and synthetic‑document fine‑tuning dramatically reduces agentic misalignment, improves data efficiency, and enables models to reason about values before acting.

Agentic Misalignment · Anthropic · LLM alignment
0 likes · 13 min read
PaperAgent
May 13, 2026 · Artificial Intelligence

One-for-All Multi-Agent Collaboration: Adaptive Cross-Task Topology Design

The paper introduces OFA-MAS, a one‑for‑all multi‑agent system that learns a universal topology designer using task‑aware graph encoding and a Mixture‑of‑Experts generator, achieving superior performance, OOD generalization, robustness, and efficiency across six major benchmarks.

LLM · Mixture of Experts · Task-Aware Graph Encoder
0 likes · 14 min read
PaperAgent
May 11, 2026 · Artificial Intelligence

SkillOS: How Skill Governance Powers Self‑Evolving AI Agents

SkillOS addresses the one‑off nature of current LLM agents by introducing a closed‑loop system where a trainable Skill Curator continuously extracts, updates, and manages reusable skills from execution traces, leading to measurable gains in success rates, efficiency, and cross‑task generalization.

Grouped Task Streams · LLM agents · Meta-Strategy Skills
0 likes · 10 min read
PaperAgent
May 9, 2026 · Artificial Intelligence

How Anthropic’s Natural Language Autoencoders Open the LLM Black Box

Anthropic’s Natural Language Autoencoders (NLA) translate high‑dimensional LLM activation vectors into readable text, using an Activation Verbalizer and Reconstruction module trained via RL to maximize Fraction of Variance Explained, and reveal internal planning, language bias, tool‑call hallucinations, and hidden reasoning across multiple Claude models.

Activation Verbalizer · Anthropic · Claude
0 likes · 9 min read