Tagged articles

39 articles

Page 1 of 1

May 24, 2026 · Artificial Intelligence

Wrapping Up Harness Engineering: The Six Pillars Methodology Explained

This article reviews the six foundational pillars of Harness Engineering—context architecture, architectural constraints, self‑verification loop, context isolation, entropy governance, and detachability—showing how Claude Code implements them, why infrastructure, not model size, is the real bottleneck, and offering ten concrete actions for practitioners.

AI AgentsContext CompressionEntropy Management

0 likes · 17 min read

Wrapping Up Harness Engineering: The Six Pillars Methodology Explained

AI Engineer Programming

May 10, 2026 · Artificial Intelligence

Lossless Context Management (LCM): Handling Unlimited Agent Tasks with Finite Windows

The article analyzes the limitation of finite LLM context windows for unbounded agent tasks, reviews existing truncation, summarization, and RAG approaches, and presents the Lossless Context Management (LCM) architecture with immutable storage, hierarchical DAG compression, three‑level summarization, and zero‑overhead processing for both short and large‑scale workloads.

AI AgentsAgent MemoryAgentic-Map

0 likes · 9 min read

Lossless Context Management (LCM): Handling Unlimited Agent Tasks with Finite Windows

Architect's Ambition

May 8, 2026 · Artificial Intelligence

A 12,000‑Word Guide to Agent Harness: Designing and Implementing Production‑Ready AI Agents

The article presents a comprehensive 7‑layer Agent Harness architecture that transforms experimental LLM‑based agents into stable, cost‑effective, secure, and observable production‑grade autonomous workers, illustrated with real‑world case studies, performance metrics, and concrete implementation details.

AI AgentsAgent ArchitectureContext Compression

0 likes · 33 min read

A 12,000‑Word Guide to Agent Harness: Designing and Implementing Production‑Ready AI Agents

Machine Heart

May 7, 2026 · Artificial Intelligence

How TACO Lets CLI Agents Self‑Evolve to Drop Useless Context

TACO is a plug‑and‑play, training‑free framework that lets terminal‑based autonomous agents automatically learn compression rules to filter low‑value output while preserving critical decision cues, achieving higher task success rates and better token efficiency across multiple terminal‑related benchmarks.

Context CompressionLLMSelf‑Evolving Rules

0 likes · 14 min read

How TACO Lets CLI Agents Self‑Evolve to Drop Useless Context

AI Tech Publishing

May 1, 2026 · Artificial Intelligence

5 Counterintuitive Design Principles for Prompt Caching in Claude Code

The article details five counterintuitive design principles for Claude Code's prompt caching—optimizing prompt layout, using message‑based updates, never switching models or tools mid‑conversation, safely compressing context, and monitoring cache health—backed by concrete examples and up to 90% cost savings.

AI EngineeringCache OptimizationClaude Code

0 likes · 10 min read

5 Counterintuitive Design Principles for Prompt Caching in Claude Code

AI Tech Publishing

May 1, 2026 · Artificial Intelligence

Turning Harness into a Distributed Context Management System for Long‑Task Agents

The article explains why the reliability of long‑task agents now hinges on harness design rather than model strength, and details four harness innovations—programmatic tool calls, sub‑agents as isolation boundaries, context compression, and skill‑search priority—that Glean uses to build a distributed context management system.

Agent HarnessContext CompressionSub‑agents

0 likes · 11 min read

Turning Harness into a Distributed Context Management System for Long‑Task Agents

AI Step-by-Step

Apr 27, 2026 · Artificial Intelligence

Hermes Prompt Runtime: Managing Provider, Prompt, Memory, and Context

Hermes Prompt Runtime introduces a layered architecture that first resolves the model provider, then builds a stable system prompt, freezes memory snapshots for session boundaries, isolates per‑call temporary context, and compresses long histories, thereby keeping long‑term semantics stable, improving prompt caching, and reducing context‑window pressure.

Agent ArchitectureContext CompressionHermes

0 likes · 12 min read

Hermes Prompt Runtime: Managing Provider, Prompt, Memory, and Context

Alibaba Cloud Developer

Apr 24, 2026 · Artificial Intelligence

How Hermes Agent Achieves Self‑Evolution: A Deep Dive into Prompt, Context, and Harness Design

This article provides a detailed technical analysis of Hermes Agent, explaining how its dynamic skill generation and reinforcement‑learning loop enable true self‑evolution, and examines the prompt engineering, context compression, memory architecture, harness mechanisms, error handling, and plugin ecosystem that differentiate it from OpenClaw and Claude Code.

Agent FrameworkContext CompressionHermes Agent

0 likes · 41 min read

How Hermes Agent Achieves Self‑Evolution: A Deep Dive into Prompt, Context, and Harness Design

Shuge Unlimited

Apr 23, 2026 · Artificial Intelligence

Deep Dive into Hermes Agent: Self‑Improving AI Agent Architecture with 110K+ Stars

Hermes Agent, an open‑source self‑improving AI agent framework that has amassed over 110 K GitHub stars, introduces a native closed‑learning loop, a unified single‑process agent cycle, self‑registering tools, pluggable context compression, multi‑API model support, and a scalable multi‑platform gateway, all built on Python 3.11+, SQLite + WAL, and extensive modular design.

AI agentClosed Loop LearningContext Compression

0 likes · 24 min read

Deep Dive into Hermes Agent: Self‑Improving AI Agent Architecture with 110K+ Stars

ZhiKe AI

Apr 19, 2026 · Artificial Intelligence

Claude Code Agent Architecture: 4‑Layer Breakdown & 3 Key Designs from while(true) to Autonomous Decision‑Making

The article dissects Claude Code Agent's four‑layer architecture—interaction, orchestration, execution, and infrastructure—explaining how the perpetual while(true) queryLoop drives LLM streaming, how StreamingToolExecutor enables safe concurrent tool execution, and how the autoCompact four‑stage compression with a circuit‑breaker safeguards context length.

Claude Code AgentContext CompressionLLM tool integration

0 likes · 19 min read

Claude Code Agent Architecture: 4‑Layer Breakdown & 3 Key Designs from while(true) to Autonomous Decision‑Making

o-ai.tech

Apr 17, 2026 · Artificial Intelligence

How Hermes Agent Self‑Evolves: Memory, Skills, and Offline Training Pipelines

This article dissects Hermes Agent’s self‑evolution mechanism, explaining how stable facts are stored in memory, reusable procedures become skills, and rollout trajectories are turned into training data through background review, context compression, and OPD‑based token‑level distillation.

Agent ArchitectureContext CompressionHermes Agent

0 likes · 33 min read

How Hermes Agent Self‑Evolves: Memory, Skills, and Offline Training Pipelines

Tech Verticals & Horizontals

Apr 15, 2026 · Artificial Intelligence

How Hermes Enables AI to Remember, Learn, and Grow Autonomously

The article dissects Hermes’s autonomous learning loop, detailing how immutable facts are stored in long‑term memory, reusable methods become skills, session history is searchable, and a background review process periodically consolidates knowledge while a pre‑compression rescue safeguards key information.

AIContext CompressionHermes

0 likes · 15 min read

How Hermes Enables AI to Remember, Learn, and Grow Autonomously

Machine Heart

Apr 13, 2026 · Artificial Intelligence

What’s the Underlying Logic of Coding Agents and Why Do Claude Code Variants Outperform Others?

The article dissects coding agents by outlining their six core components, explaining how an agent harness orchestrates model inference, repository context, prompt caching, tool validation, context compression, structured memory, and bounded sub‑agents, and shows why these architectural choices give Claude Code a performance edge over plain LLMs.

Agent HarnessContext CompressionLLM

0 likes · 22 min read

What’s the Underlying Logic of Coding Agents and Why Do Claude Code Variants Outperform Others?

AI Tech Publishing

Apr 12, 2026 · Artificial Intelligence

How Hermes Agent’s Multi‑Layer Memory Beats OpenClaw’s Simple Markdown Store

The article dissects Hermes Agent’s four‑store memory architecture—declarative, procedural, situational, and persona—deterministic routing, frozen snapshots, nudge‑driven persistence, security scanning, dual‑peer modeling, skill management, and three‑phase context compression, showing why it outperforms OpenClaw’s breadth‑first design.

Context CompressionHermes AgentLLM agents

0 likes · 17 min read

How Hermes Agent’s Multi‑Layer Memory Beats OpenClaw’s Simple Markdown Store

macrozheng

Apr 10, 2026 · Artificial Intelligence

Inside Claude Code: How a 500k‑Line AI Programming Tool Leaked and What Its Architecture Reveals

The Claude Code source leak exposed over 500,000 lines of AI‑coding tool code, revealing its npm publishing mishap, the layered architecture built on React Ink, the ReAct‑style agent loop, sophisticated tool orchestration, multi‑tier memory management, context compression, security checks, feature flags, and even anti‑distillation defenses.

AI AgentsClaude CodeContext Compression

0 likes · 30 min read

Inside Claude Code: How a 500k‑Line AI Programming Tool Leaked and What Its Architecture Reveals

IT Services Circle

Apr 6, 2026 · Artificial Intelligence

Mastering RAG Interview Questions: A Complete Retrieval Optimization Blueprint

This article breaks down the full RAG retrieval pipeline—from query understanding and rewriting, through hybrid retrieval and reranking, to chunking, context compression, and dynamic routing—providing concrete techniques, formulas, and performance metrics to help candidates ace interview questions on RAG systems.

Context CompressionCross-EncoderHard Negative Mining

0 likes · 16 min read

Mastering RAG Interview Questions: A Complete Retrieval Optimization Blueprint

AI Tech Publishing

Apr 6, 2026 · Artificial Intelligence

Six Core Components of a Coding Agent Explained with Code

The article systematically breaks down the six essential building blocks of a programming agent—live repository context, prompt shape and cache reuse, structured tool access and validation, context reduction, structured session memory, and bounded sub‑agent delegation—illustrated with a Mini Coding Agent implementation and comparisons to Claude Code, Codex, and OpenClaw.

Coding AgentContext CompressionLLM

0 likes · 15 min read

Six Core Components of a Coding Agent Explained with Code

Shuge Unlimited

Apr 4, 2026 · Artificial Intelligence

Inside Claude Code: Three‑Tier Compression Enabling Unlimited‑Length AI Tasks

The article dissects Claude Code's three‑level progressive compression system—MicroCompact, SessionMemoryCompact, and Full Compact—showing how it edits cached prompts, maintains background memory files, and generates a structured nine‑section summary to keep AI agents operating over arbitrarily long conversations within a limited context window.

AI agentAuto CompressionClaude Code

0 likes · 17 min read

Inside Claude Code: Three‑Tier Compression Enabling Unlimited‑Length AI Tasks

AI Open-Source Efficiency Guide

Apr 1, 2026 · Artificial Intelligence

Build an AI Agent Harness from Scratch: Deep Dive into Claude Code Architecture

This article walks developers through the learn-claude-code project, teaching them how to construct a Claude‑style AI Agent Harness by covering twelve progressive lessons, core concepts such as agents, harnesses, sub‑agents, context compression, task management, and providing runnable Python examples and architectural diagrams.

AI agentAgent HarnessClaude Code

0 likes · 13 min read

Build an AI Agent Harness from Scratch: Deep Dive into Claude Code Architecture

ArcThink

Apr 1, 2026 · Artificial Intelligence

Inside Claude Code: 1,900‑File Source Dive Reveals Six‑Layer Architecture

After a source‑map leak exposed Claude Code’s 1,900 TypeScript files, this analysis dissects its six‑layer architecture, dynamic prompt assembly, four‑level caching, 60+ tool governance pipeline, six built‑in agents, five context‑compression strategies, and the real engineering trade‑offs hidden beneath the product.

AI EngineeringAgent SystemsContext Compression

0 likes · 31 min read

Inside Claude Code: 1,900‑File Source Dive Reveals Six‑Layer Architecture

Wu Shixiong's Large Model Academy

Mar 29, 2026 · Artificial Intelligence

Mastering RAG Prompt Engineering: Prevent Hallucinations and Boost Accuracy

This article dissects the unique challenges of RAG prompting, presents a systematic System/User Prompt design with strong constraints and citation requirements, compares constraint strengths with quantitative hallucination rates, and offers long‑context compression strategies and rigorous testing methods to ensure reliable LLM answers.

Context CompressionLLMRAG

0 likes · 19 min read

Mastering RAG Prompt Engineering: Prevent Hallucinations and Boost Accuracy

Fighter's World

Mar 28, 2026 · Artificial Intelligence

What Engineering Decisions Make AI Coding Agents Effective? Lessons from the OpenDev Paper

The article dissects OpenDev’s open‑source AI coding agent, comparing its scaffolding‑vs‑harness architecture, cognitive‑flow design, context‑compression strategies, tool‑reliability mechanisms and safety layers with Claude Code, Cursor, Codex and Augment, and shows that harness‑level engineering remains the biggest performance lever even for frontier models.

AI coding agentContext CompressionHarness Engineering

0 likes · 39 min read

What Engineering Decisions Make AI Coding Agents Effective? Lessons from the OpenDev Paper

Su San Talks Tech

Mar 26, 2026 · Artificial Intelligence

Unlocking AI Agents: How OpenClaw Turns Language Models into Actionable Bots

This article explains how OpenClaw functions as an AI Agent framework that connects chat applications to large language models, manages multi‑turn dialogues, executes tool commands, handles memory and security, and demonstrates advanced features such as sub‑agents, cron jobs, and context compression.

AI agentContext CompressionMemory Management

0 likes · 19 min read

Unlocking AI Agents: How OpenClaw Turns Language Models into Actionable Bots

SuanNi

Mar 24, 2026 · Artificial Intelligence

How Compression, Orchestration, and LangGraph Are Redefining LLM Context Engineering

This article analyzes the six pillars of context engineering for large language models, focusing on compression techniques, extractive vs. abstractive methods, the LLMLingua toolkit, dynamic orchestration with routing and agentic RAG, and how LangGraph enables sophisticated agent‑driven workflows.

Agentic RAGContext CompressionLLM

0 likes · 14 min read

How Compression, Orchestration, and LangGraph Are Redefining LLM Context Engineering

Shi's AI Notebook

Mar 15, 2026 · Artificial Intelligence

How OpenAI Turns Models into Agents by Adding a Computer Environment to the Responses API

The article explains how OpenAI extends the Responses API with a sandboxed computer environment—shell tools, container workspaces, network controls, context compression, and reusable skills—to let language models execute complex, stateful workflows safely and efficiently.

AI AgentsContext CompressionOpenAI

0 likes · 14 min read

How OpenAI Turns Models into Agents by Adding a Computer Environment to the Responses API

AI Explorer

Mar 14, 2026 · Artificial Intelligence

Build a Claude‑Code‑Level AI Agent in 12 Incremental Lessons

This open‑source tutorial walks developers through twelve progressive lessons, expanding a minimal 84‑line agent to a full‑featured 694‑line Claude‑Code‑style AI system that covers tool calls, sub‑agents, context compression, and multi‑agent collaboration.

AI agentAgent LoopClaude Code

0 likes · 9 min read

Build a Claude‑Code‑Level AI Agent in 12 Incremental Lessons

Machine Learning Algorithms & Natural Language Processing

Feb 24, 2026 · Artificial Intelligence

How COMI Achieves 25‑Point Performance Gains at 32× Compression Using Marginal Information Gain (ICLR 2026)

The COMI framework introduces a marginal information gain metric and a coarse‑to‑fine adaptive compression strategy that preserves relevance and diversity, enabling 32× text compression while boosting downstream QA performance by up to 25 points and doubling inference speed.

Context CompressionEfficient InferenceLong-Context Retrieval

0 likes · 7 min read

How COMI Achieves 25‑Point Performance Gains at 32× Compression Using Marginal Information Gain (ICLR 2026)

Machine Learning Algorithms & Natural Language Processing

Feb 23, 2026 · Artificial Intelligence

How COMI Achieves 32× Compression and Boosts Performance by 25 Points

The COMI framework introduces a marginal information gain metric and a coarse‑to‑fine two‑stage compression strategy that preserves relevance and diversity, enabling 32× context reduction while improving Exact Match on NaturalQuestions by nearly 25 points and more than doubling inference speed.

Context CompressionLong-Context RetrievalMarginal Information Gain

0 likes · 7 min read

How COMI Achieves 32× Compression and Boosts Performance by 25 Points

Wuming AI

Jan 29, 2026 · Artificial Intelligence

How to Compress Long LLM Conversations with Smart Summarization and Sliding Window

This article explains how to keep essential information from lengthy AI chat histories by using an intelligent summarization prompt, injecting the summary as a system message, and applying a sliding‑window strategy that retains the last three exchanges, thereby reducing token cost and preserving context continuity.

CContext CompressionLLM

0 likes · 11 min read

How to Compress Long LLM Conversations with Smart Summarization and Sliding Window

Architect

Jan 28, 2026 · Artificial Intelligence

How to Build a Reliable Long-Term Memory System for AI Agents

Designing a robust AI memory for long-running agents requires separating context from persistent storage, using markdown files, pre‑compaction flushing, hybrid vector‑BM25 retrieval, session pruning, and rebuildable SQLite indexes, ensuring explainable, editable, and portable recall while preventing context bloat and security leaks.

AI memoryClawdbotContext Compression

0 likes · 19 min read

How to Build a Reliable Long-Term Memory System for AI Agents

PaperAgent

Jan 28, 2026 · Artificial Intelligence

How Clawdbot Achieves Persistent, Local Memory for LLM Agents

Clawdbot implements a fully local, persistent memory system for LLM agents by storing context and long‑term knowledge in editable Markdown files, indexing them with SQLite‑vec and FTS5, supporting multi‑agent isolation, compression, pruning, and configurable session lifecycles to maintain efficient, cost‑effective interactions.

Context CompressionLLM agentslocal storage

0 likes · 13 min read

How Clawdbot Achieves Persistent, Local Memory for LLM Agents

AI Engineering

Jan 18, 2026 · Artificial Intelligence

Why a Single For Loop Powers BU’s Open‑Source Agent Framework

The BU Browser Use team open‑sourced bu‑agent‑sdk, a minimal LLM agent framework that treats the agent as a simple for‑loop and adds explicit done tools, context compression, ephemeral messages, and a unified LLM interface, enabling flexible, low‑overhead AI applications.

Agent FrameworkContext CompressionLLM

0 likes · 7 min read

Why a Single For Loop Powers BU’s Open‑Source Agent Framework

AI Insight Log

Dec 27, 2025 · Industry Insights

Why AI Code Generators Like Cursor Could Trigger an Infinite Software Crisis – Lessons from Netflix

Netflix senior engineer Jake Nations warns that the rise of AI‑powered code generators creates an "Infinite Software Crisis" by turning easy code generation into unmaintainable complexity, and outlines a three‑step "Context Compression" method to keep development disciplined and understandable.

AI Code GenerationContext CompressionNetflix engineering

0 likes · 10 min read

Why AI Code Generators Like Cursor Could Trigger an Infinite Software Crisis – Lessons from Netflix

High Availability Architecture

Dec 26, 2025 · Artificial Intelligence

Why AI-Generated Code Threatens Understanding: A Netflix Engineer’s Three‑Stage Method

In a Netflix talk, senior engineer Jake Nations reveals how AI can instantly produce code yet leave developers clueless, explains the historic software crisis, distinguishes essential from accidental complexity, and outlines a three‑stage "context compression" process to keep speed without sacrificing comprehension.

AIContext CompressionNetflix

0 likes · 20 min read

Why AI-Generated Code Threatens Understanding: A Netflix Engineer’s Three‑Stage Method

AI Tech Publishing

Nov 30, 2025 · Artificial Intelligence

Agent Architecture Design Part 1: Context Compression Strategies and Their Use Cases

The article explains why large‑model agents need context compression, outlines five engineering‑level schemes (both lossless and lossy), demonstrates each with concrete XML snippets and step‑by‑step reasoning, and advises using lossless methods before resorting to lossy prompt‑driven compression.

AgentContext CompressionLLM

0 likes · 12 min read

Agent Architecture Design Part 1: Context Compression Strategies and Their Use Cases

DataFunTalk

Oct 20, 2025 · Artificial Intelligence

How DeepSeek-OCR Achieves 10× Context Compression with Vision Tokens

DeepSeek-OCR, a newly open‑sourced 3B‑parameter OCR model, uses a novel DeepEncoder and a 3B MoE decoder to compress long‑text contexts into visual tokens, achieving up to 10× compression with 97% accuracy and demonstrating strong practical performance on benchmarks and multilingual documents.

Context CompressionDeepSeekOCR

0 likes · 11 min read

How DeepSeek-OCR Achieves 10× Context Compression with Vision Tokens

Alibaba Cloud Developer

Sep 9, 2025 · Artificial Intelligence

Tackling Real‑World Challenges in Multi‑Agent React: From ToolCalls to Context Compression

This article analyzes production‑grade issues of a multi‑agent React framework—such as long ToolCall latency, context bloat, missing intermediate states, loop control, and supervision gaps—and presents concrete XML‑based tool‑call prompts, context‑compression techniques, summary tools, and a plug‑and‑play MCP supervisor that together improve performance, reliability, and user‑facing output quality.

AI PlanningContext CompressionReAct pattern

0 likes · 16 min read

Tackling Real‑World Challenges in Multi‑Agent React: From ToolCalls to Context Compression

Architecture and Beyond

Sep 6, 2025 · Artificial Intelligence

How AI Agents Manage Context: Compression Strategies from Manus, Claude Code, and Gemini CLI

This article examines the context explosion problem in AI agents and compares three distinct compression approaches—Manus's never‑lose philosophy, Claude Code's aggressive 92% threshold with eight‑section summaries, and Gemini CLI's balanced 70% trigger with curated history—highlighting their trade‑offs in performance, cost, and reliability.

AIAgent DesignContext Compression

0 likes · 19 min read

How AI Agents Manage Context: Compression Strategies from Manus, Claude Code, and Gemini CLI

Instant Consumer Technology Team

Jul 24, 2025 · Artificial Intelligence

Inside Claude Code: How a Local AI Agent OS Was Reverse‑Engineered

An in‑depth reverse‑engineering of Anthropic’s Claude Code reveals its multi‑agent architecture, real‑time steering queue, and novel context‑compression engine, exposing the 50k+ line obfuscated JavaScript core, the Agent scheduling layers, tool ecosystem, and storage system that power this local AI coding assistant.

AI AgentsClaude CodeContext Compression

0 likes · 11 min read

Inside Claude Code: How a Local AI Agent OS Was Reverse‑Engineered