Tagged articles
39 articles
James' Growth Diary
May 24, 2026 · Artificial Intelligence

Wrapping Up Harness Engineering: The Six Pillars Methodology Explained

This article reviews the six foundational pillars of Harness Engineering—context architecture, architectural constraints, self‑verification loop, context isolation, entropy governance, and detachability—showing how Claude Code implements them, why infrastructure, not model size, is the real bottleneck, and offering ten concrete actions for practitioners.

AI Agents · Context Compression · Entropy Management
0 likes · 17 min read
AI Engineer Programming
May 10, 2026 · Artificial Intelligence

Lossless Context Management (LCM): Handling Unlimited Agent Tasks with Finite Windows

The article analyzes the limitation of finite LLM context windows for unbounded agent tasks, reviews existing truncation, summarization, and RAG approaches, and presents the Lossless Context Management (LCM) architecture with immutable storage, hierarchical DAG compression, three‑level summarization, and zero‑overhead processing for both short and large‑scale workloads.

AI Agents · Agent Memory · Agentic-Map
0 likes · 9 min read
Architect's Ambition
May 8, 2026 · Artificial Intelligence

A 12,000‑Word Guide to Agent Harness: Designing and Implementing Production‑Ready AI Agents

The article presents a comprehensive 7‑layer Agent Harness architecture that transforms experimental LLM‑based agents into stable, cost‑effective, secure, and observable production‑grade autonomous workers, illustrated with real‑world case studies, performance metrics, and concrete implementation details.

AI Agents · Agent Architecture · Context Compression
0 likes · 33 min read
Machine Heart
May 7, 2026 · Artificial Intelligence

How TACO Lets CLI Agents Self‑Evolve to Drop Useless Context

TACO is a plug‑and‑play, training‑free framework that lets terminal‑based autonomous agents automatically learn compression rules to filter low‑value output while preserving critical decision cues, achieving higher task success rates and better token efficiency across multiple terminal‑related benchmarks.

Context Compression · LLM · Self‑Evolving Rules
0 likes · 14 min read
AI Tech Publishing
May 1, 2026 · Artificial Intelligence

5 Counterintuitive Design Principles for Prompt Caching in Claude Code

The article details five counterintuitive design principles for Claude Code's prompt caching—optimizing prompt layout, using message‑based updates, never switching models or tools mid‑conversation, safely compressing context, and monitoring cache health—backed by concrete examples and up to 90% cost savings.

AI Engineering · Cache Optimization · Claude Code
0 likes · 10 min read
AI Tech Publishing
May 1, 2026 · Artificial Intelligence

Turning Harness into a Distributed Context Management System for Long‑Task Agents

The article explains why the reliability of long‑task agents now hinges on harness design rather than model strength, and details four harness innovations—programmatic tool calls, sub‑agents as isolation boundaries, context compression, and skill‑search priority—that Glean uses to build a distributed context management system.

Agent Harness · Context Compression · Sub‑agents
0 likes · 11 min read
AI Step-by-Step
Apr 27, 2026 · Artificial Intelligence

Hermes Prompt Runtime: Managing Provider, Prompt, Memory, and Context

Hermes Prompt Runtime introduces a layered architecture that first resolves the model provider, then builds a stable system prompt, freezes memory snapshots for session boundaries, isolates per‑call temporary context, and compresses long histories, thereby keeping long‑term semantics stable, improving prompt caching, and reducing context‑window pressure.

Agent Architecture · Context Compression · Hermes
0 likes · 12 min read
Alibaba Cloud Developer
Apr 24, 2026 · Artificial Intelligence

How Hermes Agent Achieves Self‑Evolution: A Deep Dive into Prompt, Context, and Harness Design

This article provides a detailed technical analysis of Hermes Agent, explaining how its dynamic skill generation and reinforcement‑learning loop enable true self‑evolution, and examines the prompt engineering, context compression, memory architecture, harness mechanisms, error handling, and plugin ecosystem that differentiate it from OpenClaw and Claude Code.

Agent Framework · Context Compression · Hermes Agent
0 likes · 41 min read
Shuge Unlimited
Apr 23, 2026 · Artificial Intelligence

Deep Dive into Hermes Agent: Self‑Improving AI Agent Architecture with 110K+ Stars

Hermes Agent, an open‑source self‑improving AI agent framework that has amassed over 110 K GitHub stars, introduces a native closed‑learning loop, a unified single‑process agent cycle, self‑registering tools, pluggable context compression, multi‑API model support, and a scalable multi‑platform gateway, all built on Python 3.11+, SQLite + WAL, and extensive modular design.

AI agent · Closed Loop Learning · Context Compression
0 likes · 24 min read
ZhiKe AI
Apr 19, 2026 · Artificial Intelligence

Claude Code Agent Architecture: 4‑Layer Breakdown & 3 Key Designs from while(true) to Autonomous Decision‑Making

The article dissects Claude Code Agent's four‑layer architecture—interaction, orchestration, execution, and infrastructure—explaining how the perpetual while(true) queryLoop drives LLM streaming, how StreamingToolExecutor enables safe concurrent tool execution, and how the autoCompact four‑stage compression with a circuit‑breaker safeguards context length.

Claude Code Agent · Context Compression · LLM tool integration
0 likes · 19 min read
o-ai.tech
Apr 17, 2026 · Artificial Intelligence

How Hermes Agent Self‑Evolves: Memory, Skills, and Offline Training Pipelines

This article dissects Hermes Agent’s self‑evolution mechanism, explaining how stable facts are stored in memory, reusable procedures become skills, and rollout trajectories are turned into training data through background review, context compression, and OPD‑based token‑level distillation.

Agent Architecture · Context Compression · Hermes Agent
0 likes · 33 min read
Tech Verticals & Horizontals
Apr 15, 2026 · Artificial Intelligence

How Hermes Enables AI to Remember, Learn, and Grow Autonomously

The article dissects Hermes’s autonomous learning loop, detailing how immutable facts are stored in long‑term memory, reusable methods become skills, session history is searchable, and a background review process periodically consolidates knowledge while a pre‑compression rescue safeguards key information.

AI · Context Compression · Hermes
0 likes · 15 min read
Machine Heart
Apr 13, 2026 · Artificial Intelligence

What’s the Underlying Logic of Coding Agents and Why Do Claude Code Variants Outperform Others?

The article dissects coding agents by outlining their six core components, explaining how an agent harness orchestrates model inference, repository context, prompt caching, tool validation, context compression, structured memory, and bounded sub‑agents, and shows why these architectural choices give Claude Code a performance edge over plain LLMs.

Agent Harness · Context Compression · LLM
0 likes · 22 min read
AI Tech Publishing
Apr 12, 2026 · Artificial Intelligence

How Hermes Agent’s Multi‑Layer Memory Beats OpenClaw’s Simple Markdown Store

The article dissects Hermes Agent’s four‑store memory architecture—declarative, procedural, situational, and persona—deterministic routing, frozen snapshots, nudge‑driven persistence, security scanning, dual‑peer modeling, skill management, and three‑phase context compression, showing why it outperforms OpenClaw’s breadth‑first design.

Context Compression · Hermes Agent · LLM agents
0 likes · 17 min read
macrozheng
Apr 10, 2026 · Artificial Intelligence

Inside Claude Code: How a 500k‑Line AI Programming Tool Leaked and What Its Architecture Reveals

The Claude Code source leak exposed over 500,000 lines of AI‑coding tool code, revealing its npm publishing mishap, the layered architecture built on React Ink, the ReAct‑style agent loop, sophisticated tool orchestration, multi‑tier memory management, context compression, security checks, feature flags, and even anti‑distillation defenses.

AI Agents · Claude Code · Context Compression
0 likes · 30 min read
IT Services Circle
Apr 6, 2026 · Artificial Intelligence

Mastering RAG Interview Questions: A Complete Retrieval Optimization Blueprint

This article breaks down the full RAG retrieval pipeline—from query understanding and rewriting, through hybrid retrieval and reranking, to chunking, context compression, and dynamic routing—providing concrete techniques, formulas, and performance metrics to help candidates ace interview questions on RAG systems.

Context Compression · Cross-Encoder · Hard Negative Mining
0 likes · 16 min read
AI Tech Publishing
Apr 6, 2026 · Artificial Intelligence

Six Core Components of a Coding Agent Explained with Code

The article systematically breaks down the six essential building blocks of a programming agent—live repository context, prompt shape and cache reuse, structured tool access and validation, context reduction, structured session memory, and bounded sub‑agent delegation—illustrated with a Mini Coding Agent implementation and comparisons to Claude Code, Codex, and OpenClaw.

Coding Agent · Context Compression · LLM
0 likes · 15 min read
Shuge Unlimited
Apr 4, 2026 · Artificial Intelligence

Inside Claude Code: Three‑Tier Compression Enabling Unlimited‑Length AI Tasks

The article dissects Claude Code's three‑level progressive compression system—MicroCompact, SessionMemoryCompact, and Full Compact—showing how it edits cached prompts, maintains background memory files, and generates a structured nine‑section summary to keep AI agents operating over arbitrarily long conversations within a limited context window.

AI agent · Auto Compression · Claude Code
0 likes · 17 min read
AI Open-Source Efficiency Guide
Apr 1, 2026 · Artificial Intelligence

Build an AI Agent Harness from Scratch: Deep Dive into Claude Code Architecture

This article walks developers through the learn-claude-code project, which teaches how to construct a Claude‑style AI Agent Harness in twelve progressive lessons. It covers core concepts such as agents, harnesses, sub‑agents, context compression, and task management, and provides runnable Python examples and architectural diagrams.

AI agent · Agent Harness · Claude Code
0 likes · 13 min read
ArcThink
Apr 1, 2026 · Artificial Intelligence

Inside Claude Code: 1,900‑File Source Dive Reveals Six‑Layer Architecture

After a source‑map leak exposed Claude Code’s 1,900 TypeScript files, this analysis dissects its six‑layer architecture, dynamic prompt assembly, four‑level caching, 60+ tool governance pipeline, six built‑in agents, five context‑compression strategies, and the real engineering trade‑offs hidden beneath the product.

AI Engineering · Agent Systems · Context Compression
0 likes · 31 min read
Wu Shixiong's Large Model Academy
Mar 29, 2026 · Artificial Intelligence

Mastering RAG Prompt Engineering: Prevent Hallucinations and Boost Accuracy

This article dissects the unique challenges of RAG prompting, presents a systematic System/User Prompt design with strong constraints and citation requirements, compares constraint strengths with quantitative hallucination rates, and offers long‑context compression strategies and rigorous testing methods to ensure reliable LLM answers.

Context Compression · LLM · RAG
0 likes · 19 min read
Fighter's World
Mar 28, 2026 · Artificial Intelligence

What Engineering Decisions Make AI Coding Agents Effective? Lessons from the OpenDev Paper

The article dissects OpenDev’s open‑source AI coding agent, comparing its scaffolding‑vs‑harness architecture, cognitive‑flow design, context‑compression strategies, tool‑reliability mechanisms and safety layers with Claude Code, Cursor, Codex and Augment, and shows that harness‑level engineering remains the biggest performance lever even for frontier models.

AI coding agent · Context Compression · Harness Engineering
0 likes · 39 min read
Su San Talks Tech
Mar 26, 2026 · Artificial Intelligence

Unlocking AI Agents: How OpenClaw Turns Language Models into Actionable Bots

This article explains how OpenClaw functions as an AI Agent framework that connects chat applications to large language models, manages multi‑turn dialogues, executes tool commands, handles memory and security, and demonstrates advanced features such as sub‑agents, cron jobs, and context compression.

AI agent · Context Compression · Memory Management
0 likes · 19 min read
SuanNi
Mar 24, 2026 · Artificial Intelligence

How Compression, Orchestration, and LangGraph Are Redefining LLM Context Engineering

This article analyzes the six pillars of context engineering for large language models, focusing on compression techniques, extractive vs. abstractive methods, the LLMLingua toolkit, dynamic orchestration with routing and agentic RAG, and how LangGraph enables sophisticated agent‑driven workflows.

Agentic RAG · Context Compression · LLM
0 likes · 14 min read
AI Explorer
Mar 14, 2026 · Artificial Intelligence

Build a Claude‑Code‑Level AI Agent in 12 Incremental Lessons

This open‑source tutorial walks developers through twelve progressive lessons, expanding a minimal 84‑line agent to a full‑featured 694‑line Claude‑Code‑style AI system that covers tool calls, sub‑agents, context compression, and multi‑agent collaboration.

AI agent · Agent Loop · Claude Code
0 likes · 9 min read
Machine Learning Algorithms & Natural Language Processing
Feb 24, 2026 · Artificial Intelligence

How COMI Achieves 25‑Point Performance Gains at 32× Compression Using Marginal Information Gain (ICLR 2026)

The COMI framework introduces a marginal information gain metric and a coarse‑to‑fine adaptive compression strategy that preserves relevance and diversity, enabling 32× text compression while boosting downstream QA performance by up to 25 points and doubling inference speed.

Context Compression · Efficient Inference · Long-Context Retrieval
0 likes · 7 min read
Machine Learning Algorithms & Natural Language Processing
Feb 23, 2026 · Artificial Intelligence

How COMI Achieves 32× Compression and Boosts Performance by 25 Points

The COMI framework introduces a marginal information gain metric and a coarse‑to‑fine two‑stage compression strategy that preserves relevance and diversity, enabling 32× context reduction while improving Exact Match on NaturalQuestions by nearly 25 points and more than doubling inference speed.

Context Compression · Long-Context Retrieval · Marginal Information Gain
0 likes · 7 min read
Wuming AI
Jan 29, 2026 · Artificial Intelligence

How to Compress Long LLM Conversations with Smart Summarization and Sliding Window

This article explains how to keep essential information from lengthy AI chat histories by using an intelligent summarization prompt, injecting the summary as a system message, and applying a sliding‑window strategy that retains the last three exchanges, thereby reducing token cost and preserving context continuity.

C · Context Compression · LLM
0 likes · 11 min read
Architect
Jan 28, 2026 · Artificial Intelligence

How to Build a Reliable Long-Term Memory System for AI Agents

Designing a robust AI memory for long-running agents requires separating context from persistent storage, using markdown files, pre‑compaction flushing, hybrid vector‑BM25 retrieval, session pruning, and rebuildable SQLite indexes, ensuring explainable, editable, and portable recall while preventing context bloat and security leaks.

AI memory · Clawdbot · Context Compression
0 likes · 19 min read
PaperAgent
Jan 28, 2026 · Artificial Intelligence

How Clawdbot Achieves Persistent, Local Memory for LLM Agents

Clawdbot implements a fully local, persistent memory system for LLM agents by storing context and long‑term knowledge in editable Markdown files, indexing them with SQLite‑vec and FTS5, supporting multi‑agent isolation, compression, pruning, and configurable session lifecycles to maintain efficient, cost‑effective interactions.

Context Compression · LLM agents · local storage
0 likes · 13 min read
AI Engineering
Jan 18, 2026 · Artificial Intelligence

Why a Single For Loop Powers BU’s Open‑Source Agent Framework

The BU Browser Use team open‑sourced bu‑agent‑sdk, a minimal LLM agent framework that treats the agent as a simple for‑loop and adds explicit done tools, context compression, ephemeral messages, and a unified LLM interface, enabling flexible, low‑overhead AI applications.

Agent Framework · Context Compression · LLM
0 likes · 7 min read
AI Insight Log
Dec 27, 2025 · Industry Insights

Why AI Code Generators Like Cursor Could Trigger an Infinite Software Crisis – Lessons from Netflix

Netflix senior engineer Jake Nations warns that the rise of AI‑powered code generators creates an "Infinite Software Crisis" by turning easy code generation into unmaintainable complexity, and outlines a three‑step "Context Compression" method to keep development disciplined and understandable.

AI Code Generation · Context Compression · Netflix engineering
0 likes · 10 min read
High Availability Architecture
Dec 26, 2025 · Artificial Intelligence

Why AI-Generated Code Threatens Understanding: A Netflix Engineer’s Three‑Stage Method

In a Netflix talk, senior engineer Jake Nations reveals how AI can instantly produce code yet leave developers clueless, explains the historic software crisis, distinguishes essential from accidental complexity, and outlines a three‑stage "context compression" process to keep speed without sacrificing comprehension.

AI · Context Compression · Netflix
0 likes · 20 min read
DataFunTalk
Oct 20, 2025 · Artificial Intelligence

How DeepSeek-OCR Achieves 10× Context Compression with Vision Tokens

DeepSeek-OCR, a newly open‑sourced 3B‑parameter OCR model, uses a novel DeepEncoder and a 3B MoE decoder to compress long‑text contexts into visual tokens, achieving up to 10× compression with 97% accuracy and demonstrating strong practical performance on benchmarks and multilingual documents.

Context Compression · DeepSeek · OCR
0 likes · 11 min read
Alibaba Cloud Developer
Sep 9, 2025 · Artificial Intelligence

Tackling Real‑World Challenges in Multi‑Agent ReAct: From ToolCalls to Context Compression

This article analyzes production‑grade issues in a multi‑agent ReAct framework, including long ToolCall latency, context bloat, missing intermediate states, loop control, and supervision gaps. It presents concrete XML‑based tool‑call prompts, context‑compression techniques, summary tools, and a plug‑and‑play MCP supervisor that together improve performance, reliability, and user‑facing output quality.

AI Planning · Context Compression · ReAct pattern
0 likes · 16 min read
Architecture and Beyond
Sep 6, 2025 · Artificial Intelligence

How AI Agents Manage Context: Compression Strategies from Manus, Claude Code, and Gemini CLI

This article examines the context explosion problem in AI agents and compares three distinct compression approaches—Manus's never‑lose philosophy, Claude Code's aggressive 92% threshold with eight‑section summaries, and Gemini CLI's balanced 70% trigger with curated history—highlighting their trade‑offs in performance, cost, and reliability.

AI · Agent Design · Context Compression
0 likes · 19 min read
Instant Consumer Technology Team
Jul 24, 2025 · Artificial Intelligence

Inside Claude Code: How a Local AI Agent OS Was Reverse‑Engineered

An in‑depth reverse‑engineering of Anthropic’s Claude Code reveals its multi‑agent architecture, real‑time steering queue, and novel context‑compression engine, exposing the 50k+ line obfuscated JavaScript core, the Agent scheduling layers, tool ecosystem, and storage system that power this local AI coding assistant.

AI Agents · Claude Code · Context Compression
0 likes · 11 min read