Author

AI Tech Publishing

In the fast-evolving AI era, we explain the stable technical foundations in depth.

Latest from AI Tech Publishing

81 recent articles

AI Tech Publishing

May 1, 2026 · Artificial Intelligence

5 Counterintuitive Design Principles for Prompt Caching in Claude Code

The article details five counterintuitive design principles for Claude Code's prompt caching—optimizing prompt layout, using message‑based updates, never switching models or tools mid‑conversation, safely compressing context, and monitoring cache health—backed by concrete examples and up to 90% cost savings.

AI Engineering · Cache Optimization · Claude Code

0 likes · 10 min read

AI Tech Publishing

May 1, 2026 · Artificial Intelligence

Turning Harness into a Distributed Context Management System for Long‑Task Agents

The article explains why the reliability of long‑task agents now hinges on harness design rather than model strength, and details four harness innovations—programmatic tool calls, sub‑agents as isolation boundaries, context compression, and skill‑search priority—that Glean uses to build a distributed context management system.

Agent Harness · Context Compression · Sub‑agents

0 likes · 11 min read

AI Tech Publishing

Apr 29, 2026 · Artificial Intelligence

Who Tests When AI Generates 99% of Code? Inside a Self‑Repairing Agent Harness

The article explains how a self‑repairing Agent Harness replaces traditional QA by looping evaluation, triage, automated fixing, verification and AI‑gated canary release, using a three‑judge reviewer, model‑based sampling and six daily engineering tasks to keep AI‑driven products reliable.

AI agents · AI-driven QA · Continuous Deployment

0 likes · 16 min read

AI Tech Publishing

Apr 29, 2026 · Artificial Intelligence

Why Do AI Agents Forget and Hallucinate? A Complete Guide to KV‑Cache Memory Mechanisms

The article explains that AI agents' forgetting and hallucinations stem from key‑value cache entries being evicted, on the basis of token‑level attention scores, before they are needed for retrieval. It then surveys KV‑cache basics, naive cache growth, StreamingLLM windowing, SnapKV's attention‑guided compression, token‑retention studies, and Memory Sparse Attention, compares these methods, and discusses practical system pitfalls and design implications.

AI agents · KV Cache · Memory Sparse Attention

0 likes · 20 min read

AI Tech Publishing

Apr 27, 2026 · Artificial Intelligence

Why Build Your Own AI Evaluation Harness? 7 OpenAI‑Inspired Recommendations

The article explains why generic AI testing platforms fall short, outlines how to design a testable AI system from day one, and presents seven practical recommendations—from using Codex or Claude Code to manage regression and iteration test sets, to leveraging entropy diagnostics and custom domain‑expert UX.

AI evaluation · Evaluation Framework · OpenAI

0 likes · 8 min read

AI Tech Publishing

Apr 27, 2026 · Artificial Intelligence

Context Window Strategies in Agent Harnesses: Pi, OpenClaw, Claude Code, Letta, Alyx

The article analyzes how five Agent Harness frameworks—Pi, OpenClaw, Claude Code, Letta, and Alyx—handle context windows, file pagination, tool result limits, session pruning, and sub‑agent isolation, revealing convergent design patterns that treat the context as a managed memory system.

Agent Harness · Context Management · File Pagination

0 likes · 21 min read

AI Tech Publishing

Apr 25, 2026 · Artificial Intelligence

A Comprehensive Guide to Harness Engineering for Reliable AI Agents

This article systematically breaks down Harness Engineering—a framework that organizes large models, context, tools, state, sandboxing, security, and evaluation into a reliable AI agent engineering system, showing how to move agents from demo to production.

AI agents · Context Management · Harness Engineering

0 likes · 21 min read

AI Tech Publishing

Apr 23, 2026 · Artificial Intelligence

API vs CLI vs MCP: How Claude Guides Their Collaboration for Production‑Grade Agents

The article compares three ways agents connect to external systems—direct API calls, CLI tools, and the Model Context Protocol (MCP)—and explains how MCP provides a standardized, scalable layer with rich semantics, authentication, and context‑saving techniques that enable production‑grade cloud agents.

AI agents · MCP · Protocol

0 likes · 16 min read

AI Tech Publishing

Apr 22, 2026 · Artificial Intelligence

Why Longer Context Makes LLMs Forget Faster: 7 Failure Modes and Memory System Solutions

The article analyzes how extending the context window of large language models leads to rapid forgetting, outlines seven concrete failure modes, examines cognitive‑science‑based memory architectures, and walks through practical layers—from Python lists to markdown files to vector retrieval—highlighting why simple context expansion alone cannot solve the problem.

Agent Design · LLM Memory · Vector Retrieval

0 likes · 10 min read

AI Tech Publishing

Apr 21, 2026 · Artificial Intelligence

Why Your AI Agent Stays a Toy: Six Production‑Readiness Gaps and How to Bridge Them

Moving an AI agent from a controlled demo to an unattended production environment introduces six critical gaps—fault handling, state persistence, observability, credential security, cost control, and human supervision—each requiring specific infrastructure, practices, and a comprehensive readiness checklist to avoid costly failures.

AI agents · Cost Management · Observability

0 likes · 15 min read
