Author

AI Engineer Programming

In the AI era, defining problems is often more important than solving them; here we explore AI's contradictions, boundaries, and possibilities.

Latest from AI Engineer Programming

74 recent articles

AI Engineer Programming

May 20, 2026 · Artificial Intelligence

Why Chunk‑Based RAG Fails and How IdeaBlocks Improve Retrieval

The article argues that treating text chunks as the knowledge unit in RAG pipelines is a flawed assumption that leads to versioning, metadata, and redundancy problems, and demonstrates that replacing chunks with structured IdeaBlocks dramatically reduces corpus size and token usage while improving vector relevance.

IdeaBlock · LLM · RAG

0 likes · 10 min read

May 18, 2026 · Artificial Intelligence

Designing an Agent Gateway: Bridging Business Logic and Protocol Infrastructure

The article analyzes why traditional API gateways cannot meet the needs of stateful Agentic workflows and proposes a dedicated Agent gateway that handles access control, cross‑service execution tracing, and pre‑LLM security enforcement while addressing connection overhead, session fan‑out, and observability challenges.

A2A · AI security · Agent Gateway

0 likes · 14 min read

May 17, 2026 · Fundamentals

Why Are We Still Using Markdown?

The article analyses Markdown's minimalist design, its ambiguous syntax, security flaws such as ReDoS and XSS vulnerabilities, and the growing gap between its original goal of simple text-to-HTML conversion and the complex compiler‑like features developers now demand.

CommonMark · Markdown · ReDoS

0 likes · 14 min read

May 17, 2026 · Artificial Intelligence

ReAct, Plan‑Execute, and Reflection: How Continuous Loops Make Agent Architecture Crucial

A single LLM call is a stateless function, but real‑world tasks require dynamic information gathering, hypothesis testing, and iterative refinement, so agents must operate in a continuous loop. The article analyzes core patterns such as ReAct, Plan‑Execute, Reflection, Multi‑Agent, and human‑in‑the‑loop (HITL), highlighting the challenges of state management, cost, debugging, and observability.

Agent Architecture · LLM · Observability

0 likes · 21 min read

May 16, 2026 · Artificial Intelligence

How to Boost RAG Retrieval Quality: Real‑World Cost‑Benefit Analysis

This article examines practical ways to improve Retrieval‑Augmented Generation (RAG) retrieval quality—covering vector database choices, data chunking, embedding models, query expansion, and re‑ranking—while weighing performance gains against operational costs through multiple real‑world case studies.

LLM · RAG · Re‑ranking

0 likes · 16 min read

May 15, 2026 · Artificial Intelligence

Hybrid Retrieval in RAG: Combining BM25 Precision with Dense Vector Semantics

The article examines why pure vector retrieval in RAG lacks lexical precision and traceable relevance scores, explains BM25's strengths, and presents hybrid retrieval architectures—including RRF and linear combination fusion—as well as the trade‑offs of externalizing the fusion process.

BM25 · Hybrid Search · RAG

0 likes · 9 min read

May 14, 2026 · Artificial Intelligence

RAG Retrieval: Comparing Bi-encoder and Cross-encoder Architectures

The article reviews the three‑step RAG pipeline, explains why retrieval quality hinges on fast, accurate semantic matching, contrasts Bi-encoder’s offline vector indexing and speed with Cross-encoder’s token‑level interaction and higher precision, and discusses hybrid solutions such as ColBERT and LLM rerankers with practical engineering guidelines.

Bi-Encoder · ColBERT · Cross-Encoder

0 likes · 10 min read

May 13, 2026 · Artificial Intelligence

AI Agent Architecture Patterns: How to Choose the Right Solution for Your Workload

The article analyzes how AI agent architecture choices—single‑agent versus multi‑agent, ReAct, plan‑and‑execute, orchestrator‑worker, hierarchical teams, reflection, and HITL—affect cost, reliability, and scalability, providing quantitative trade‑offs and industry examples to guide workload‑specific selection.

AI Agents · Architecture Patterns · Human-in-the-Loop

0 likes · 16 min read

May 12, 2026 · Artificial Intelligence

Should You Build the Agent Framework First, Then Fine‑Tune System Prompts?

The article explains what a System Prompt is, how it differs from User Prompts, its role in LLM APIs, caching benefits, common pitfalls, and best‑practice designs across Claude Code, Cursor, Codex CLI, and Gemini CLI, ending with testing and version‑control recommendations.

AI Agents · Cache · Claude Code

0 likes · 19 min read

May 11, 2026 · Artificial Intelligence

Why Your Agent Isn’t Stupid—It’s Just Lost in the Middle of the Context

Adding dozens of MCP tools overloads the LLM’s context window, causing the “lost in the middle” effect that degrades accuracy, but a gateway with semantic tool discovery, role‑based virtual servers, and pre‑filtering can restore performance while preserving governance.

Agent Architecture · LLM · MCP

0 likes · 15 min read
