Loop Engineering: From Prompting Agents to Designing Autonomous Loops
Loop Engineering replaces manual prompting of AI coding agents with automated loops that schedule, coordinate, and verify work, detailing the five essential primitives, historical evolution, practical implementations, limitations, and a step‑by‑step guide for building a minimal, production‑ready loop.
Peter Steinberger’s claim, “You shouldn’t be prompting coding agents any more. You should be designing loops that prompt your agents,” sparked a wave of discussion in June 2026. Boris Cherny echoed the idea, saying his job is now to write loops that drive Claude Code. The chapter builds on Addy Osmani’s “Loop Engineering” framework, combining Cherny’s practice, Geoffrey Huntley’s Ralph Loop concepts, and AlphaSignal’s four‑condition test.
12.1 From Prompting Agents to Designing Loops
Cherny describes a three‑stage evolution. First, he worked in an IDE with autocomplete. Next, he ran 5‑10 Claude sessions by hand, switching between windows to issue prompts. Finally, he stopped prompting manually and wrote small programs (loops) that (1) discover work, (2) hand it to Claude, and (3) verify completion. Over 30 days his loops produced 100% of Claude‑generated code, merging 259 PRs; Anthropic engineers reported an 8× increase in daily output and a 76% success rate for open‑source tasks.
12.2 Where Loops Sit in the Stack
Previous chapters covered “how agents do work.” Loop Engineering answers “who prompts the agents.” The hierarchy is:
Loop Engineering (who prompts Agent)
└── Harness Engineering (Agent runtime environment)
    └── Methodology layer (how Agent works)
        └── Project layer (what to do)
Loops add three capabilities on top of Harness: timers, sub‑agent spawning, and self‑driving decision logic.
12.3 The Five Essentials + State
Automations – give the loop a heartbeat so it runs on schedule rather than a one‑off prompt.
Worktrees – isolate parallel agent checkouts; Git worktree prevents file‑level collisions.
Skills – reusable knowledge units that replace re‑explaining the project each iteration.
Connectors / Plugins – let loops interact with real tools (issue trackers, databases, Slack, etc.).
Sub‑agents – separate agents for exploration, implementation, and verification, enabling maker‑checker separation.
A state file (markdown or Linear board) records what has been done and what remains, persisting progress across runs.
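The state file idea can be sketched in a few lines of Python. This is only an illustration, assuming a markdown file with "## Pending" and "## Completed" headings and one "- item" bullet per entry; the function names (parse_state, complete_item, render_state) and the "# Loop State" title are hypothetical and do not come from any tool mentioned here.

```python
from typing import Dict, List

SECTIONS = ("Pending", "Completed")  # assumed section names


def parse_state(text: str) -> Dict[str, List[str]]:
    """Collect '- item' bullets under each known '## Section' heading."""
    state: Dict[str, List[str]] = {name: [] for name in SECTIONS}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            heading = stripped[3:].strip()
            current = heading if heading in state else None
        elif stripped.startswith("- ") and current is not None:
            state[current].append(stripped[2:].strip())
    return state


def complete_item(state: Dict[str, List[str]], item: str) -> None:
    """Move an item from Pending to Completed; do nothing if it is not pending."""
    if item in state["Pending"]:
        state["Pending"].remove(item)
        state["Completed"].append(item)


def render_state(state: Dict[str, List[str]]) -> str:
    """Write the state back out in the same markdown layout."""
    lines = ["# Loop State"]
    for name in SECTIONS:
        lines.append(f"## {name}")
        lines.extend(f"- {item}" for item in state[name])
    return "\n".join(lines) + "\n"
```

Because each run starts by parsing the file and ends by rendering it, a loop built this way can pick up where the previous run stopped without keeping anything in memory between runs.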
12.4 Dynamic Workflows – Deterministic Orchestration
Claude Code introduced Dynamic Workflows: deterministic JavaScript scripts that sequence sub‑agents. The patterns include Fan‑out & Synthesize, Classify & Act, Pipeline, Tournament, Loop Until Done, and Deep Verification. One example mines the last 50 sessions, extracts 86 corrective patterns, and produces a report.
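To make the Fan‑out & Synthesize pattern concrete, here is a rough Python sketch of its shape. It is not the Dynamic Workflows API (which is JavaScript‑based and not shown in this chapter); the worker and synthesizer are plain callables standing in for sub‑agent calls, and every name below is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def fan_out_and_synthesize(
    items: List[str],
    worker: Callable[[str], str],
    synthesize: Callable[[List[str]], str],
    max_workers: int = 4,
) -> str:
    """Run one worker per item in parallel, then merge the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order, so the merge is deterministic.
        results = list(pool.map(worker, items))
    return synthesize(results)


# Stand-ins for sub-agent calls; a real loop would invoke an agent here.
def fake_worker(session: str) -> str:
    return f"pattern from {session}"


def fake_synthesize(findings: List[str]) -> str:
    return "; ".join(findings)
```

The orchestration itself is ordinary deterministic code; only the worker and synthesizer would involve a model, which is what separates this from letting a single agent decide the whole sequence.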
12.5 Historical Evolution of Loops
2022 – ReAct loop (model → tool → read → repeat).
2023 – AutoGPT (goal‑driven self‑prompting, notorious for endless loops).
2025 – Ralph Loop (bash one‑liner feeding a fixed prompt file, anchored by cheap cost).
Spring 2026 – /goal productization (condition‑driven termination, separate evaluation model).
Now (2026) – Multi‑agent orchestrated loops with scheduling, supervision, and persistent state.
12.6 A Complete Loop Example
Every morning an automation runs: it reads CI failures, open issues, and recent commits, writes findings to a markdown or Linear board, spawns isolated worktrees, drafts fixes via a sub‑agent, reviews drafts with another sub‑agent, opens PRs, updates tickets, and records progress in the state file.
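The draft‑then‑review part of this flow can be sketched as follows. This is a simplified illustration under stated assumptions: make, check, and open_pr are hypothetical callables standing in for the implementation sub‑agent, the reviewing sub‑agent, and the connector to the code host, and worktree setup and ticket updates are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Draft:
    task: str
    patch: str


def run_daily_pass(
    tasks: List[str],
    make: Callable[[str], Draft],      # implementation sub-agent (assumed)
    check: Callable[[Draft], bool],    # separate reviewing sub-agent (assumed)
    open_pr: Callable[[Draft], None],  # connector to the code host (assumed)
) -> Dict[str, List[str]]:
    """Draft a fix per task; only drafts approved by the checker become PRs."""
    progress: Dict[str, List[str]] = {"opened": [], "rejected": []}
    for task in tasks:
        draft = make(task)
        if check(draft):
            open_pr(draft)
            progress["opened"].append(task)
        else:
            progress["rejected"].append(task)
    return progress
```

The returned dictionary is what a loop would then write into its state file, so the next morning's run knows which tasks already have a PR and which were rejected.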
12.7 Cherny’s Real‑World Loops
PR babysitter – monitors open PRs, fixes CI failures, merges safe changes.
CI health – reproduces and fixes flaky/broken tests.
Feedback clustering – aggregates Twitter feedback every 30 minutes.
Idea mining at scale – runs hundreds of Claude instances to surface actionable ideas.
He also cites Jarred Sumner’s “Robo Bun” loop that automates bug reproduction, test creation, code fixing, PR opening, and proof‑gate verification.
12.8 Steinberger’s Loop Practices
Steinberger’s rule: “Whenever you find yourself repeatedly observing, deciding, routing, or verifying for an agent, build a tool that hands that work to the agent.” He splits loops into a vision.md (project constitution) and agents.md (invariants). Examples include issue/PR reaper, maintainer report, video‑driven bug‑fix loop, and auto‑review triggered by agents.md.
12.9 What Loops Can’t Do
Verification still on you
Even with maker‑checker separation and proof gates, the final “done” still requires human confirmation.
Comprehension debt
Fast loops can widen the gap between code in the repo and the engineer’s understanding.
Cognitive surrender
Relying on loops without critical judgment leads to accepting any output.
Runaway loops
AutoGPT’s endless self‑prompting illustrates the danger of missing gates.
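One common guard against runaway loops is a hard cap on iterations and spend. The sketch below assumes a hypothetical step callable that reports whether the work is done and how many tokens it used; the default limits are arbitrary placeholders, not recommendations from the source.

```python
from typing import Callable, Tuple


def run_with_limits(
    step: Callable[[], Tuple[bool, int]],  # returns (done, tokens_used)
    max_iterations: int = 10,
    token_budget: int = 100_000,
) -> str:
    """Repeat step() until it reports done or a hard limit is reached."""
    spent = 0
    for _ in range(max_iterations):
        done, tokens = step()
        spent += tokens
        if done:
            return "done"
        if spent >= token_budget:
            return "stopped: token budget exhausted"
    return "stopped: iteration limit reached"
```

Both limits are enforced by the surrounding code rather than by the model, so the loop stops even if the agent never decides it is finished.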
12.10 Do You Really Need a Loop?
AlphaSignal’s four‑condition test:
Task repeats regularly – amortizes loop setup cost.
Automated verification – tests, linters, builds must reject bad work.
Token budget can absorb waste – loops burn tokens for retries.
Agent already has engineering‑grade tooling – logs, reproducible environments.
If all four conditions hold, building a loop is worthwhile; otherwise, automate only the work that is already ready for automation.
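The four conditions can be written down as a simple checklist. This is only an illustrative encoding; the field names are hypothetical paraphrases of the conditions above, and the judgment behind each boolean still has to come from a person.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class LoopReadiness:
    repeats_regularly: bool
    has_automated_verification: bool
    budget_absorbs_waste: bool
    has_engineering_tooling: bool


def missing_conditions(readiness: LoopReadiness) -> List[str]:
    """Names of the conditions that are not yet met."""
    return [f.name for f in fields(readiness) if not getattr(readiness, f.name)]


def should_build_loop(readiness: LoopReadiness) -> bool:
    """A loop is worthwhile only when all four conditions hold."""
    return not missing_conditions(readiness)
```

Listing the missing conditions, rather than returning a bare yes or no, shows what has to be fixed before a loop would pay off.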
12.11 Minimal Viable Loop
Four components:
One automation (/loop or /goal).
One Skill (SKILL.md) storing project context.
One state file (markdown or board) tracking progress.
One gate (test suite, type check, build) that rejects bad results.
Order: run manually → convert to Skill → wrap in Loop → schedule.
Hands‑on: CI Health Loop
1. Create .claude/skills/ci-health.md with description and rules.
---
name: ci-health
description: Check CI status, fix failing tests, isolate flaky tests
---
## Project context
- test command: npm test
- lint command: npm run lint
- CI config: .github/workflows/ci.yml
## Rules
- only fix failures with clear error messages
- mark flaky tests with @flaky, do not delete
- never touch auth or payment code
- after fix, run full test suite
2. Create ci-health-state.md to record pending and completed items.
# CI Health Loop — State
## Pending
<!-- Loop writes discoveries here -->
## Completed
<!-- PR numbers move here after fix -->
3. Start the loop in Claude Code:
/loop
Prompt: Use ci-health skill to check CI. Read ci-health-state.md for prior work. Fix failing tests, run tests to confirm. Unfixable items go to pending section. Update state file before exit.
Interval: 10 minutes
4. The gate is the test suite itself – a fix is only accepted if the suite passes.
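If the gate needs to be enforced outside the agent, for example in a wrapper script, it could look roughly like the Python sketch below. The function name gate_passes and the ten‑minute timeout are assumptions for illustration; the only rule taken from the text is that a fix is accepted only when the suite passes.

```python
import subprocess
from typing import Sequence


def gate_passes(command: Sequence[str], timeout_seconds: int = 600) -> bool:
    """Return True only if the command exits with status 0 within the timeout."""
    try:
        result = subprocess.run(
            command, capture_output=True, timeout=timeout_seconds
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # A hung or missing test runner counts as a rejection, not a pass.
        return False
    return result.returncode == 0
```

With the skill above, the call would be gate_passes(["npm", "test"]). Treating timeouts and missing commands as failures keeps the gate's default answer at "no", so a broken test runner cannot let a fix through.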
12.12 “It’s Just a Cron Job with a Hat On”
Loops run on cron‑like schedules but differ because each trigger runs a model that decides the next step, checks results, and may continue or stop. The decision logic is dynamic, not a fixed script.
12.13 Loop Engineering in the Book’s Methodology
Loops sit atop Harness, use Skills (Chapter 2), Gates (Chapter 3), Ralph Loop (Chapter 4), gstack (Chapter 5), autoresearch (Chapter 7), Goal Workflow (Chapter 8), and Kanban (Chapter 11) to form a complete AI‑driven development system.
12.14 Chapter Summary
The five primitives – Automations, Worktrees, Skills, Connectors, Sub‑agents – plus persistent state form a trustworthy autonomous loop. The real challenge is giving the loop a reliable “no” (a gate). Without verification, a loop becomes a token‑burning furnace rather than a productive automation.
BirdNest Tech Talk
Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.
