Taming AI Coding Agents: A Powerful Development Workflow with Engineering Discipline
This article introduces Matt Pocock's open-source "skills" collection for AI coding agents, shows how it embeds traditional engineering practices such as requirement alignment, domain modeling, TDD, and architecture governance into reusable commands, and walks through an end-to-end partial-refund feature built with these skills.
Overview
Matt Pocock, a well-known figure in the TypeScript community, has open-sourced his personal .claude skills for AI coding agents in the repository mattpocock/skills. The collection has quickly amassed 89K+ stars and 7.8K forks, demonstrating strong community interest.
These skills are not simple prompt templates. They are engineered workflows that bring four core software-engineering disciplines into the AI agent's operation: requirement alignment, domain modeling, test-driven development (TDD), and architecture governance.
Getting Started
Installation is a single command:
    npx skills@latest add mattpocock/skills
After selecting the desired skills and target agent (Claude Code, Codex, Cursor, Copilot, Windsurf, etc.), run the one-time setup:
    /setup-matt-pocock-skills
The setup asks for an issue tracker, label set, and documentation path, then configures the entire suite.
Skill Structure
Each skill is a directory containing a SKILL.md with YAML front‑matter (name, description) and a Markdown workflow body. Optional files include REFERENCE.md, EXAMPLES.md, or a scripts/ folder. The agent discovers skills via the front‑matter description.
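To make the layout concrete, here is a minimal sketch of how a tool might read the front-matter of a SKILL.md. The SkillMeta type, the parseSkillFrontMatter function, and the sample file content are illustrative assumptions, not part of the repository; the reader handles only flat key/value pairs, whereas a real agent would more likely use a full YAML parser.

```typescript
// Hypothetical shape of the metadata an agent reads to discover a skill.
interface SkillMeta {
  name: string;
  description: string;
}

// Minimal reader for flat `key: value` front-matter between two `---` lines.
// Illustration only; it does not implement full YAML.
function parseSkillFrontMatter(source: string): SkillMeta {
  const lines = source.split(/\r?\n/);
  if (lines[0].trim() !== "---") {
    throw new Error("SKILL.md must start with a front-matter block");
  }
  let end = -1;
  for (let i = 1; i < lines.length; i++) {
    if (lines[i].trim() === "---") {
      end = i;
      break;
    }
  }
  if (end === -1) {
    throw new Error("front-matter block is not closed");
  }

  const fields: { [key: string]: string } = {};
  for (let i = 1; i < end; i++) {
    const colon = lines[i].indexOf(":");
    if (colon === -1) continue; // skip lines that are not key/value pairs
    fields[lines[i].slice(0, colon).trim()] = lines[i].slice(colon + 1).trim();
  }
  if (!fields["name"] || !fields["description"]) {
    throw new Error("front-matter needs both name and description");
  }
  return { name: fields["name"], description: fields["description"] };
}

// Made-up sample content, not copied from the repository.
const sampleSkill = [
  "---",
  "name: tdd",
  "description: Test-driven development loop for implementing a feature.",
  "---",
  "",
  "Workflow body goes here.",
].join("\n");

const skillMeta = parseSkillFrontMatter(sampleSkill);
console.log(skillMeta.name + ": " + skillMeta.description);
```

Keeping discovery metadata this small means an agent only needs the description to decide whether a skill applies, and can load the workflow body and optional files later.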
Skills Classification
Engineering (11 skills)
/setup-matt-pocock-skills – one‑time configuration wizard.
/triage – state‑machine issue workflow, labeling bugs/enhancements and moving issues through states such as needs‑triage, needs‑info, ready‑for‑agent, ready‑for‑human, wontfix.
/grill-me – generic questioning engine; most popular skill (157K installs).
/grill-with-docs – enhanced version that consults CONTEXT.md and docs/adr/, challenges terminology, performs pressure testing, cross‑checks code, updates terminology, and creates ADRs when decisions meet three criteria (hard to reverse, lacking context, real trade‑off).
/tdd – full TDD loop (RED → GREEN → REFACTOR) with checklist ensuring tests verify behavior via public interfaces.
/diagnose – disciplined debugging loop: reproduce → minimise → hypothesise → instrument → fix → regression‑test.
/improve-codebase-architecture – applies Ousterhout’s deep‑module theory, uses a “deletion test” to identify shallow modules, then runs an interactive redesign via /grill-with-docs.
/to-prd – synthesises a PRD from the current conversation and submits it as a GitHub Issue.
/to-issues – splits a PRD into vertically sliced, independently deliverable issues.
/zoom-out – provides a high‑level explanation of an unfamiliar code segment.
/prototype – creates disposable prototypes for design validation.
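The state-machine idea behind /triage can be sketched as a transition table. The state names below come from the list above, but the allowed transitions and the moveIssue helper are assumptions made for illustration; the skill itself may permit different moves.

```typescript
// State names as listed for /triage.
type TriageState =
  | "needs-triage"
  | "needs-info"
  | "ready-for-agent"
  | "ready-for-human"
  | "wontfix";

// Assumed transition table: the article names the states but not the edges.
const triageTransitions: Record<TriageState, TriageState[]> = {
  "needs-triage": ["needs-info", "ready-for-agent", "ready-for-human", "wontfix"],
  "needs-info": ["needs-triage", "wontfix"],
  "ready-for-agent": ["ready-for-human"],
  "ready-for-human": [],
  "wontfix": [],
};

// Returns the new state, or throws if the move is not in the table.
function moveIssue(from: TriageState, to: TriageState): TriageState {
  if (triageTransitions[from].indexOf(to) === -1) {
    throw new Error("illegal transition: " + from + " -> " + to);
  }
  return to;
}

// Example: an issue that needs more detail before it can be worked on.
const afterQuestion = moveIssue("needs-triage", "needs-info");
const afterAnswer = moveIssue(afterQuestion, "needs-triage");
console.log(afterAnswer);
```

Encoding the workflow as data makes illegal label changes fail loudly instead of silently leaving an issue in an inconsistent state.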
Productivity (4 skills)
/caveman – ultra‑compressed response mode, reducing token usage by ~75%.
/handoff – packages the current context into a handoff document for seamless continuation in another session or by another agent.
/write-a-skill – scaffolds a new skill with proper YAML front‑matter and progressive disclosure.
Misc (4 tools)
/git-guardrails-claude-code – adds confirmation prompts before dangerous Git commands.
/setup-pre-commit – installs Husky pre‑commit hooks for linting, formatting, type‑checking, and testing.
/migrate-to-shoehorn – migrates TypeScript "as" type assertions to @total-typescript/shoehorn.
/scaffold-exercises – creates a teaching‑exercise directory structure.
Complete Development Loop Example
Phase 1: Requirement Alignment
Input: "Our order system needs partial refunds; currently only full refunds are supported." The agent runs /grill-with-docs, scans CONTEXT.md for existing terms (Order, Refund, LineItem), and asks clarifying questions about terminology, granularity, inventory handling, and payment-gateway interaction.
The answers are written back to CONTEXT.md (adding the term PartialRefund), and an ADR is generated whenever a decision meets all three ADR criteria: hard to reverse, lacking context, and involving a real trade-off.
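The three-criteria ADR threshold can be expressed as a small predicate. This is a sketch under assumptions: the Decision type, its field names, and the two sample decisions are invented for illustration, and "lacking context" is read here as "a future reader could not reconstruct the reasoning without a record".

```typescript
// Hypothetical record of a design decision raised during the grilling session.
interface Decision {
  title: string;
  hardToReverse: boolean; // undoing it later would be expensive
  lacksContext: boolean;  // assumed meaning: the "why" is not obvious from the code
  realTradeOff: boolean;  // a credible alternative was rejected
}

// An ADR is warranted only when all three criteria hold.
function warrantsAdr(decision: Decision): boolean {
  return decision.hardToReverse && decision.lacksContext && decision.realTradeOff;
}

// Made-up examples from the partial-refund scenario.
const granularityDecision: Decision = {
  title: "Refund at LineItem granularity rather than per Order",
  hardToReverse: true,
  lacksContext: true,
  realTradeOff: true,
};
const namingDecision: Decision = {
  title: "Name the new term PartialRefund",
  hardToReverse: false,
  lacksContext: false,
  realTradeOff: false,
};

console.log(warrantsAdr(granularityDecision), warrantsAdr(namingDecision));
```

Requiring all three conditions keeps docs/adr/ small: routine choices stay in CONTEXT.md or the code, and only decisions that are costly to revisit get a record.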
Phase 2: Task Splitting
Running /to-prd creates a structured PRD and submits it as a GitHub Issue. Then /to-issues splits the PRD into vertical slices such as "partial refund per LineItem", "refund amount validation", and "refund status tracking".
Phase 3: Implementation (TDD)
For the "refund amount validation" issue, the agent starts /tdd. It plans the public interface, writes a failing test for over-payment, creates the function refundService.requestPartialRefund(), and implements just enough logic to make the test pass. The RED→GREEN→REFACTOR cycle then repeats for additional edge cases.
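A sketch of where one RED→GREEN pass might end up is shown below. Only the name refundService.requestPartialRefund() comes from the text; the parameter list, the LineItem and Order fields, the use of integer cents, and the result shape are assumptions. "Over-payment" is read here as a refund request that exceeds what is still refundable on the line item.

```typescript
// Assumed domain shapes; amounts are integer cents to avoid float rounding.
interface LineItem {
  id: string;
  paidCents: number;
  refundedCents: number;
}
interface Order {
  id: string;
  lineItems: LineItem[];
}
interface RefundResult {
  ok: boolean;
  reason?: string;
}

const refundService = {
  // Hypothetical signature; the source only names the function.
  requestPartialRefund(order: Order, lineItemId: string, amountCents: number): RefundResult {
    const item = order.lineItems.filter((li) => li.id === lineItemId)[0];
    if (!item) return { ok: false, reason: "unknown line item" };
    if (amountCents <= 0) return { ok: false, reason: "amount must be positive" };
    const refundableCents = item.paidCents - item.refundedCents;
    if (amountCents > refundableCents) {
      return { ok: false, reason: "amount exceeds refundable balance" };
    }
    item.refundedCents += amountCents;
    return { ok: true };
  },
};

// Tiny assertion helper so the checks run without a test framework.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

// The checks go through the public interface only, as the /tdd checklist asks.
const sampleOrder: Order = {
  id: "o-1",
  lineItems: [{ id: "li-1", paidCents: 5000, refundedCents: 0 }],
};
// Written first (RED): refunding more than was paid must be rejected.
check(!refundService.requestPartialRefund(sampleOrder, "li-1", 6000).ok, "over-refund accepted");
// Next cycle: a valid partial refund succeeds.
check(refundService.requestPartialRefund(sampleOrder, "li-1", 2000).ok, "valid refund rejected");
// Next cycle: earlier refunds reduce what is still refundable.
check(!refundService.requestPartialRefund(sampleOrder, "li-1", 3500).ok, "cumulative over-refund accepted");
```

Because the checks never look inside the service, the implementation can be refactored freely in the REFACTOR step without rewriting them.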
Phase 4: Debugging
A production bug (duplicate refunds) triggers /diagnose. The agent guides the user through reproducing the issue, minimising the reproduction steps, hypothesising a race condition, instrumenting the code with logs, fixing the bug with a distributed lock, and adding a regression test.
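The fix and its regression test might look roughly like the following. Everything here is an assumption made for illustration: InMemoryLock is a single-process stand-in for a real distributed lock, issueRefund is a hypothetical handler, and the "concurrent" second request is simulated by a callback that fires while the first request still holds the lock.

```typescript
// Single-process stand-in for a distributed lock. A real deployment would
// use a shared store and an expiry so a crashed worker cannot hold it forever.
class InMemoryLock {
  private held: { [key: string]: boolean } = {};

  tryAcquire(key: string): boolean {
    if (Object.prototype.hasOwnProperty.call(this.held, key)) return false;
    this.held[key] = true;
    return true;
  }

  release(key: string): void {
    delete this.held[key];
  }
}

const refundLock = new InMemoryLock();
const issuedRefunds: string[] = [];

// Hypothetical handler. `whileInFlight` exists only so the test can inject
// a second request at the point where the race used to occur.
function issueRefund(lineItemId: string, whileInFlight?: () => void): boolean {
  const key = "refund:" + lineItemId;
  if (!refundLock.tryAcquire(key)) return false; // another request is in flight
  try {
    if (whileInFlight) whileInFlight();
    issuedRefunds.push(lineItemId);
    return true;
  } finally {
    refundLock.release(key);
  }
}

// Regression test: a duplicate arriving mid-request must be rejected.
const duplicateOutcomes: boolean[] = [];
const firstAccepted = issueRefund("li-1", () => {
  duplicateOutcomes.push(issueRefund("li-1"));
});
console.log(firstAccepted, duplicateOutcomes[0], issuedRefunds.length);
```

Note that a lock alone only guards overlapping requests. A retry that arrives after the first request has finished would still need some idempotency check, which this sketch does not attempt.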
Phase 5: Architecture Governance
After a week, the refund module shows bloated classes. /improve-codebase-architecture scans the code and uses the deletion test to flag RefundCalculator as a shallow module and to spot tightly coupled validators. The agent proposes merging the validators, runs /grill-with-docs to redesign the interfaces, and updates CONTEXT.md and the ADRs accordingly.
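One way to picture the deletion test is shown below. The bodies of RefundCalculator and the validators are invented, since the article names them but does not show their code, and refundRejectionReason is a hypothetical merged replacement. The reading of the deletion test used here is: if a module could be deleted and its logic inlined without callers getting more complicated, it was not hiding anything.

```typescript
interface RefundRequest {
  paidCents: number;
  alreadyRefundedCents: number;
  requestedCents: number;
}

// Before: a shallow module. Its interface is about as complex as its body,
// so deleting it and inlining the subtraction would cost callers nothing.
class RefundCalculator {
  remaining(paidCents: number, alreadyRefundedCents: number): number {
    return paidCents - alreadyRefundedCents;
  }
}

// Before: two validators that callers must remember to run together,
// one of which is coupled to the calculator.
function validatePositive(req: RefundRequest): boolean {
  return req.requestedCents > 0;
}
function validateWithinRemaining(req: RefundRequest, calc: RefundCalculator): boolean {
  return req.requestedCents <= calc.remaining(req.paidCents, req.alreadyRefundedCents);
}

// After: one deeper function. Callers pass a request and get back either
// null (acceptable) or the reason it was rejected.
function refundRejectionReason(req: RefundRequest): string | null {
  if (req.requestedCents <= 0) return "amount must be positive";
  const remainingCents = req.paidCents - req.alreadyRefundedCents;
  if (req.requestedCents > remainingCents) return "amount exceeds refundable balance";
  return null;
}

// The redesign should not change behavior: both versions must agree.
const sampleRequests: RefundRequest[] = [
  { paidCents: 5000, alreadyRefundedCents: 0, requestedCents: 2000 },
  { paidCents: 5000, alreadyRefundedCents: 4000, requestedCents: 2000 },
  { paidCents: 5000, alreadyRefundedCents: 0, requestedCents: 0 },
];
const calculator = new RefundCalculator();
const versionsAgree = sampleRequests.map((req) => {
  const before = validatePositive(req) && validateWithinRemaining(req, calculator);
  const after = refundRejectionReason(req) === null;
  return before === after;
});
console.log(versionsAgree);
```

The merged function has a smaller interface than the three pieces it replaces while hiding the same rules, which is the direction deep-module thinking pushes toward.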
Phase 6: Ongoing Tools
/zoom-out – quickly explains unfamiliar code.
/prototype – builds throw‑away prototypes.
/caveman – reduces token consumption.
/handoff – hands off long tasks to another session.
Why the Methodology Works
The approach amplifies classic software‑engineering principles:
Requirement alignment addresses the "programmer's dilemma" described in The Pragmatic Programmer, which /grill-me tackles by questioning the request before any code is written.
The Ubiquitous Language of Domain-Driven Design is enforced by /grill-with-docs, which keeps CONTEXT.md up to date.
Fast feedback loops from Extreme Programming are realized through the combined /tdd + /diagnose cycle.
Deep-module theory from Ousterhout guides /improve-codebase-architecture to keep modules deep and their interfaces simple.
By packaging decades of engineering discipline into reusable markdown‑based commands, the skills turn AI agents into disciplined collaborators rather than uncontrolled code generators.
FAQ Highlights
Skills work with multiple agents (Claude Code, Codex, Cursor, Copilot, Windsurf).
Compared to static .cursorrules, skills are interactive and stateful.
Minimal onboarding path: /setup-matt-pocock-skills → /grill-with-docs → /tdd.
Team sharing is natural because CONTEXT.md and ADRs live in the Git repository.
Creating a new skill only requires a SKILL.md with proper front‑matter; /write-a-skill scaffolds it.
References
GitHub repository: https://github.com/mattpocock/skills
Distribution platform: https://skills.sh/mattpocock/skills
Matt Pocock’s Skills Newsletter: https://www.aihero.dev/s/skills-newsletter
The Pragmatic Programmer (David Thomas & Andrew Hunt)
Domain‑Driven Design (Eric Evans)
A Philosophy of Software Design (John Ousterhout)
Extreme Programming Explained (Kent Beck)
BirdNest Tech Talk
Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.
