9 Hard‑Earned Lessons from Anthropic Engineers on Building Claude Code Skills

Anthropic engineers share a detailed, experience‑driven guide that categorises Claude Code Skills into nine types, explains why Skills are folders, highlights the importance of Gotchas, flexible prompts, description triggers, memory, hooks and team distribution, and provides concrete examples for each.

Su San Talks Tech

Skills are folders, not markdown files

The first key shift is to treat a Skill as a directory rather than a single SKILL.md file. The folder can contain scripts, assets, reference documents and configuration files, so the file system itself becomes a tool for context engineering.

For example, if a Skill needs to output a Markdown report, place a template in an assets/ directory and let Claude copy and fill it. Function signatures and usage examples can be stored in references/api.md for Claude to read on demand. This is "progressive disclosure": the Skill tells Claude which files exist, and Claude reads each one at the appropriate moment instead of having everything stuffed into the prompt.
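The folder layout described above can be sketched in a few lines of Python. This is only an illustration: the Skill name report-writer and the file contents are hypothetical, and only SKILL.md, assets/ and references/api.md come from the text.

```python
from pathlib import Path

# Hypothetical contents for illustration. SKILL.md stays short and points at
# the other files, so Claude opens them only when it needs them.
SKILL_FILES = {
    "SKILL.md": (
        "# report-writer\n\n"
        "Write the report by copying assets/report-template.md and filling it in.\n"
        "Read references/api.md only when you need function signatures.\n"
    ),
    "assets/report-template.md": "# {title}\n\n## Summary\n\n{summary}\n",
    "references/api.md": "# API reference\n\n(function signatures and usage examples)\n",
}


def scaffold_skill(root: Path) -> list[str]:
    """Create the Skill folder and return the relative paths written, sorted."""
    for relative_path, content in SKILL_FILES.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
    return sorted(SKILL_FILES)
```

Calling `scaffold_skill(Path(".claude/skills/report-writer"))` would create the three files under that folder.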

9 categories of Skills, unpacked

Anthropic engineers grouped all internal Skills into nine classes. Many teams build only Knowledge and Scaffolding Skills, while Verification and Operations Skills often deliver the most value.

Knowledge & Reference: Guides Claude on correct usage of internal libraries, CLIs or SDKs. Example Skills: billing-lib (edge cases of the billing library), internal-platform-cli (sub‑command usage), frontend-design (design‑system understanding).

Verification: Describes how to test or verify code, typically using Playwright, tmux, etc. Example Skills: signup-flow-driver (automated registration‑onboarding flow with assertions), checkout-verifier (Stripe test‑card checkout verification), tmux-cli-driver (TTY‑based CLI testing).

Data Access: Connects to data stores and monitoring systems. Example Skills: funnel-query (joins events to trace registration‑to‑payment paths), cohort-compare (retention or conversion comparison with statistical significance), grafana (dashboard UID and cluster lookup table).

Automation: Compresses repetitive operations into a single command, often depending on other Skills. Example Skills: standup-post (aggregates tickets, GitHub activity and Slack messages into a formatted daily report), create-ticket (schema validation and post‑creation workflow), weekly-recap (merged PRs, closed tickets and deployment records into a weekly summary).

Scaffolding: Generates boilerplate for specific modules. Example Skills: new-workflow (annotated new service/workflow scaffold), new-migration (database migration template with common pitfalls), create-app (pre‑configured internal app template with auth, logging and deployment settings).

Code Review: Executes quality checks, possibly as Git hooks or GitHub Actions. Example Skills: adversarial-review (dedicated sub‑agent critiques and iterates until only minor issues remain), code-style (enforces style rules Claude normally skips), testing-practices (describes how to write tests and what to test).

Deploy: Pulls, pushes and deploys code, sometimes invoking other Skills for data collection. Example Skills: babysit-pr (monitors PR, retries flaky CI, resolves merge conflicts, enables auto‑merge), deploy- (build → smoke test → gradual rollout with error‑rate comparison → automatic rollback), cherry-pick-prod (isolated worktree, cherry‑pick, conflict handling, PR template).

Debugging: Receives symptoms (Slack messages, alerts, error signatures) and runs a multi‑tool investigation workflow, outputting a structured report. Example Skills: -debugging (maps "symptom → tool → query" for high‑traffic services), oncall-runner (fetches alerts, checks common suspects, summarises findings), log-correlator (given a request ID, pulls logs from all involved systems).

Operations: Executes routine maintenance, especially destructive actions, with protective steps to enforce best practices. Example Skills: -orphans (finds orphaned pods/volumes, posts to Slack, cool‑down, user confirmation, cascade cleanup), dependency-management (the organisation's dependency‑approval workflow), cost-investigation (investigates sudden storage/traffic cost spikes with bucket and query details).

Gotchas are the highest‑value part of a Skill

Each Skill should contain a "Gotchas" section that records real failures Claude has run into while using it. Claude already knows a lot about generic code and libraries, so the value of a Skill lies in pushing Claude's default behaviour toward your specific constraints. Updating the section whenever a new failure mode appears keeps the Skill valuable over time.
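One low-friction way to keep that section current is a small helper that appends each new failure as a bullet. The sketch below is an assumption about how a team might do this, not something the article prescribes; the "## Gotchas" heading text and the bullet format are hypothetical, and the helper assumes Gotchas is the last section of SKILL.md.

```python
from pathlib import Path

# Hypothetical heading text; use whatever heading your SKILL.md already has.
GOTCHAS_HEADING = "## Gotchas"


def add_gotcha(skill_md: Path, gotcha: str) -> None:
    """Append one bullet to the end of the file, adding the heading if absent.

    Assumes the Gotchas section is the last section of SKILL.md.
    """
    text = skill_md.read_text(encoding="utf-8") if skill_md.exists() else ""
    if GOTCHAS_HEADING not in text:
        separator = "\n\n" if text.strip() else ""
        text = text.rstrip("\n") + separator + GOTCHAS_HEADING + "\n"
    if not text.endswith("\n"):
        text += "\n"
    skill_md.write_text(text + f"- {gotcha}\n", encoding="utf-8")
```

Each bullet should describe one concrete failure and the behaviour you want instead, so the section reads as a list of corrections rather than general advice.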

Write flexible commands, not overly specific ones

Over‑specific instructions make Skills brittle. Instead, provide enough information for Claude to act while leaving room for adaptation. For Skills that need user input (e.g., posting a daily report to Slack), store the settings in a config.json file: if the config is missing, Claude asks the user; if it is present, Claude proceeds automatically. Use the AskUserQuestion tool for structured multi‑choice prompts.
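The config.json pattern can be sketched as two helpers: one that reports which settings are still missing (so Claude knows what to ask), and one that saves the answers. The key names slack_channel and timezone are hypothetical; the article names only the config.json file.

```python
import json
from pathlib import Path

# Hypothetical keys for a Slack-posting Skill; the article only names config.json.
REQUIRED_KEYS = ("slack_channel", "timezone")


def missing_settings(config_path: Path) -> list[str]:
    """Return the required keys that still need to be asked for (empty means proceed)."""
    if not config_path.exists():
        return list(REQUIRED_KEYS)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def save_answers(config_path: Path, answers: dict[str, str]) -> None:
    """Merge the user's answers into config.json so later runs skip the questions."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.update(answers)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
```

On the first run `missing_settings` returns every required key; once the answers are saved it returns an empty list and the Skill can run without prompting.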

The description field drives Skill activation

Claude reads every Skill's description at startup to build an index. The description should state *when* the Skill should be used, not just *what* it does. A well‑crafted description acts as a trigger: it makes the Skill easier to discover and reduces the cases where a relevant Skill is never invoked.
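The difference between a "what" description and a "when" description can be made concrete with a rough lint. The phrase list below is a hypothetical heuristic, not a rule Claude applies; the only thing taken as given is that SKILL.md carries `name` and `description` fields in its YAML frontmatter.

```python
# Hypothetical heuristic: phrases that suggest the description names a trigger.
TRIGGER_PHRASES = ("use when", "use this when", "use for", "trigger when")


def states_when(description: str) -> bool:
    """Return True if the description appears to say when the Skill applies."""
    lowered = description.lower()
    return any(phrase in lowered for phrase in TRIGGER_PHRASES)


def render_frontmatter(name: str, description: str) -> str:
    """Render minimal SKILL.md frontmatter.

    Assumes the description contains no characters that need YAML quoting.
    """
    return f"---\nname: {name}\ndescription: {description}\n---\n"


what_only = "Posts a standup summary to Slack."
with_trigger = "Use when the user asks for a daily standup or status update."
```

Here `states_when(what_only)` is false and `states_when(with_trigger)` is true; the second description tells Claude which requests should activate the Skill.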

Give Claude code, let it focus on "what to do"

Place scripts and helper libraries inside the Skill so Claude only composes and decides, rather than rewriting boilerplate. For instance, a data‑analysis Skill can contain helper functions that fetch events; Claude then generates a short script that calls those helpers to answer questions like "What happened last Tuesday?".
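As a minimal sketch of that split, the first part below stands in for helpers shipped inside the Skill, and the last two lines stand in for the short script Claude would compose. Everything here is hypothetical: the Event type, the function names and the in-memory sample data replace whatever real event store a team would query.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    day: date
    name: str


# Stand-in data; a real helper would query the team's event store instead.
_SAMPLE_EVENTS = [
    Event(date(2024, 1, 1), "signup"),
    Event(date(2024, 1, 2), "signup"),
    Event(date(2024, 1, 2), "signup"),
    Event(date(2024, 1, 2), "payment"),
    Event(date(2024, 1, 3), "payment"),
]


def fetch_events(start: date, end: date) -> list[Event]:
    """Return events whose day falls within [start, end], inclusive."""
    return [event for event in _SAMPLE_EVENTS if start <= event.day <= end]


def count_by_name(events: list[Event]) -> dict[str, int]:
    """Count events per event name."""
    return dict(Counter(event.name for event in events))


# The part Claude would write: a few lines that compose the helpers.
tuesday = date(2024, 1, 2)  # 2 January 2024 was a Tuesday
summary = count_by_name(fetch_events(tuesday, tuesday))
```

Claude only has to decide which helpers to call and with which arguments; the fetching logic lives in the Skill and is not regenerated on every run.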

Skills can have their own memory

Skills may store data in their own directory, from a simple append‑only log to a SQLite database, which gives them memory across sessions. The standup-post Skill, for example, maintains a standups.log so the next run can report what changed since the last one. Data kept inside the Skill directory can be lost when the Skill is upgraded, so store persistent data in the stable directory provided by ${CLAUDE_PLUGIN_DATA}.
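An append‑only log of this kind might look like the sketch below. It assumes CLAUDE_PLUGIN_DATA is visible to the script as an environment variable, which the article does not confirm; the JSON-lines format and the field names are also hypothetical.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def data_dir(fallback: Path) -> Path:
    """Prefer the directory named by CLAUDE_PLUGIN_DATA; otherwise use the fallback.

    Assumption: the variable is exposed to scripts as an environment variable.
    """
    root = Path(os.environ.get("CLAUDE_PLUGIN_DATA") or fallback)
    root.mkdir(parents=True, exist_ok=True)
    return root


def last_run(log_path: Path) -> Optional[dict]:
    """Return the most recent log entry, or None if nothing has been recorded."""
    if not log_path.exists():
        return None
    lines = [line for line in log_path.read_text(encoding="utf-8").splitlines() if line.strip()]
    return json.loads(lines[-1]) if lines else None


def record_run(log_path: Path, summary: str) -> None:
    """Append one JSON line; earlier entries are never rewritten."""
    entry = {"at": datetime.now(timezone.utc).isoformat(), "summary": summary}
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
```

A run would call `last_run` first to learn what was reported previously, then `record_run` at the end so the next session has something to compare against.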

Hooks activate only when the Skill is called

Hooks embedded in a Skill are registered only when the Skill is invoked and go away when the session ends, which makes them ideal for "contextual safeguards". Examples include /careful (blocks dangerous commands such as rm -rf, DROP TABLE and kubectl delete in production) and /freeze (restricts file modifications to specific directories during debugging).
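A guard in the spirit of /careful could be sketched as a PreToolUse hook script. This is not Anthropic's implementation: the patterns are illustrative and deliberately crude, and unlike the description above the sketch blocks kubectl delete everywhere rather than only in production. It relies on the hook receiving a JSON event with `tool_name` and `tool_input`, and on exit code 2 blocking the tool call.

```python
import json
import re
from typing import Optional

# Illustrative patterns only; a real guard would need a more careful parser.
DANGEROUS_PATTERNS = [
    ("rm -rf", re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r")),
    ("DROP TABLE", re.compile(r"\bdrop\s+table\b", re.IGNORECASE)),
    ("kubectl delete", re.compile(r"\bkubectl\s+delete\b")),
]


def blocked_reason(command: str) -> Optional[str]:
    """Return a short reason if the command matches a dangerous pattern."""
    for label, pattern in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return f"Blocked by careful mode: command contains '{label}'."
    return None


def main(stdin_text: str) -> tuple[int, str]:
    """Map a hook event to (exit_code, stderr_message); 2 blocks the call."""
    event = json.loads(stdin_text)
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    reason = blocked_reason(command)
    return (2, reason) if reason else (0, "")
```

A thin wrapper would call `main(sys.stdin.read())`, print the message to stderr and exit with the returned code.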

Distributing Skills within a team

Two distribution methods exist: commit Skills to the repository’s .claude/skills folder for small teams, or publish them in an internal plugin marketplace for larger organisations. Anthropic relies on organic sharing—posting Skills to a GitHub sandbox, announcing them in Slack, and letting the community adopt them. Before release, filter out duplicate or low‑quality Skills, and manage dependencies manually by referencing other Skill names.

Monitoring Skill usage

Anthropic records each Skill’s invocation via a PreToolUse hook, enabling identification of popular Skills and detection of under‑used ones—often a sign of a poorly written description.
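A usage log of this kind could be sketched as follows. The article does not show Anthropic's hook, so the event shape is an assumption: the sketch supposes Skill invocations arrive with `tool_name` equal to "Skill" and the Skill's name under `tool_input["skill"]`; adjust both to whatever your hook actually receives.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Optional


def invoked_skill(event: dict) -> Optional[str]:
    """Extract the Skill name from a hook event (assumed field names)."""
    if event.get("tool_name") != "Skill":
        return None
    return event.get("tool_input", {}).get("skill")


def log_invocation(event: dict, log_path: Path) -> bool:
    """Append one JSON line per Skill invocation; return True if logged."""
    name = invoked_skill(event)
    if name is None:
        return False
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps({"skill": name}) + "\n")
    return True


def underused(log_path: Path, installed: list[str], min_calls: int = 1) -> list[str]:
    """Return installed Skills with fewer than min_calls logged invocations."""
    counts: Counter = Counter()
    if log_path.exists():
        for line in log_path.read_text(encoding="utf-8").splitlines():
            if line.strip():
                counts[json.loads(line)["skill"]] += 1
    return [name for name in installed if counts[name] < min_calls]
```

Skills that show up in `underused` are the first candidates for a rewritten description.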

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
