Why Anthropic Caps SKILL.md at Under 5K Tokens and How to Structure Yours

The article explains Anthropic's official 5K‑token limit for SKILL.md files, breaks down the three‑level loading architecture, demonstrates progressive disclosure with concrete token calculations, and provides a step‑by‑step refactoring guide that reduces token usage while improving skill accuracy.

Wu Shixiong's Large Model Academy

1. Official Token Limit

Anthropic's documentation (Agent Skills page) sets a budget for the main SKILL.md body: it must stay under 5,000 tokens. This corresponds to roughly 3,500–4,000 Chinese characters, or 400–600 Markdown lines. A file that exceeds it (e.g., 2,000 lines ≈ 12,000 tokens) is far outside that budget.
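A rough line-count check against this budget can be sketched as follows. The tokens-per-line bounds are assumptions back-derived from the figures above (5,000 tokens over 400–600 lines is roughly 8–12 tokens per line), and `estimate_tokens` and `within_budget` are hypothetical helpers, not part of any Anthropic tooling. A real tokenizer will give different numbers.

```python
# Rough SKILL.md budget check based on line count alone.
# ASSUMPTION: 8-12 tokens per non-blank line, back-derived from
# "5,000 tokens ~ 400-600 lines". A real tokenizer will differ.

TOKEN_BUDGET = 5_000
TOKENS_PER_LINE_LOW = 8    # hypothetical lower bound
TOKENS_PER_LINE_HIGH = 12  # hypothetical upper bound


def estimate_tokens(text: str) -> tuple[int, int]:
    """Return a (low, high) token estimate for the given Markdown text."""
    lines = [line for line in text.splitlines() if line.strip()]
    return len(lines) * TOKENS_PER_LINE_LOW, len(lines) * TOKENS_PER_LINE_HIGH


def within_budget(text: str) -> bool:
    """True only if even the pessimistic estimate stays under the budget."""
    _, high = estimate_tokens(text)
    return high < TOKEN_BUDGET
```

Using the pessimistic bound for the pass/fail decision is a deliberate choice here: a heuristic this coarse should err toward flagging a file early.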

2. Three‑Level Loading Model

The system splits a Skill into three Levels:

Level 1 (Metadata): name + description, loaded at startup, ~100 tokens per Skill.

Level 2 (Instructions): the SKILL.md body, loaded when the Skill is triggered, must be under 5k tokens.

Level 3+ (Resources): referenced files or scripts, loaded on demand. Their size is effectively unlimited, because they cost tokens only when they are actually read.

Only Level 1 and Level 2 count toward the hard token budget; Level 3 files are never read unless explicitly needed.
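The accounting behind the three levels can be sketched in a few lines. This is a minimal model under stated assumptions: the `Skill` class, the skill names, and the body sizes are hypothetical, and only the ~100-token metadata figure comes from the text above.

```python
from dataclasses import dataclass

METADATA_TOKENS = 100  # ~100 tokens per Skill at Level 1 (figure from the text)


@dataclass
class Skill:
    name: str
    body_tokens: int         # Level 2: size of the SKILL.md body (hypothetical values)
    triggered: bool = False  # whether the Skill has been activated


def context_cost(skills: list[Skill]) -> int:
    """Tokens held in context: metadata for every Skill, bodies only for triggered ones."""
    cost = 0
    for skill in skills:
        cost += METADATA_TOKENS
        if skill.triggered:
            cost += skill.body_tokens
    return cost


skills = [Skill("writer", 4_500), Skill("reviewer", 3_000), Skill("publisher", 2_000)]
startup = context_cost(skills)        # metadata only
skills[0].triggered = True
after_trigger = context_cost(skills)  # metadata plus one SKILL.md body
```

Level 3 resources are left out of the model on purpose, since they add cost only when a specific file is read.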

3. Progressive Disclosure

Anthropic calls this staged loading “Progressive Disclosure”: the model only consumes context when required, preserving attention for the current task. This mirrors UI design where primary options are visible and advanced settings are hidden.
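The on-demand behaviour can be illustrated with a small in-memory model. This is a sketch only, not Anthropic's implementation: `LazySkill`, the file names, and the file contents are all hypothetical, and a dictionary stands in for files on disk.

```python
class LazySkill:
    """Hypothetical in-memory model of staged loading (not Anthropic's implementation)."""

    def __init__(self, name: str, description: str, files: dict[str, str]):
        self.name = name
        self.description = description
        self._files = files              # path -> content, standing in for files on disk
        self.in_context: list[str] = []  # what has actually been pulled into context

    def metadata(self) -> str:
        # Level 1: always available, no file is read.
        return f"{self.name}: {self.description}"

    def instructions(self) -> str:
        # Level 2: read only when the Skill is triggered.
        return self._read("SKILL.md")

    def resource(self, path: str) -> str:
        # Level 3: read only when the task actually needs this file.
        return self._read(path)

    def _read(self, path: str) -> str:
        if path not in self.in_context:
            self.in_context.append(path)
        return self._files[path]


skill = LazySkill(
    "article-writer",
    "Drafts and reviews articles.",
    {
        "SKILL.md": "Step 3: review against references/style-guide.md",
        "references/style-guide.md": "Tone and length rules...",
        "references/templates.md": "Article templates...",
    },
)
skill.metadata()
skill.instructions()
skill.resource("references/style-guide.md")
```

After these calls, `skill.in_context` holds only `SKILL.md` and the style guide. The templates file exists in the Skill but never enters the context.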

4. Practical Refactoring Example

The author’s own SKILL_main.md evolved through three versions:

Version 1 (Anti-pattern): a single 1,900-line file (~14k tokens) mixing writing rules, templates, and scripts, which caused token overflow and cross-domain interference.

Version 2: split into four domain-specific files, each still over 1,000 lines, with no reorganization of their internal structure.

Version 3 (Current): a 612-line main file containing only three elements: trigger conditions, workflow skeleton, and reference index. All detailed rules, templates, and scripts reside in separate referenced files.

After refactoring, initial token injection dropped from ~14k to ~4.5k, and end-to-end workflow token usage fell from ~28k to ~12k. Trigger accuracy also improved, because irrelevant details no longer polluted the context.
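The size of these reductions can be worked out directly from the approximate figures above. `reduction_pct` is a hypothetical helper written for this illustration.

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100


initial = reduction_pct(14_000, 4_500)      # initial token injection
end_to_end = reduction_pct(28_000, 12_000)  # whole workflow

print(f"initial injection: {initial:.1f}% lower")    # 67.9% lower
print(f"end-to-end usage:  {end_to_end:.1f}% lower")  # 57.1% lower
```

Since the inputs are rounded estimates, the results are best read as "roughly two thirds" and "a little over half".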

5. When to Split

A practical guideline based on line count (approximate token range):

< 200 lines (<2 k tokens): healthy – no split needed.

200‑500 lines (2‑5 k tokens): approaching limit – consider extracting sub‑files.

500‑800 lines (5‑8 k tokens): over limit – start splitting.

> 800 lines: severely over – refactor immediately using Progressive Disclosure.

Prioritise splitting in this order: first move examples and templates, then detailed rules, and finally long processes into scripts.
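The line-count guideline above maps naturally onto a small lookup function. This is a sketch: `split_advice` is a hypothetical name, and because the ranges in the guideline share their endpoints (200, 500, 800), the boundary handling below is an assumption.

```python
def split_advice(line_count: int) -> str:
    """Map a SKILL.md line count to the guideline's recommendation.

    ASSUMPTION: boundary values (200, 500, 800) are assigned to the
    stricter neighbour only above the threshold, i.e. 500 lines still
    counts as "approaching limit".
    """
    if line_count < 200:
        return "healthy: no split needed"
    if line_count <= 500:
        return "approaching limit: consider extracting sub-files"
    if line_count <= 800:
        return "over limit: start splitting"
    return "severely over: refactor immediately"


for n in (150, 350, 650, 1900):
    print(n, "->", split_advice(n))
```

A check like this could run in a pre-commit hook, but line count is only a proxy. The real constraint is the token budget.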

6. Good vs. Bad References

Effective reference syntax includes a brief description of the file’s purpose, e.g.:

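## Step 3: Review
  Run a review pass against the writing-style conventions; the reference standard is in references/style-guide.md.
  Key checks: whether the opening sets up an interview scenario, whether the piece opens and closes with "I'm Wu Shixiong", and whether the length falls in the 4000-5000 character range.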

Bad references merely list a filename, leaving the model unsure whether to read it:

## Step 3: Review
  See references/style-guide.md for details.

Avoid circular references between sub‑files; each should be indexed only by the main SKILL.md.
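The rule that sub-files should be indexed only by the main SKILL.md can be checked mechanically. The sketch below is an assumption-laden illustration: it assumes reference files are Markdown files under `references/`, and `cross_references` and the sample file contents are hypothetical.

```python
import re

# ASSUMPTION: reference files live under references/ and end in .md.
REF_PATTERN = re.compile(r"references/[\w/-]+\.md")


def cross_references(sub_files: dict[str, str]) -> dict[str, set[str]]:
    """Map each sub-file to the other reference files it mentions.

    An empty result means no sub-file points at another sub-file,
    so circular references between them are impossible.
    """
    found: dict[str, set[str]] = {}
    for path, text in sub_files.items():
        mentioned = set(REF_PATTERN.findall(text)) - {path}
        if mentioned:
            found[path] = mentioned
    return found


files = {
    "references/style-guide.md": "Tone rules. See also references/templates.md.",
    "references/templates.md": "Article templates only.",
}
violations = cross_references(files)
```

Flagging every sub-file-to-sub-file mention is stricter than cycle detection, but it matches the rule as stated and keeps the check trivial.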

7. Interview Answer Blueprint

When asked about SKILL.md size in an interview, follow three steps:

State the official numbers (Level 1 ≈ 100 tokens, Level 2 < 5 k tokens, Level 3 unlimited).

Explain Progressive Disclosure as a token‑budgeting strategy.

Describe the concrete refactoring process (example: reducing a 1,900-line file to 612 lines, cutting initial token injection by roughly 70% and improving trigger accuracy).

8. Skills as a New Engineering Capability (2026)

Building Skills is akin to traditional software modularisation: define clear contracts, isolate responsibilities, prevent circular dependencies, and manage resource budgets (tokens instead of CPU time). Mastery of Skills demonstrates architectural judgment beyond simple prompt‑writing.

Figure: The official three-level loading mechanism for Claude Skills, with the token cost and load timing of each Level.