How Claude Code, Codex, and OpenCode Can Cut Token Usage by Up to 80%

The article breaks down token billing, notes that input tokens account for 70-90% of token consumption, and presents concrete techniques (file filtering, context compression, doc-driven prompts, memory caching, plan mode, output trimming, and model switching) across Claude Code, Codex, and OpenCode. It closes with a comparison of savings per tool and a 10-step checklist, with reported savings of up to 80%.

Java Tech Enthusiast

Jun 8, 2026

Token billing model

Total cost = Input Tokens × input price + Output Tokens × output price. Input tokens (commands, conversation history, project files, tool output, system prompts) account for 70‑90% of total consumption, making them the primary optimization target.
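
The billing formula can be sketched in a few lines of Python. The prices and token counts below are made-up values chosen for illustration, not real list prices, and the function name is hypothetical.

```python
def session_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> dict:
    """Apply: cost = input tokens x input price + output tokens x output price."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total": input_cost + output_cost,
        # Share of all tokens that are input tokens.
        "input_token_share": input_tokens / (input_tokens + output_tokens),
    }

# Hypothetical session: 900k input tokens, 100k output tokens,
# at assumed prices of $3 (input) and $15 (output) per million tokens.
before = session_cost(900_000, 100_000, 3.0, 15.0)

# The same session after cutting input tokens by 60% (illustrative figure).
after = session_cost(360_000, 100_000, 3.0, 15.0)

print(f"before: ${before['total']:.2f}, after: ${after['total']:.2f}")
```

Because output tokens are usually priced higher per token, the input share of the bill can be lower than the input share of the token count; the sketch reports both so the two are not confused.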

Claude Code optimization methods

File filtering : create a .claudeignore file (gitignore syntax) in the project root to exclude directories such as node_modules/, dist/, and build/, along with caches and logs. Example: a single interaction drops from 150k tokens to 60k (a 60% reduction).
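
To get a feel for how much an ignore file can remove, the following sketch estimates context size before and after filtering. It is a rough model under stated assumptions: the matcher handles only directory names and file-name globs (not full gitignore semantics), the 4-characters-per-token ratio is a common rule of thumb rather than a real tokenizer, and the project listing is invented.

```python
from fnmatch import fnmatch

# Simplified ignore rules: directory names (trailing slash) and file-name globs.
# This is NOT full gitignore semantics, only enough to illustrate the idea.
IGNORE_PATTERNS = ["node_modules/", "dist/", "build/", "*.log"]

def is_ignored(path: str, patterns: list[str]) -> bool:
    parts = path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory rule: matches if any parent directory has that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern):
            # File rule: glob against the file name only.
            return True
    return False

def estimate_tokens(char_count: int) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return char_count // 4

def context_tokens(files: dict[str, int], patterns: list[str]) -> int:
    """Sum estimated tokens over the files that are not ignored."""
    return sum(estimate_tokens(size) for path, size in files.items()
               if not is_ignored(path, patterns))

# Hypothetical project: path -> size in characters.
project = {
    "src/app.ts": 40_000,
    "src/db/schema.prisma": 8_000,
    "node_modules/react/index.js": 300_000,
    "dist/bundle.js": 200_000,
    "logs/server.log": 52_000,
}

before = context_tokens(project, [])
after = context_tokens(project, IGNORE_PATTERNS)
print(f"{before} -> {after} tokens ({1 - after / before:.0%} saved)")
```

Running the same estimate over a real directory listing is a quick way to see which folders dominate the context before writing the ignore file.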

Context compression : run /compact manually at logical checkpoints. The command accepts optional instructions that steer what the summary keeps, for example

/compact Keep code changes and file paths; discard the analysis process

to keep only diffs and paths. To compress automatically, turn on Auto-compact in the /config menu. Example: 25k → 3k tokens (an 88% reduction).

Document-driven prompts : add a CLAUDE.md at the repository root that describes the tech stack (e.g., Next.js 14 + TypeScript + Prisma + PostgreSQL), the directory layout, and common commands. The model then does not need exploratory cat/find/grep reads, which saves more than 30% of input tokens.

Memory management : store recurring information with /memory, e.g.

/memory The project uses Next.js 14 + TypeScript; API conventions are in docs/api.md

Retrieve it with /memory list instead of pasting it again, which cuts more than 40% of redundant input.

Plan mode : press Shift+Tab to let the model generate an execution plan first; confirm before execution. Reduces trial‑and‑error tokens by >20%.

Output trimming : enable “compact tool output” via /config to strip ANSI colors, progress bars, empty lines, and truncate long logs to error stacks only. Example: npm test output shrinks from 25 k to 2.5 k tokens (≈90% reduction).
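
The same kind of trimming can be applied by hand before pasting a log into a session. The sketch below is one possible filter, not the tool's own implementation: the progress-bar pattern is an assumed shape, the function name is hypothetical, and keeping the tail of the log is a simplification of "error stacks only".

```python
import re

ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
# Assumed shape of a progress-bar line, e.g. "[=====>     ] 50%".
PROGRESS_BAR = re.compile(r"^\s*\[[=#>\-\s.]*\]\s*\d{1,3}%")

def trim_tool_output(raw: str, max_lines: int = 50) -> str:
    """Strip ANSI colors, progress bars, and blank lines; keep the tail."""
    kept = []
    for line in raw.splitlines():
        line = ANSI_ESCAPE.sub("", line).rstrip()
        if not line or PROGRESS_BAR.match(line):
            continue
        kept.append(line)
    # Errors and stack traces usually sit at the end of a test log,
    # so keep only the last max_lines lines.
    return "\n".join(kept[-max_lines:])

# Invented sample resembling colored test-runner output.
raw_log = (
    "\x1b[32mPASS\x1b[0m src/a.test.ts\n"
    "\n"
    "[=====>     ] 50%\n"
    "\x1b[31mFAIL\x1b[0m src/b.test.ts\n"
    "  Error: expected 200, got 500\n"
    "    at login (src/auth.ts:42:11)\n"
)
print(trim_tool_output(raw_log))
```

A stricter variant could keep only lines from the first failure onward; the tail-based cut is used here because it needs no knowledge of the test runner's format.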

Model switching : select a cheaper model for simple tasks (/model haiku) and a more capable one for complex tasks (/model sonnet or /model opus). Per-task cost can drop 30-80%.
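
As a worked example of the saving from model switching, the sketch below routes a task to a model tier and compares input cost against always using the mid-tier model. The routing rule, keyword list, and prices are all assumptions made for illustration; none of them is a Claude Code feature or a real price.

```python
# Hypothetical prices in USD per million input tokens; not real list prices.
PRICE_PER_M_INPUT = {"haiku": 1.0, "sonnet": 3.0, "opus": 15.0}

def pick_model(task: str, files_touched: int) -> str:
    """Toy routing rule (an assumption): small mechanical edits go to the
    cheapest tier, wide-reaching changes go to the most capable one."""
    simple_keywords = ("rename", "format", "typo", "comment")
    if files_touched <= 1 and any(k in task.lower() for k in simple_keywords):
        return "haiku"
    if files_touched <= 5:
        return "sonnet"
    return "opus"

def input_cost(model: str, input_tokens: int) -> float:
    return input_tokens / 1_000_000 * PRICE_PER_M_INPUT[model]

task = "Fix typo in README"
chosen = pick_model(task, files_touched=1)

# Saving relative to running the same 20k-token task on the mid-tier model.
saving = 1 - input_cost(chosen, 20_000) / input_cost("sonnet", 20_000)
print(f"{chosen}: {saving:.0%} cheaper than sonnet for this task")
```

In practice the routing decision is made by the person typing /model; the function only makes the decision rule explicit so its cost effect can be estimated.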

Codex (GitHub Copilot) optimization methods

Limit context files : in VS Code set GitHub Copilot → Max File Context to 3‑5 files, restricting automatic project scanning. Input tokens decrease by >50%.

Command brevity : replace verbose natural-language prompts with short comment directives, e.g. // Node.js Express login endpoint JWT bcrypt instead of a full sentence. Input tokens decrease by >40%.

Disable unnecessary features : turn off real‑time suggestions, auto‑completion, and multi‑file indexing except when explicitly needed. Reduces background token consumption.

File‑by‑file development : keep each feature in a single file to avoid large cross‑file context. Context size can shrink by >60%.

OpenCode (self‑hosted) optimization methods

Precise context limits : edit config.json to set input_limit and output_limit according to the chosen model. Example snippet:

{
  "model": {
    "name": "deepseek-v3",
    "input_limit": 128000,
    "output_limit": 80000
  }
}

Accurate limits avoid automatic truncation and save >30% of tokens.
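
One way to use such limits is a pre-flight check before sending a prompt. The sketch below reads a config with the same shape as the snippet above (field names as given in the article) and tests whether a prompt fits; the reserve parameter and the function name are hypothetical additions.

```python
import json

# Same shape as the article's snippet; values copied from it.
CONFIG_TEXT = """
{
  "model": {
    "name": "deepseek-v3",
    "input_limit": 128000,
    "output_limit": 80000
  }
}
"""

def fits_input_limit(config: dict, prompt_tokens: int, reserve: int = 2000) -> bool:
    """True if the prompt plus a safety reserve stays within input_limit.

    The reserve (an assumption) leaves room for system prompts and
    tool output that are added on top of the user's prompt.
    """
    return prompt_tokens + reserve <= config["model"]["input_limit"]

config = json.loads(CONFIG_TEXT)

for tokens in (100_000, 127_000):
    status = "fits" if fits_input_limit(config, tokens) else "would be truncated"
    print(f"{tokens} tokens: {status}")
```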

File filtering : create a .opencodeignore (same syntax as .gitignore) to exclude dependencies, build artifacts, logs, and binary resources.

Manual context clearing : run /clear periodically to reset history, and start a new session for each unrelated task. This prevents history bloat and removes more than 50% of stale context.
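
The effect of history bloat can be shown with simple arithmetic. The sketch below uses a deliberately simplified model, assuming every request resends the full session history and every turn adds a fixed number of tokens; it ignores prompt caching and automatic compression, and the turn counts are invented.

```python
from typing import Optional

def cumulative_input_tokens(turns: int, tokens_per_turn: int,
                            clear_every: Optional[int] = None) -> int:
    """Total input tokens sent when each request resends the session history.

    Simplified model (assumption): every turn adds tokens_per_turn to the
    history, and each request sends the whole history accumulated so far.
    """
    total = 0
    history = 0
    for turn in range(1, turns + 1):
        history += tokens_per_turn
        total += history
        if clear_every and turn % clear_every == 0:
            history = 0  # equivalent of running /clear
    return total

no_clear = cumulative_input_tokens(turns=40, tokens_per_turn=2_000)
with_clear = cumulative_input_tokens(turns=40, tokens_per_turn=2_000, clear_every=10)
print(f"{no_clear} vs {with_clear} tokens ({1 - with_clear / no_clear:.0%} saved)")
```

Without clearing, total input grows roughly with the square of the number of turns, which is why a periodic reset pays off more in long sessions than in short ones.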

Model selection : use low‑cost open‑source models (Qwen 7B, Llama 3 8B) for simple tasks and higher‑capability models (DeepSeek V3, Qwen Max) for complex work. Unit price can drop 70‑95%.

Comparison of saving dimensions

File filtering : Claude Code 60‑80%, Codex 50%+, OpenCode 60‑80%.

Context compression : Claude Code 50‑88%, Codex relies on short commands & split files, OpenCode 50‑80% via manual clear.

Document‑driven : Claude Code 30‑50%, Codex none, OpenCode 30‑50% with custom README.

Memory solidification : Claude Code 40‑60%, Codex manual copy‑paste, OpenCode 40‑60% via config.

Plan mode : Claude Code 20‑40%, Codex manual task breakdown, OpenCode custom scripts.

Output trimming : Claude Code 70‑90%, Codex short output commands, OpenCode configurable filters.

Model switching : Claude Code 30‑80%, Codex manual plugin swap, OpenCode dynamic config.

Context limit management : Claude Code auto‑managed via /config, Codex fixed IDE setting, OpenCode precise config.json (30%+ saving).

Practical 10‑step token‑saving checklist

Create .claudeignore or .opencodeignore in the project root and list dependencies, build artifacts, caches, and logs in it.

Add CLAUDE.md (or README_OPENCODE.md) describing the tech stack, directory structure, and common commands.

Enable automatic compression in Claude Code by turning on Auto-compact in /config.

For long conversations, run /compact at logical checkpoints and clear history when needed.

Store recurring project configuration using /memory to avoid re‑typing.

For complex tasks, activate Plan Mode (Shift+Tab) to plan before execution.

Switch models per task: use low‑cost models (Haiku, Qwen 7B) for simple work, higher‑tier models (Sonnet, DeepSeek V3) for complex work.

Disable unnecessary auto‑features such as real‑time completion and full‑project scanning.

Develop in separate sessions or files to prevent history bloat.

Regularly check token usage with /usage (Claude) to locate new “black holes”.

Key reminders

Prioritize optimizing Input because it dominates cost.

Prefer over‑exclusion: excluded files can be manually pasted if needed, which is cheaper than automatic scanning.

Promptly clean up long dialogs and multi‑task histories to avoid context inflation.

Match the model to the task; avoid using high‑end models for trivial jobs.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
