Can Claude Code Handle Million‑Line Codebases? Why the Harness Beats the Model

The article breaks down seven common pitfalls when using Claude Code on massive codebases, explains Anthropic’s agentic‑search approach, and shows how a well‑designed harness—including concise CLAUDE.md files, LSP integration, subdirectory launches, hooks, skills, plugins, and MCP servers—outperforms simply upgrading the model.

IT Services Circle

May 27, 2026

Claude Code feels smooth on small personal projects, but in enterprise‑scale repositories with millions of lines of code it quickly runs into context‑overflow issues. Anthropic’s official position is that the problem is not model size but how Claude searches for code.

Q1: Is the context overflow caused by a small model?

Anthropic says swapping models won’t help; the bottleneck is Claude’s ability to locate the right code. Opus 4.7 can handle a 1M‑token window, yet a multi‑million‑line repository exceeds that budget even at a single token per line, so a different strategy is required.

Anthropic calls its approach agentic search: Claude behaves like a real engineer, listing directories (ls), entering folders (auth/), grepping for keywords, and iteratively reading files until it finds the target.

Three reasons justify this design:

Indexes expire – frequent commits make embedding pipelines stale, so a static vector index can return outdated symbols.

Cold‑start cost is near zero – building a RAG index on a million‑line codebase takes minutes, whereas Claude Code works immediately.

Exact matching matters – vector similarity returns related functions (e.g., getUserByName) that are not the exact target.

The key insight is that a good harness matters as much as the model itself.
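The list, grep, and read loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `list_dir`, `grep`, and `find_definition` are hypothetical helper names, `getUserById` is an invented target symbol, and Claude Code itself drives real shell tools through the model rather than running a fixed script like this.

```python
import os
import re
import tempfile

def list_dir(path):
    """The 'ls' step: see what a directory contains before reading anything."""
    return sorted(os.listdir(path))

def grep(root, pattern, extensions=(".py", ".ts", ".js")):
    """The 'grep' step: exact regex search, returning (path, line_no, text) hits."""
    regex = re.compile(pattern)
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            full = os.path.join(dirpath, name)
            with open(full, encoding="utf-8", errors="replace") as fh:
                for line_no, text in enumerate(fh, start=1):
                    if regex.search(text):
                        hits.append((full, line_no, text.rstrip("\n")))
    return hits

def find_definition(root, symbol):
    """Search for the exact definition of `symbol`, not merely similar names."""
    pattern = rf"\b(?:def|function)\s+{re.escape(symbol)}\b"
    return grep(root, pattern)

if __name__ == "__main__":
    # Build a throwaway repository so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as repo:
        os.makedirs(os.path.join(repo, "auth"))
        with open(os.path.join(repo, "auth", "users.ts"), "w", encoding="utf-8") as fh:
            fh.write("function getUserByName(name) {}\nfunction getUserById(id) {}\n")
        print(list_dir(repo))
        for path, line_no, text in find_definition(repo, "getUserById"):
            print(f"{os.path.relpath(path, repo)}:{line_no}: {text}")
```

Because the pattern is anchored with word boundaries, the similar‑looking getUserByName on the first line of the file is not returned, which is the exact‑matching property the third reason refers to. Nothing is indexed ahead of time, so there is no index to go stale.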

Q2: How long should CLAUDE.md be?

Anthropic recommends each CLAUDE.md file stay under 200 lines. Because the file is loaded into Claude’s context at the start of every session, a longer file makes the model more likely to ignore individual instructions. For large projects, split the configuration hierarchically: a root file contains only high‑level policies, while each subdirectory has its own CLAUDE.md with module‑specific rules.

Maintain the file as a living document: regularly ask, “If I delete this line, will Claude still follow the rule?” If the answer is yes, remove the line. Review the file every 3–6 months because model upgrades can render old rules obsolete.
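One way to keep the 200‑line guideline from drifting is a small audit script run in CI or before each review. The sketch below is an assumption about how a team might enforce it, not an Anthropic tool; `audit_claude_md` and `MAX_LINES` are hypothetical names.

```python
from pathlib import Path

MAX_LINES = 200  # the limit the article attributes to Anthropic's guidance

def audit_claude_md(repo_root, max_lines=MAX_LINES):
    """Return (relative_path, line_count) for every CLAUDE.md at or over the limit."""
    root = Path(repo_root)
    oversized = []
    for path in sorted(root.rglob("CLAUDE.md")):
        line_count = len(path.read_text(encoding="utf-8").splitlines())
        # "Under 200 lines" means 199 or fewer, so 200 itself is flagged.
        if line_count >= max_lines:
            oversized.append((path.relative_to(root).as_posix(), line_count))
    return oversized

if __name__ == "__main__":
    for rel_path, count in audit_claude_md("."):
        print(f"{rel_path}: {count} lines (limit is under {MAX_LINES})")
```

A file that shows up in this report is a candidate for splitting into per‑module CLAUDE.md files, following the hierarchical layout described above.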

Q3: Why does Claude often pick the wrong file when searching for a function?

In multi‑language codebases, plain‑text grep returns thousands of matches. By integrating the Language Server Protocol (LSP), Claude can search by symbol instead of string, dramatically reducing false positives. Installing the appropriate language server (e.g., typescript-language-server, pyright, rust-analyzer) takes only a few minutes.
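The difference between string search and symbol search can be shown on a single file. The sketch below is not LSP: it uses Python's standard `ast` module as a stand‑in for what a language server does across a whole project, and `SOURCE`, `text_matches`, and `symbol_definitions` are hypothetical names chosen for the illustration.

```python
import ast

# A tiny module in which the word "charge" appears in several non-definition places.
SOURCE = '''
def charge(amount):
    """Apply a charge to the customer."""
    return amount

def refund(amount):
    # undo a charge
    log("charge reversed")
    return -amount
'''

def text_matches(source, name):
    """Plain-text search: every line containing the string, as grep would report."""
    return [n for n, line in enumerate(source.splitlines(), start=1) if name in line]

def symbol_definitions(source, name):
    """Symbol search: only lines where a function or class called `name` is defined."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and node.name == name
    ]

if __name__ == "__main__":
    print("text hits:  ", text_matches(SOURCE, "charge"))        # [2, 3, 7, 8]
    print("symbol hits:", symbol_definitions(SOURCE, "charge"))  # [2]
```

The text search reports the docstring, a comment, and a log message alongside the definition; the symbol search reports only the definition. Scaled up to a multi‑language repository, that gap is where the thousands of false positives come from.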

Launching Claude from the relevant subdirectory (e.g., cd services/payments && claude) ensures the context loads the nearest CLAUDE.md first, keeping the focus narrow and avoiding loading the entire repository.
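To see why the launch directory matters, it helps to model which CLAUDE.md files sit between a working directory and the repository root. The sketch below is an assumption made for illustration, not Claude Code's actual loader; `claude_md_chain` is a hypothetical function name.

```python
from pathlib import Path

def claude_md_chain(start_dir, repo_root):
    """List CLAUDE.md files from `start_dir` up to `repo_root`, nearest first."""
    start = Path(start_dir).resolve()
    root = Path(repo_root).resolve()
    if start != root and root not in start.parents:
        raise ValueError(f"{start} is not inside {root}")
    chain = []
    current = start
    while True:
        candidate = current / "CLAUDE.md"
        if candidate.is_file():
            chain.append(candidate)
        if current == root:
            break
        current = current.parent  # climb one level toward the root
    return chain

if __name__ == "__main__":
    # Run from any directory inside a repository whose root is the current directory.
    for path in claude_md_chain(".", "."):
        print(path)
```

Called with services/payments as the start directory, the chain contains that module's CLAUDE.md followed by the root file, and never the rules written for sibling services. Launched from the root, only the high‑level root file is on the path.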

Q4: How do you handle massive multi‑file changes without crashes?

Anthropic advises splitting large refactors into multiple sessions and using a subagent to explore the codebase. The subagent works in its own context, produces a concise findings report, and the main agent then executes the plan. This prevents the main context from being exhausted.

For very large migrations, Claude Code provides a built‑in /batch tool that runs parallel subagents in separate Git worktrees, each producing a PR for review.
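The isolation that separate worktrees provide can be reproduced by hand with `git worktree add`, which checks out a new branch into its own directory. The sketch below only shows that underlying Git mechanism; it is not how /batch is implemented, and the `migrate/<unit>` branch naming, the `../worktrees` location, and both function names are hypothetical.

```python
import subprocess

def worktree_commands(units, base_branch="main", worktree_root="../worktrees"):
    """Build one `git worktree add` command per independent unit of work."""
    commands = []
    for unit in units:
        branch = f"migrate/{unit}"          # hypothetical branch naming scheme
        path = f"{worktree_root}/{unit}"    # hypothetical directory layout
        # Form: git worktree add -b <new-branch> <path> <start-point>
        commands.append(["git", "worktree", "add", "-b", branch, path, base_branch])
    return commands

def create_worktrees(units, repo_dir=".", **kwargs):
    """Run the commands inside an existing Git repository."""
    for cmd in worktree_commands(units, **kwargs):
        subprocess.run(cmd, cwd=repo_dir, check=True)

if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe anywhere.
    for cmd in worktree_commands(["billing", "auth"]):
        print(" ".join(cmd))
```

Each unit gets its own branch and working directory, so parallel agents cannot overwrite each other's files, and each branch can become its own PR.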

Q5: How do you roll Claude Code out to a team?

First, turn repeatable tasks into skills (SOPs) that load on demand. Package skills, hooks, and LSP settings into a plugin so new team members can install a single bundle and get the same capabilities as power users.

Distribute plugins via an internal marketplace, allowing anyone to update to the latest best‑practice bundle.

Q6: How does Claude Code’s creator, Boris Cherny, use it daily?

Boris runs multiple Claude instances in parallel, uses the /permissions command instead of bypassing safety checks, starts every complex task in Plan Mode before switching to auto‑accept, attaches a PostToolUse hook to auto‑format generated code, creates slash commands for frequent actions (e.g., /commit-push-pr), and keeps a shared CLAUDE.md in Git that is updated whenever Claude makes a mistake.
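An auto‑format hook of the kind mentioned above can be a small script that reads the hook payload and runs a formatter on the edited file. The sketch below is not Boris's actual hook. It assumes the PostToolUse payload arrives as JSON on stdin with `tool_name` and `tool_input.file_path` fields, which should be verified against the current hooks documentation, and the `FORMATTERS` mapping is a hypothetical choice of tools.

```python
import json
import shutil
import subprocess
import sys
from pathlib import Path

# Hypothetical mapping from file extension to formatter command; adjust to your stack.
FORMATTERS = {
    ".py": ["black", "--quiet"],
    ".ts": ["prettier", "--write"],
    ".tsx": ["prettier", "--write"],
    ".go": ["gofmt", "-w"],
}

def formatter_command(payload):
    """Return the formatter argv for an edited file, or None if nothing applies."""
    # Assumed payload shape: {"tool_name": ..., "tool_input": {"file_path": ...}}
    if payload.get("tool_name") not in {"Edit", "Write"}:
        return None
    file_path = payload.get("tool_input", {}).get("file_path")
    if not file_path:
        return None
    base = FORMATTERS.get(Path(file_path).suffix)
    return base + [file_path] if base else None

def main():
    # Skip quietly when there is no piped input (for example, when run by hand).
    raw = sys.stdin.read() if sys.stdin and not sys.stdin.isatty() else ""
    if not raw.strip():
        return
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return
    cmd = formatter_command(payload)
    # Only run formatters that are installed, and never fail the hook over formatting.
    if cmd and shutil.which(cmd[0]):
        subprocess.run(cmd, check=False)

if __name__ == "__main__":
    main()
```

Keeping the decision logic in `formatter_command` separate from the I/O in `main` makes the hook easy to unit test before it is registered for the whole team.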

Q7: What projects are unsuitable for Claude Code?

Claude Code excels on "Git + engineer + standard directory" projects. It struggles with binary‑heavy game engines, non‑Git version control systems (Perforce, Subversion), and repositories primarily edited by non‑engineers (designers, product managers).

In such cases, Anthropic’s Applied AI team can provide custom integrations.

In summary, the three takeaways are:

Invest in a solid harness—CLAUDE.md, subdirectory launches, and LSP—before expecting the model to scale.

Keep CLAUDE.md under 200 lines, load Claude from the relevant subdirectory, and enable LSP for precise symbol search.

Use plans, subagents, plugins, skills, and a dedicated maintainer to keep the system evolving and reliable.

Images illustrating the concepts are included throughout the original article.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
