Field guide

Anthropic's playbook for Claude Code in million-line codebases

Anthropic published its observed best practices for running Claude Code across monorepos, legacy systems, and multi-repo enterprises. The headline argument: the harness around the model matters more than the model itself.

By the Promptwire desk

Builders, integrators, prompt engineers · 3 min read

Anthropic's enterprise team has published a field note on what actually works when Claude Code is deployed against multi-million-line monorepos, decades-old legacy systems, and microservice sprawls. The piece is part of a new "Claude Code at scale" series, and it reads less like marketing and more like a postmortem of what their largest customers got right.

The central claim deserves a pause: the harness, not the model, determines performance in large codebases. Anthropic lays out five extension points plus two capabilities, and argues that teams should adopt them roughly in the order listed below.

The seven pieces

  • CLAUDE.md files — context files Claude reads at session start. Anthropic recommends a lean root file with pointers and gotchas, plus subdirectory files for local conventions. They load additively as Claude walks the tree.
  • Hooks — scripts that fire on events. The post argues their underrated use is self-improvement: a stop hook can propose CLAUDE.md updates while context is fresh; a start hook can dynamically load per-module setup. Linting and formatting are called out as belonging in hooks, not prompts.
  • Skills — packaged expertise loaded on demand via progressive disclosure. Skills can be scoped to specific paths so, for example, a payments team's deploy skill only activates inside that directory.
  • Plugins — bundles of skills, hooks, and MCP configs distributed via managed marketplaces, so a new hire gets the same setup on day one as veterans.
  • LSP integrations — surface the language server's "go to definition" and "find all references" to Claude for symbol-level precision. Anthropic cites an enterprise software customer that rolled LSP out org-wide before the Claude Code launch specifically to make C and C++ navigation reliable.
  • MCP servers — connections to internal tools, docs, ticketing, analytics. The most sophisticated teams reportedly expose structured search as an MCP tool.
  • Subagents — isolated Claude instances with their own context windows for exploration tasks that then hand findings back to the parent.
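To make the hooks bullet concrete, here is a minimal sketch of a formatting hook script in Python. It assumes the hook receives a JSON event on stdin with the edited file's path under `tool_input.file_path`; verify that payload shape against the current hooks documentation before relying on it. The `FORMATTERS` table and the `formatter_for` helper are hypothetical names chosen for illustration.

```python
import json
import subprocess
import sys
from pathlib import Path
from typing import List, Optional

# Hypothetical mapping from file suffix to a formatter command.
FORMATTERS = {
    ".py": ["black", "--quiet"],
    ".go": ["gofmt", "-w"],
    ".rs": ["rustfmt"],
}


def formatter_for(path: str) -> Optional[List[str]]:
    """Return the full formatter command for a file, or None if unknown."""
    base = FORMATTERS.get(Path(path).suffix)
    return [*base, path] if base else None


def main(stream=sys.stdin) -> int:
    # Assumed payload shape: {"tool_input": {"file_path": "..."}}
    event = json.load(stream)
    path = event.get("tool_input", {}).get("file_path")
    cmd = formatter_for(path) if path else None
    if cmd is None:
        return 0  # no path or no formatter for this file type
    # Run the formatter and pass its exit code back to the caller.
    return subprocess.run(cmd, check=False).returncode
```

A real hook script would finish by calling `sys.exit(main())`. The point of the design is that formatting becomes deterministic: it runs on every edit regardless of what the prompt says, so no context is spent reminding the model to do it.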

Why no index

The post takes a direct shot at RAG-based coding tools: embedding pipelines can't keep up with active engineering teams, so retrieval returns functions renamed two weeks ago or modules deleted last sprint. Claude Code instead does agentic search (grep, file traversal, reference following) against the live codebase on the developer's machine. The post is candid about the tradeoff: without enough starting context, broad queries blow the context window before useful work begins.
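A rough sketch of the idea in Python, not Claude Code's actual search: the function below scans the files as they exist at call time, so a renamed function simply stops matching, and it caps the number of hits so a broad pattern cannot flood the caller. The names `live_search`, `SKIP_DIRS`, and `max_hits` are hypothetical.

```python
import os
import re
from typing import List, Tuple

SKIP_DIRS = {".git", "node_modules", "build", "dist"}  # illustrative defaults


def live_search(root: str, pattern: str, max_hits: int = 50) -> List[Tuple[str, int, str]]:
    """Scan files under root right now and return up to max_hits matches."""
    regex = re.compile(pattern)
    hits: List[Tuple[str, int, str]] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if regex.search(line):
                            hits.append((path, lineno, line.rstrip()))
                            if len(hits) >= max_hits:
                                return hits  # stop before flooding the caller
            except OSError:
                continue  # unreadable file; skip it
    return hits
```

There is no index to rebuild, which is the freshness argument. The hit cap stands in for the context budget: the narrower the starting directory and pattern, the less of that budget is spent before real work starts.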

Configuration patterns worth stealing

Anthropic flags three repeated patterns from successful deployments: keep CLAUDE.md lean and layered; initialize Claude in subdirectories, not the repo root (it walks up the tree automatically); and scope test and lint commands per subdirectory so Claude doesn't burn its window on irrelevant output. They note this is harder in compiled-language monorepos with deep cross-directory dependencies.
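The layering behind the first two patterns can be illustrated with a short Python sketch. This is an approximation of the idea, not Claude Code's loader: starting from the directory a session is opened in, it collects every context file on the path up to the repo root and returns them outermost first. The function name `context_files` and the example paths are hypothetical.

```python
from pathlib import Path
from typing import List


def context_files(start: Path, repo_root: Path, name: str = "CLAUDE.md") -> List[Path]:
    """Collect context files from repo_root down to start, outermost first."""
    start, repo_root = start.resolve(), repo_root.resolve()
    if start != repo_root and repo_root not in start.parents:
        raise ValueError("start must be inside repo_root")
    # Directories from start upward, cut off at the repo root (inclusive).
    chain = [start, *start.parents]
    chain = chain[: chain.index(repo_root) + 1]
    # Reverse so general guidance comes before local conventions.
    return [d / name for d in reversed(chain) if (d / name).is_file()]
```

Called with a start directory such as `services/payments`, the sketch returns the root file followed by any file in `services/` and then the one in `services/payments/`, while files belonging to sibling teams are never picked up. That is the argument for a lean root file: it is the one layer every session pays for.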

The post also recommends committing permissions.deny rules in .claude/settings.json so generated files, build artifacts, and vendored code are excluded uniformly across the team.
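As a sketch of what that committed file might contain, the Python below builds a `permissions.deny` fragment and serializes it to JSON. The `Read(...)` rule form and the glob paths are assumptions for illustration; confirm the rule syntax against the current settings documentation. `deny_settings` is a hypothetical helper.

```python
import json
from typing import Dict, Iterable, List


def deny_settings(excluded_globs: Iterable[str]) -> Dict[str, Dict[str, List[str]]]:
    """Build a settings fragment that denies reads for each glob."""
    return {"permissions": {"deny": [f"Read({g})" for g in excluded_globs]}}


# Hypothetical paths: build output, vendored code, generated sources.
settings = deny_settings(["./dist/**", "./vendor/**", "./**/*.generated.ts"])

# Text suitable for writing to .claude/settings.json and committing.
settings_json = json.dumps(settings, indent=2)
```

Generating the fragment from one list keeps the exclusions reviewable in a single place, and committing the result means everyone on the team gets the same exclusions without local setup.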

One sleeper line: Claude Code reportedly performs better than expected on C, C++, C#, Java, and PHP — languages teams don't usually associate with AI coding tools. We'd want to see independent benchmarks before taking that at face value, but it tracks with where enterprise engineering actually lives.

The through-line is that adoption at scale is an investment in setup, not a model upgrade. Teams hoping to skip the harness work and ride the next checkpoint are likely to be disappointed.