Treating Claude as Engineering Infrastructure
Most teams bolt Claude on. The ones getting real value designed a system around it.
A few months ago I reviewed a PR that should have taken thirty minutes to understand. It was 600 lines across eight files, described as “refactored the integration layer.” The developer had used ChatGPT to implement it. When I asked about tests, there weren’t any. When I asked why the handler was now calling three services it hadn’t touched before, they weren’t sure — ChatGPT had structured it that way and it seemed to work.
That’s what bolting AI on looks like in practice. The individual got faster. The output got harder to review, harder to trust, and harder to maintain. The team absorbed the cost.
The teams I’ve seen get real value from AI-assisted development didn’t just adopt a new tool — they designed a system around it. The same discipline they bring to CI, code review, and onboarding. Not because Claude requires hand-holding, but because teams do.
The setup most teams skip
Claude has no institutional memory by default. Every session starts cold — no context about your codebase, no knowledge of your conventions, no awareness of the decisions made three months ago. Without deliberate setup, every engineer on the team is having a slightly different experience, and inconsistency is the dominant pattern.
The other thing teams skip: changing their process. If your existing workflow has weak test coverage, no architecture documentation, and code review that’s mostly a rubber stamp — adding Claude to that workflow doesn’t fix it. It accelerates it.
What effective teams do is treat AI workflow as infrastructure. It has an owner, it gets maintained, it’s documented, and it’s enforced at the tool level — not left to each engineer’s discretion.
CLAUDE.md — your team’s shared brain
CLAUDE.md is a markdown file that Claude reads at the start of every session. It’s your team’s standing brief — the things Claude needs to know about your project that aren’t obvious from reading the code.
The Claude Code documentation describes it as the place to encode “coding standards, architecture decisions, preferred libraries, and review checklists.” That’s the floor. The ceiling is anything you’d tell a new engineer on day one.
I’ve seen teams skip this and then spend weeks wondering why Claude keeps introducing the same anti-patterns. The CLAUDE.md is cheap to write and expensive to not have. Run the /init command inside any project and Claude will generate a starter file by reading your codebase — edit it down from there. A minimal, useful one looks like this:
```markdown
# Project: Payments API

## Architecture
- Monolith split into bounded contexts under `src/contexts/`
- Each context owns its models, services, and tests — no cross-context imports
- Background jobs go through Sidekiq, not inline in the request cycle

## Conventions
- All new endpoints require request and response schemas (see `docs/schemas/`)
- Use `Result<T, E>` pattern for service layer — no raw exceptions bubbling up
- Tests use RSpec; prefer `let!` over `before` for readability

## What not to do
- Do not add gems without updating `docs/dependencies.md`
- Do not write raw SQL — use the query object pattern in `app/queries/`
- Do not modify migration files that have already run in production

## Review checklist before committing
- Does the change have tests?
- Does it update the relevant schema docs?
- Does it follow the Result pattern?
```
One practical constraint: keep CLAUDE.md under 200 lines. Files longer than that consume too much context window, and instruction adherence actually drops as the file grows.
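A guideline is easier to keep when something checks it. Here is a minimal sketch of a line-count guard you could call from CI or a pre-commit hook; the function name and warning text are illustrative, and only the 200-line threshold comes from the guideline above:

```bash
# Warn when a CLAUDE.md grows past the 200-line guideline.
check_claude_md_length() {
  local file="$1" limit="${2:-200}"
  local lines
  lines=$(wc -l < "$file")
  if [ "$lines" -gt "$limit" ]; then
    echo "warn: $file is $lines lines (guideline: $limit)"
    return 1
  fi
  echo "ok: $file is $lines lines"
}
```

Wire it into whatever gate you already trust — the point is that the constraint is checked, not remembered.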
That’s the first layer — project context: what Claude needs to know about the codebase. Most teams stop here. The teams that get consistent results add a second layer: workflow instructions — how Claude should approach work.
```markdown
## Core principles
- Make every change as simple as possible — minimal impact, minimal code
- Find root causes; no temporary fixes
- Changes should only touch what's necessary

## Workflow
- Enter plan mode for any non-trivial task (3+ steps or architectural decisions)
- If something goes sideways, stop and re-plan immediately — don't keep pushing
- Never mark a task complete without proving it works
- Before committing: run tests, check logs, ask yourself "would a staff engineer approve this?"

## Subagent strategy
- Use subagents to keep the main context window clean
- Offload research, exploration, and parallel analysis to subagents
- One task per subagent for focused execution
```
The distinction matters: project context tells Claude what exists. Workflow instructions tell Claude how to behave. Most CLAUDE.md drift problems — where Claude keeps doing the thing you asked it not to do — come from only writing the first layer.
Scoping in monorepos
For monorepos, a single root CLAUDE.md isn’t enough — different parts of the repo have different conventions. Claude reads both the root file and any CLAUDE.md found in subdirectories, so you can scope instructions to where they apply:
```
your-repo/
├── CLAUDE.md              ← global: repo structure, git conventions, shared tooling
├── apps/
│   ├── api/
│   │   └── CLAUDE.md      ← API-specific: endpoints, auth patterns, schema rules
│   └── web/
│       └── CLAUDE.md      ← frontend: component structure, state conventions
└── packages/
    └── shared/
        └── CLAUDE.md      ← shared lib: what's public API, what's internal
```
Subdirectory files add to the root — they don’t replace it. Whatever is in the root CLAUDE.md is always in scope. This means the root is the right place for workflow instructions and cross-cutting conventions (test patterns, commit hygiene, shared tooling), and subdirectory files carry the project-context specifics for that package: its architecture, what not to touch, which patterns apply only there.
If you’d rather not scatter CLAUDE.md files across the tree, .claude/rules/ offers an alternative — separate topic files at the root that Claude loads automatically, organized by concern rather than by directory.
Who owns it
Treat CLAUDE.md like your README — it lives in the repo, it gets reviewed in PRs, and it gets updated when conventions change. On a larger team, whoever owns architectural decisions owns CLAUDE.md. Solo or small team, you own it.
One practical rule: any time a decision is made in a PR that you’d want Claude to respect going forward, add it to CLAUDE.md in that same PR. Keep the cost of updating it at zero.
The ownership question scales with team size. A three-person startup doesn’t need a designated CLAUDE.md steward — just the habit of treating it as a living document. A twenty-person org does need someone whose job it is to keep it from drifting.
The self-improvement loop
One of the more underused patterns: a tasks/lessons.md file that Claude updates after any correction. This requires the CLAUDE.md setup shown below — it’s not automatic. But once wired in, when you correct a mistake, Claude adds a rule to prevent that exact mistake in future sessions and reads the file before doing anything else at session start.
```markdown
# Lessons
- Never modify migration files in place — create a new migration instead
- When adding a new endpoint, update the schema doc in `docs/schemas/` first
- The `Result<T, E>` pattern applies to all service layer methods, not just ones that might fail
```
The lessons file is committed and accumulates as you work. Every correction becomes a standing instruction that applies to every future session. This directly addresses the “Claude keeps making the same mistake” problem, which is otherwise solved only by repeating yourself.
Wire it into your CLAUDE.md workflow section:
```markdown
## Self-improvement
After any correction from the user: update `tasks/lessons.md` with the pattern.
Write a rule that prevents the same mistake from recurring.
Review `tasks/lessons.md` at the start of each session.
```
The full .claude/ folder
CLAUDE.md is the most visible part of the setup, but it sits inside a larger structure. There are actually two .claude/ directories: the project-level folder committed to git, and a global ~/.claude/ folder in your home directory. The project folder holds team configuration — every engineer gets the same rules and commands. The global folder holds personal preferences and machine-local state that applies across all your projects.
```
your-project/
├── CLAUDE.md               ← team instructions, committed
├── CLAUDE.local.md         ← personal overrides, gitignored
└── .claude/
    ├── settings.json       ← permissions + config, committed
    ├── settings.local.json ← personal permissions, gitignored
    ├── commands/           ← custom slash commands
    │   ├── review.md       → /project:review
    │   └── deploy.md       → /project:deploy
    ├── rules/              ← modular instruction files (path-scoped)
    │   ├── code-style.md
    │   ├── testing.md
    │   └── api-conventions.md
    └── agents/             ← subagent personas
        └── code-reviewer.md

~/.claude/
├── CLAUDE.md               ← your global instructions (all projects)
├── settings.json           ← your global hooks and permissions
└── projects/               ← session history + auto-memory per project
```
Team vs. personal config. The committed/gitignored split matters. CLAUDE.md, settings.json, and .claude/commands/ are shared — every engineer gets them. CLAUDE.local.md and settings.local.json are personal — gitignored by default, for individual preferences and permission overrides that shouldn’t apply to everyone. The global ~/.claude/CLAUDE.md is the right place for personal coding principles that should apply regardless of which repo you’re working in.
Modular rules. A single CLAUDE.md works for small projects. As conventions grow, .claude/rules/ lets you split by concern — separate files for code style, testing conventions, and API patterns, each loaded automatically. The real power: add a paths: field in YAML frontmatter to scope a rule file to specific directories. A rule with paths: ["src/api/**/*.ts"] only loads when Claude is working in that path — it stays out of context while Claude edits a React component. Rules without a paths field load unconditionally every session.
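As a sketch, a scoped rule file: the paths frontmatter field is the one described above, while the glob and the rules themselves are illustrative, borrowed from the Payments API example earlier.

```markdown
---
paths:
  - "src/api/**/*.ts"
---

# API conventions

- Every endpoint handler returns a typed `Result` — no bare throws
- Request and response schemas live in `docs/schemas/` and are updated first
- Auth middleware is applied at the router level, never per-handler
```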
Custom slash commands. .claude/commands/ turns shared workflows into slash commands. A file at .claude/commands/review.md becomes /project:review — available to every engineer, prompted identically, committed to the repo. Useful for review checklists, deploy runbooks, and diagnostic sequences you’d otherwise write from scratch each time.
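As a sketch, a review command built from the checklist in the CLAUDE.md example above; the exact wording is illustrative.

```markdown
<!-- .claude/commands/review.md → /project:review -->
Review the current diff before commit:

1. Does the change have tests, and do they pass?
2. Does it update the relevant schema docs in `docs/schemas/`?
3. Does it follow the `Result<T, E>` pattern in the service layer?
4. Would a staff engineer approve this without questions?

Report each item as pass/fail with a one-line justification.
```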
The individual engineer’s productive loop
The workflow pattern that works consistently isn’t “describe what you want and review the output.” It’s a tighter loop with explicit checkpoints.
1. Plan before touching code
Before any file is opened or edited, explore the problem. Claude Code’s plan mode is designed for exactly this — read-only exploration of the codebase to surface questions and agree on an approach before anything changes.
Describe the problem, ask Claude to trace through the relevant code and propose an approach, review the proposal, push back on anything that doesn’t fit your constraints. Then implement.
The cost of a bad implementation plan is much higher than the cost of a longer planning conversation.
2. Write the tests first
Test-driven development is the single most effective pattern for agentic coding, and the reason is mechanical: failing tests give Claude unambiguous feedback. A test either passes or it doesn’t. There’s no room for “it looks right.”
The workflow:
1. Write the tests based on expected inputs and outputs
2. Confirm they fail (red)
3. Commit the failing tests — this is your checkpoint
4. Ask Claude to implement until the tests pass (green)
5. Review the diff before moving on
Step 3 matters more than it looks. Claude will sometimes pass tests by modifying them rather than fixing the implementation. Committing first makes that visible and gives you a clean rollback point.
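Assuming a git repo and a test command of your choosing, the red checkpoint in step 3 can be sketched as a small helper. The red_checkpoint name and the commit message are hypothetical; substitute your real test runner.

```bash
# Commit failing tests before asking Claude to implement (step 3, "red").
# Refuses to commit if the tests already pass — that would mean there is
# nothing for the implementation step to turn green.
red_checkpoint() {
  local msg="$1"; shift
  if "$@"; then
    echo "refusing to commit: tests already pass (expected red)" >&2
    return 1
  fi
  git add -A && git commit -m "$msg"
}

# Usage (assuming an npm project):
#   red_checkpoint "red: failing tests for refund flow" npm test
```

The checkpoint is what makes test-tampering visible later: any diff that touches the committed test files shows up against it.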
3. Small, verifiable diffs
Don’t ask Claude to implement a feature end-to-end and review the entire output at once. Break it into steps, verify each one before the next, and commit checkpoints as you go.
A 600-line diff touching eight files is hard for any engineer to review. The 600-line PR I described at the start of this post was the result of skipping this step.
Before moving to the next step, apply a simple heuristic: would a staff engineer approve this without questions? Run the tests, check the logs, diff the behavior against what was there before. If you’d feel uncomfortable putting it up for review right now, it’s not done. This is a useful thing to put in your CLAUDE.md workflow section explicitly — Claude will apply it as a self-check before declaring anything complete.
4. Context hygiene
Long sessions degrade. A conversation that started with one problem and accumulated tangents, failed attempts, and revised requirements has noisy context — and Claude’s output reflects that noise.
Use /clear when switching to a new problem. Start fresh sessions for unrelated tasks. The session is not the project; the project is the codebase.
Anthropic’s engineering team writes about this as context pollution — the accumulation of irrelevant or contradictory information that degrades an agent’s coherence over time. The fix is the same as in any engineering system: garbage collection.
One concrete way to manage context is to treat subagent use as a deliberate strategy rather than an afterthought. The general pattern: use subagents to keep the main context window clean; offload research, exploration, and parallel analysis; give each subagent one task for focused execution. .claude/agents/ takes this further — named, isolated personas like code-reviewer or security-auditor that operate with their own context and only see what they need for that role.
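A persona file is a markdown system prompt with YAML frontmatter. This sketch uses the name, description, and tools fields from the subagent documentation; the review rules are lifted from the Payments API example earlier, so adjust both to your project:

```markdown
---
name: code-reviewer
description: Reviews diffs against project conventions. Use after any non-trivial change.
tools: Read, Grep, Glob
---

You are a code reviewer for this repository. Read CLAUDE.md first.
Check the diff for missing tests, violations of the Result pattern,
and raw SQL outside `app/queries/`. Report findings as a prioritized
list. Do not edit files.
```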
What to delegate, what to own
The most important skill in an AI-assisted team is knowing where the delegation boundary is.
Safe to delegate fully — well-defined tasks with clear acceptance criteria that can be verified mechanically:
- Writing tests for untested code
- Fixing lint errors across a codebase
- Writing migration scripts for schema changes
- Updating dependencies and resolving conflicts
- Writing release notes from a git log
- Translating code between similar patterns (e.g., converting callbacks to async/await)
- Documentation for code that already exists
Delegate with oversight — the work is mechanical but the output needs a careful read:
- Implementing a spec that’s already been written and reviewed
- Refactoring to an established pattern with test coverage
- Generating boilerplate from a template
Own it yourself — the cost of a mistake is high, or the requirements are still being discovered as you build:
- Architecture decisions
- Security-sensitive code
- Product decisions embedded in implementation
- Any system where you’d have to explain the trade-offs in a postmortem
The principle: delegate tasks where verification is easier than authorship. Keep the things where judgment is the actual work.
The backlog AI actually unlocks
There’s a category of work that lives on every team’s backlog indefinitely — quality-of-life improvements that are low-value individually, tedious to do, and easy to defer in favor of anything else. Tests for legacy code. Fixing inconsistent error messages. Updating stale docs. Cleaning up dead feature flags.
This is where the delegation math is most favorable. Keep a running list, batch it, and schedule a session to clear it. The work was always real; it just wasn’t worth interrupting everything else for.
Enforcement over instruction
Instructions in CLAUDE.md are followed most of the time. Most of the time is not good enough for standards that actually matter.
Claude Code hooks let you run shell commands before or after specific actions — file edits, commits, tool calls. The distinction from instructions is fundamental: a hook is not a suggestion Claude can misread or skip, it’s a shell command that runs unconditionally.
Hooks receive context as JSON on stdin, parsed with jq. The settings file points to scripts:
```json
// .claude/settings.json
{
  "$schema": "https://json.schemastore.org/claude-code-settings.json",
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "~/.claude/hooks/lint-on-edit.sh" }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/test-before-commit.sh"
          }
        ]
      }
    ]
  }
}
```
The scripts read tool input from stdin:
```bash
#!/bin/bash
# ~/.claude/hooks/lint-on-edit.sh
# PostToolUse: lint and auto-fix whichever file Claude just edited.
INPUT=$(cat)
FILE=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')
[ -n "$FILE" ] && cd "$CLAUDE_PROJECT_DIR" && npx eslint --fix "$FILE" 2>&1 | head -20
```
```bash
#!/bin/bash
# ~/.claude/hooks/test-before-commit.sh
# PreToolUse: if the command is a git commit, run the tests first and
# block the commit with exit 2 when they fail (exit 1 would not block).
INPUT=$(cat)
CMD=$(echo "$INPUT" | jq -r '.tool_input.command // empty')
if echo "$CMD" | grep -q '^git commit'; then
  npm test 2>&1 || { echo "Tests failed: commit blocked." >&2; exit 2; }
fi
```
Three things worth knowing before you write your first hook:

- Exit codes matter. exit 0 is success, exit 1 is a non-blocking error, and exit 2 is the only code that actually blocks execution — it stops Claude and sends your stderr back for self-correction. Using exit 1 for a security hook is the most common mistake; it logs an error and does nothing.
- PostToolUse can’t undo. The tool has already run by the time PostToolUse fires, so use PreToolUse for anything that needs to prevent an action, not just react to it.
- Stop hooks need a guard. A Stop hook that runs tests and exits with code 2 on failure will loop infinitely — Claude retries, the hook fires again, repeat. Check the stop_hook_active flag in the JSON payload and let Claude stop on the second attempt.
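Putting the exit-code and guard rules together, a Stop hook might look like this sketch. The stop_hook_active field is the one described above; the stop_gate helper and its messages are illustrative, and the actual test run is left as a comment so you can substitute your own:

```bash
#!/bin/bash
# Sketch of a Stop hook with the infinite-loop guard.
# (grep keeps the sketch dependency-free; jq works equally well.)

# Decide whether to let Claude stop. Prints "allow" or "block".
stop_gate() {
  local payload="$1"
  # stop_hook_active is set when this stop was itself triggered by a
  # previous Stop hook failure -- letting it through breaks the loop.
  if printf '%s' "$payload" | grep -q '"stop_hook_active"[[:space:]]*:[[:space:]]*true'; then
    echo "allow"
  else
    echo "block"
  fi
}

# In the real hook file, the payload arrives on stdin:
#   PAYLOAD=$(cat)
#   if [ "$(stop_gate "$PAYLOAD")" = "block" ]; then
#     npm test && exit 0            # tests pass: let Claude stop
#     echo "Run the tests before stopping." >&2
#     exit 2                        # exit 2 blocks; stderr goes back to Claude
#   fi
stop_gate '{"stop_hook_active": true}'
stop_gate '{}'
```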
The rule of thumb: if the standard appears in CLAUDE.md, it’s a convention. If it appears in hooks, it’s enforced. Put in hooks anything you’d enforce in CI.
CI is still the gate
AI-generated code goes through the same pipeline as everything else. Green tests, passing lint, reviewed PR — the process doesn’t change because Claude wrote the code. If anything, the bar for review attention should be higher on large AI-generated diffs, not lower.
Anthropic ships an official GitHub Action (anthropics/claude-code-action) that runs Claude Code inside your pipeline and posts findings directly to pull requests:
```yaml
# .github/workflows/claude-review.yml
name: Claude PR Review
on: [pull_request]
jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: |
            Review this PR against our CLAUDE.md conventions.
            Flag missing tests, security concerns, and architectural issues.
            Post your findings as a PR comment.
```
GitLab CI/CD integration is documented separately at code.claude.com/docs/en/gitlab-ci-cd.
For teams wanting a purpose-built option, PRLens is an open-source AI review tool that goes further: it injects git history, file relationships, and paired test context into the review before submitting to Claude or GPT-4o, then posts inline comments directly to the PR. It’s what I use — disclosure: I built it.
Team conventions for AI-assisted work
Beyond individual workflow, a few team-level conventions make a material difference.
PRs look the same regardless of authorship. An AI-assisted PR has the same requirements as any other: a clear description, passing tests, scoped to one concern, reviewed before merge. The fact that Claude wrote the initial implementation is not relevant to the review.
Pair on the prompts. Pair programming adapts naturally — one engineer drives the conversation with Claude, one reviews the output in real time. This is particularly effective for unfamiliar domains: the reviewer catches things the driver is too close to the problem to notice.
Onboarding still covers the domain. New engineers should understand the system they’re working in, not just how to prompt Claude to implement things in it. The risk of skipping this is subtle — it shows up months later in architectural decisions that don’t fit, and in code reviews where nobody can explain why something was done a certain way.
Decide on transparency, then stick to it — but don’t underestimate how hard this is. Whether to disclose AI-assisted authorship is a genuine open question, and the team dynamics around it are more complicated than a working agreement can fully resolve.
Three things the “just decide” framing misses. First, disclosure changes review behavior — reviewers who know code is AI-generated scrutinize it differently, sometimes more carefully, sometimes less. This isn’t necessarily bad, but it means the team needs to agree not just on whether to disclose, but on what disclosure changes about the review process. Second, most AI-assisted code isn’t binary. Claude scaffolded it, you wrote the logic, Claude caught a bug, you restructured the approach. Where’s the line? A policy of “disclose all AI assistance” becomes unenforceable fast. Be specific about what you’re actually asking people to disclose. Third, career incentives work against disclosure in most organizations — the engineer who ships a large, clean PR gets credit regardless of how it was written. Disclosure introduces friction and potential skepticism. Unless the team actively normalizes AI-assisted work, the incentive is to say nothing, and a policy that runs against incentives isn’t really a policy.
Make the call, document it, apply it consistently. But go in knowing that “consistently” is the hard part, and that you’ll need to revisit it as norms evolve.
The honest conversation about craft
Speed creates a tension worth naming directly.
When generating output is fast and cheap, the incentive to slow down and understand something is weaker. That’s fine for tasks where understanding doesn’t compound — a migration script, a lint pass, translating boilerplate. It’s a real problem for the tasks where depth is what builds the engineer.
I’ve started using a distinction that’s been useful: use Claude to accelerate understanding, not to bypass it. In practice that means: when Claude writes code I need to maintain, I read it carefully before moving on. When I’m in an unfamiliar part of the codebase, I ask Claude to explain what’s happening before asking it to change anything. When I’m learning a new pattern, I implement the first instance myself and delegate the rest.
The “ask Claude to explain, not just write” habit is underused. It costs one extra turn and it’s often more informative than the implementation itself — Claude will surface edge cases, explain the reasoning behind a design choice, and flag what it would do differently given more context. That’s the feedback loop that builds expertise rather than replacing it.
This matters most for engineers who are still building the mental models that make delegation safe — and this is where the concern about AI-assisted development is most serious, and most underacknowledged.
The mechanism is specific: when you struggle with something — a bug you can’t explain, a design that doesn’t fit, a system you don’t understand — you build a mental model. The struggle is the learning. When Claude solves it for you, you skip the struggle and get the output. The output looks the same. The model doesn’t form.
For junior engineers, this creates a failure mode that’s invisible for a long time: they can produce work that looks senior-quality for months, until they encounter something Claude can’t handle well — an ambiguous requirement, a novel system design problem, a postmortem where they have to defend decisions they didn’t fully make. The gap between apparent capability and actual understanding only shows up under stress. By then, the pattern is established.
The problem is structural. AI tools create an incentive gradient that points away from productive struggle. The gain from delegating is immediate and visible. The learning cost is deferred and invisible. For an engineer on a deadline, the rational choice in the moment is always to ask Claude. Repeated across hundreds of decisions over months, this is how you get someone who can ship code they can’t explain.
A few patterns that help — for any engineer, but especially early-career:
The first-instance rule. Implement the first example of any new pattern yourself. Once you understand it, delegate the repetitions. The tenth migration script is mechanical. The first one is how you learn what migrations actually do.
The explanation habit. Before accepting Claude-written code, ask Claude to explain it — not as a check on Claude, but as a check on yourself. If you can’t follow the explanation, you can’t maintain the code. Reading and understanding is the minimum bar, not an optional step.
Distinguish mechanical from learning tasks. Lint fixes, boilerplate, repetitive transformations — delegation is fine, the learning value is zero. A non-trivial algorithm, a system you’ve never touched, a pattern you’re implementing for the first time — these are different. Be intentional about which category you’re in before reaching for Claude.
What it actually takes
None of this is complicated. All of it requires intention.
The teams compounding with AI have a CLAUDE.md they keep current — with both project context and workflow instructions — a .claude/ folder committed and maintained like any other infrastructure, and a lessons file that turns every correction into a standing rule. They have hooks that enforce the standards that matter, TDD as the feedback mechanism for implementation, and a clear sense of what to delegate and what to own. They’ve had the direct conversation about craft. And they’ve agreed on what their PRs should look like regardless of who — or what — wrote the first draft.
The teams waiting for the next model to fix their results are the ones that bolted it on.
References: Claude Code documentation · Effective context engineering for AI agents · anthropics/claude-code-action