---
title: "AGENTS.md at Scale: Enterprise Guide for 180+ Engineers and Hundreds of Repos"
date: "2026-04-05"
description: "Why most AGENTS.md files make agents worse, the 4-question filter for context files, and how to manage agent context across hundreds of repos"
tags: ["agentic-ai", "agents-md", "enterprise", "context-management", "claude-code", "codex"]
type: research
topic: "Agent Context Management"
author: "Cash"
aiModel: "research"
draft: false
---
AGENTS.md at Scale: Enterprise Guide for 180+ Engineers and Hundreds of Repos
Context: VP Engineering, 180 devs, React + Golang, GCP/K8s/Spanner/Kafka, 10+ deploys/day
Tools: Claude Code, Codex, OpenAI SDK, Claude Agent SDK, GCP Agentic Workloads
The Headlines
- AGENTS.md works - but badly written ones make agents worse. ETH Zurich research: context files reduced task success and made agents slower and more expensive - unless they were well-crafted.
- Most AGENTS.md files are junk drawers. Generic rules, folder structures the agent can see, linter-enforced style. Delete that noise.
- The 4-question filter: For every line, ask: (1) Failure-backed? (2) Tool-enforceable? (3) Decision-encoding? (4) Triggerable? If no to all, delete.
- Centralize what's common, specialize what's unique. Organization-wide template for conventions, repo-specific for architecture and commands.
- Lint your AGENTS.md files. They drift. They rot. They contradict each other. Automate quality checks.
Part 1: Why AGENTS.md Matters (The Research)
The ETH Zurich Study
In February 2026, ETH Zurich published a study of 2,303 agent context files across 1,925 repositories. The findings were stark:
"Context files reduce task success rates compared to providing no repository context, while increasing inference cost by over 20%."
Let that land. Bad context files make agents perform worse than nothing - and cost more.
But there's nuance. The study found that well-crafted context files improved performance by ~4%. The problem: most aren't well-crafted.
What they found:
| Metric | Finding |
|---|---|
| Median length | 335-535 words depending on tool |
| Readability | "Very difficult" (FRE 16-40, academic/legal level) |
| Update frequency | 59-67% modified multiple times |
| Update interval | 22-70 hours (short bursts) |
| Deletions | Minimal (files grow, never shrink) |
Content analysis:
| Category | % of Files |
|---|---|
| Testing | 75.0% |
| Implementation Details | 69.9% |
| Architecture | 67.7% |
| Development Process | 63.3% |
| Build and Run | 62.3% |
| System Overview | 59.0% |
| Security | 14.5% |
| Performance | 14.5% |
The gap is obvious: teams optimize for making agents functional, but few provide guardrails for security or performance.
Source: Agent READMEs: An Empirical Study (arXiv)
The Augment Analysis
Augment analyzed AGENTS.md files across the ecosystem. Their diagnosis:
"Your AGENTS.md has how many instructions? More rules, worse output. Because we don't trust the agent."
The pattern: developers write 200-line AGENTS.md files explaining folder structure because they don't believe the agent can figure it out. But modern agents can see the codebase. They don't need you to explain what's already visible.
What agents can already see:
- Code structure (file tree, imports, dependencies)
- Tech stack (package.json, go.mod, Cargo.toml)
- Existing patterns (by reading the code)
- Git history
- Linter configs
What agents can't see:
- Build and test commands (unless documented)
- Deploy steps
- Team conventions that live in heads, not files
- Why that weird architecture decision was made
- Known gotchas
The mistake: spending AGENTS.md lines on the first list instead of the second - explaining what the agent could discover by reading the repo, while the knowledge that lives only in people's heads stays undocumented.
Source: Augment: Your Agent's Context Is a Junk Drawer
The Vercel Evals
Vercel ran benchmarks on Next.js 16 API tasks. They compared two approaches:
- Skills (on-demand retrieval): Agent has access to docs, retrieves what it needs.
- AGENTS.md (passive context): Compressed docs index in a single file.
Result: Skills produced zero improvement. The agent never bothered to look at the docs.
Then they tried the "dumb" approach: compressed the entire docs index into an 8KB AGENTS.md file. Not full documentation - just an index pointing to retrievable files.
100% pass rate across build, lint, and test.
40KB compressed to 8KB. Perfect score. The dumb approach won.
Lesson: Agents are lazy. They won't retrieve unless you put it in their face. A well-structured index beats comprehensive documentation.
Source: Vercel: AGENTS.md Outperforms Skills
Part 2: What Belongs in AGENTS.md (The Filter)
The 4-Question Test
From Jan-Niklas Wortmann's analysis - he cut 80+ lines of rules down to 30 and saw "dramatically better behavior":
For each line in your AGENTS.md, ask:
- Failure-backed? - Can you point to a specific failure this prevents? If no, delete.
- Tool-enforceable? - Could a linter, formatter, or CI check enforce this? If yes, move it there, don't duplicate in AGENTS.md.
- Decision-encoding? - Does this encode a team decision that isn't obvious from the code? If no, delete.
- Triggerable? - Is this actionable at a specific moment, or is it generic advice? If generic, delete.
If a line fails all four, delete it.
Source: Wordman: Agent Instructions
What to DELETE
These almost never belong:
| Delete | Why |
|---|---|
| Folder structure descriptions | Agent can see it by reading the repo |
| Tech stack restatements | It's in package.json, go.mod, etc. |
| Linter-enforced style rules | "Use tabs" when .editorconfig says spaces - agent sees the config |
| Generic best practices | "Write clean code" - agent was trained on the internet |
| SOLID principles, DRY, etc. | Trained on these. Redundant. |
| API patterns visible in code | Agent can read existing implementations |
The Augment rule:
"Never send an LLM to do a linter's job."
What to KEEP
These belong in AGENTS.md:
| Keep | Example |
|---|---|
| Build/test/lint commands | make test, npm run build:prod |
| Deploy steps | How to deploy to staging, production |
| Environment setup | Dev environment gotchas, secrets management |
| Team conventions in heads | "We always use Result types for errors in Go services" |
| Architecture decisions | Why the trading engine is separate from the API layer |
| Known gotchas | "Don't touch the legacy pricing module - it's fragile" |
| Security requirements | "All new endpoints must use the auth middleware" |
| Performance constraints | "Trading API must respond in under 50ms p99" |
The Structure That Works
From the ETH Zurich study: successful AGENTS.md files follow a shallow hierarchy:
- Single H1 heading (treat as unified document)
- 6-7 H2 sections for major topics
- Some H3/H4 for detail
- Rarely deeper
Recommended structure:
```markdown
# Project Name AGENTS.md

## Build & Run
[Commands, scripts, environment setup]

## Test
[How to run tests, coverage requirements]

## Architecture
[High-level design, key components, why decisions were made]

## Conventions
[Team-specific patterns not visible in linter configs]

## Guardrails
[Things not to touch, security requirements, performance constraints]

## Deploy
[Staging, production steps, CI/CD pointers]
```

Keep it under 300 lines. Under 200 is better. Every line costs attention budget.
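The shallow-hierarchy profile can be checked mechanically. A minimal Go sketch - the warning thresholds are assumptions derived from the profile above, not from the study itself - that counts heading levels and flags deep nesting:

```go
package main

import (
	"fmt"
	"strings"
)

// headingProfile counts ATX-style Markdown headings ("# ", "## ", ...)
// by level, 1 through 6.
func headingProfile(doc string) map[int]int {
	counts := map[int]int{}
	for _, line := range strings.Split(doc, "\n") {
		t := strings.TrimSpace(line)
		level := 0
		for level < len(t) && t[level] == '#' {
			level++
		}
		// Only count "#... " headings with a space after the markers.
		if level >= 1 && level <= 6 && level < len(t) && t[level] == ' ' {
			counts[level]++
		}
	}
	return counts
}

func main() {
	doc := `# Project AGENTS.md
## Build & Run
## Test
## Architecture
### Key components
## Conventions
## Guardrails
## Deploy`
	p := headingProfile(doc)
	fmt.Printf("H1=%d H2=%d H3=%d\n", p[1], p[2], p[3]) // H1=1 H2=6 H3=1
	if p[1] != 1 {
		fmt.Println("WARN: expected exactly one H1")
	}
	if p[4]+p[5]+p[6] > 0 {
		fmt.Println("WARN: headings deeper than H3 - flatten the hierarchy")
	}
}
```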
Part 3: Managing AGENTS.md Across Hundreds of Repos
The Problem
You have 100+ repos. Maybe 200+. Each needs an AGENTS.md. But:
- Consistency problem: Different teams write different conventions
- Drift problem: Files rot, contradict each other, reference obsolete commands
- Maintenance problem: Who updates all 200 files when a convention changes?
- Discovery problem: How do you know what's in each AGENTS.md?
The Solution: Template Inheritance
Three-layer model:
```
┌─────────────────────────────────────────────────────────┐
│ Layer 1: ORG-AGENTS.md (global template)                │
│ - Organization-wide conventions                         │
│ - Security requirements                                 │
│ - Performance standards                                 │
│ - Tool versions, CI pointers                            │
│ - One file, maintained by platform team                 │
└─────────────────────────────────────────────────────────┘
                      │
                      │ imported by
                      ▼
┌─────────────────────────────────────────────────────────┐
│ Layer 2: AGENTS.md (repo-specific)                      │
│ - Build/test/deploy commands                            │
│ - Architecture overview                                 │
│ - Repo-specific conventions                             │
│ - Known gotchas                                         │
│ - One per repo, maintained by repo owner                │
└─────────────────────────────────────────────────────────┘
                      │
                      │ references
                      ▼
┌─────────────────────────────────────────────────────────┐
│ Layer 3: docs/ (detailed references)                    │
│ - Architecture decision records (ADRs)                  │
│ - API documentation                                     │
│ - Runbooks                                              │
│ - Detailed procedures                                   │
│ - Linked from AGENTS.md, not embedded                   │
└─────────────────────────────────────────────────────────┘
```
Layer 1: ORG-AGENTS.md (The Global Template)
What goes here:
- Organization-wide coding standards
- Security requirements (all endpoints must use auth middleware)
- Performance standards (all trading APIs must respond under 50ms)
- Approved tool versions (Go 1.24, Node 22, React 19)
- CI/CD pointers (all repos use GitHub Actions, here's the workflow)
- Conventional commit format
- PR review requirements
What does NOT go here:
- Repo-specific commands (each repo has different build)
- Repo-specific architecture (trading engine vs web frontend)
- Repo-specific gotchas
How it works:
Option A: Monorepo approach - One AGENTS.md at root, per-package sections
- Works if you're already monorepo
- Requires careful organization
Option B: Template inheritance - Each repo imports org template
- ORG-AGENTS.md lives in a template repo
- Each repo's AGENTS.md starts with: "See ORG-AGENTS.md for org-wide conventions. Repo-specific below:"
- Agent reads both
Option C: Centralized context server - Agents fetch org context at runtime
- ORG-AGENTS.md served via HTTP
- Agents configured to fetch at session start
- Works well with GCP Agentic Workloads and Claude/OpenAI SDKs
Recommendation for your stack (GCP Agentic Workloads):
Use Option C. Configure your Vertex AI agents to fetch org context from a central location:
Agent initialization:
1. Fetch https://internal.yourcompany.com/agents/org-context.md
2. Read repo's local AGENTS.md
3. Merge: org context first, repo context overlays
This way:
- One source of truth for org conventions
- Repo AGENTS.md files stay lean
- Updates propagate immediately
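The fetch-and-merge initialization described above can be sketched in Go. The URL is the placeholder from the text, and the `---` separator between the two layers is an assumption - any delimiter the agent can parse will do.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// mergeContext layers org-wide context under repo-specific context,
// mirroring the "org context first, repo context overlays" order.
func mergeContext(org, repo string) string {
	return org + "\n\n---\n\n" + repo
}

func main() {
	// Step 1: fetch the org template (placeholder internal URL).
	client := &http.Client{Timeout: 5 * time.Second}
	var org string
	resp, err := client.Get("https://internal.yourcompany.com/agents/org-context.md")
	if err == nil {
		defer resp.Body.Close()
		b, _ := io.ReadAll(resp.Body)
		org = string(b)
	} else {
		// Degrade gracefully: repo context alone still works.
		org = "(org context unavailable - proceeding with repo context only)"
	}

	// Step 2: read the repo's local AGENTS.md.
	repo, err := os.ReadFile("AGENTS.md")
	if err != nil {
		fmt.Fprintln(os.Stderr, "no repo AGENTS.md found")
	}

	// Step 3: merge and hand to the agent as its context.
	fmt.Println(mergeContext(org, string(repo)))
}
```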
Layer 2: AGENTS.md (The Repo-Specific File)
Template for each repo:
```markdown
# [repo-name] AGENTS.md

> Org-wide conventions: See [ORG-AGENTS.md](link). This file is repo-specific.

## Build & Run
[Specific commands for this repo]

## Test
[Specific test commands, coverage targets]

## Architecture
[This repo's architecture, key components]

## Conventions
[Repo-specific conventions that differ from or extend org conventions]

## Guardrails
[Things not to touch in this repo, security gotchas]

## Deploy
[Staging/production steps for this repo]

## Gotchas
[Known issues, legacy code to avoid, etc.]
```

Size target: 100-200 lines. If larger, split into docs/ and link.
Layer 3: docs/ (Detailed References)
What goes here:
- Architecture Decision Records (ADRs) - why decisions were made
- API documentation
- Runbooks for operations
- Detailed procedures that would bloat AGENTS.md
How to link:
```markdown
## Architecture

See [docs/architecture.md](docs/architecture.md) for the full architecture overview.

Key points:
- Trading engine is separate from API layer for latency reasons
- All state lives in Cloud Spanner
- Kafka for event streaming
```

Why this matters: AGENTS.md is the index. docs/ is the library. Agents are lazy - they'll read what's in front of them. Put the index in AGENTS.md, not the full documentation.
Part 4: Maintaining Consistency
The Drift Problem
From the ETH Zurich study:
- Agent context files evolve through additions, not deletions
- Median update interval: 22-70 hours (short bursts)
- Files grow over time, never shrink
Result: AGENTS.md files rot. They accumulate obsolete commands, reference deleted files, contradict newer conventions.
Solution: Lint Your AGENTS.md
What to check:
| Check | How | Why |
|---|---|---|
| Line count | Fail if > 300 lines | Force pruning |
| Word count | Warn if > 1000 words | Context window cost |
| File references | Check if referenced files exist | Commands may reference deleted files |
| Generic phrases | Flag "follow best practices", "be helpful" | Weak instruction, no value |
| Contradictions | Flag "use tabs" + "use spaces" in same file | Agents silently pick one |
| Required sections | Fail if missing Build, Test, Deploy | Essential context |
| Security section | Warn if missing | 14.5% have this - should be higher |
Tooling:
Option A: Vale (prose linter)
- Define patterns for weak phrases
- Configure severity (suggestion/warning/error)
- Run in CI
Option B: Custom script
- Shell script for structural checks
- Add to pre-commit hooks and CI
Option C: Existing linters
- markdownlint for structure
- Custom rules for AGENTS.md-specific checks
Example script:
```bash
#!/bin/bash
# AGENTS.md linter

FILE="AGENTS.md"
MAX_LINES=300

# Line count
line_count=$(wc -l < "$FILE" | tr -d ' ')
if [ "$line_count" -gt "$MAX_LINES" ]; then
  echo "ERROR: AGENTS.md is $line_count lines (max $MAX_LINES)"
  echo "Suggestion: move detailed procedures into docs/ and link from AGENTS.md"
  exit 1
fi

# Required sections
required=("Build" "Test" "Architecture" "Deploy")
for section in "${required[@]}"; do
  if ! grep -q "## .*$section" "$FILE"; then
    echo "ERROR: Missing required section: $section"
    exit 1
  fi
done

# Generic phrases (weak instruction)
if grep -qiE "(follow best practices|be helpful|be concise|write clean code)" "$FILE"; then
  echo "WARNING: Generic phrases detected. Replace with specific instructions."
fi

echo "PASS: AGENTS.md checks passed"
```

Run in CI:
- Add to GitHub Actions / Cloud Build
- Fail PR if AGENTS.md doesn't pass
- Prevents drift from entering
Source: DEV: Practical Linting for Agent Context Files
The Update Cadence
When to update AGENTS.md:
| Trigger | Who | What |
|---|---|---|
| New build command | Dev making change | Add to Build section |
| Architecture decision | Tech lead | Add to Architecture, create ADR |
| Incident caused by agent mistake | Anyone | Add guardrail to prevent recurrence |
| Security requirement | Security team | Update ORG-AGENTS.md |
| Quarterly audit | Platform team | Review all AGENTS.md for drift |
The rule: Every time an agent makes a mistake that a better instruction would have prevented, update AGENTS.md. If an instruction didn't prevent the mistake, delete or rewrite it.
Part 5: What to Generalize vs. Specialize
Generalize (ORG-AGENTS.md)
| Category | Example |
|---|---|
| Security requirements | All endpoints must use auth middleware |
| Performance standards | All trading APIs must respond under 50ms p99 |
| Coding conventions | Use conventional commits (feat/fix/refactor) |
| Tool versions | Go 1.24, Node 22, React 19 |
| CI/CD pointers | All repos use GitHub Actions |
| Review requirements | All PRs require 2 approvals |
| Testing standards | All repos must have >80% coverage |
| Documentation standards | All repos must have AGENTS.md |
Specialize (repo AGENTS.md)
| Category | Example |
|---|---|
| Build commands | make build, npm run build:prod |
| Test commands | make test, go test ./..., npm test |
| Architecture | Trading engine is separate from API layer |
| Repo-specific conventions | Use Result types for errors in this repo |
| Known gotchas | Don't touch legacy pricing module |
| Deploy steps | kubectl apply -f staging.yaml |
| Environment setup | Run scripts/setup-env.sh first |
The Test
Ask: "Does this apply to every repo in the org?"
- Yes → ORG-AGENTS.md
- No → repo AGENTS.md
- Depends → ORG-AGENTS.md with override capability in repo AGENTS.md
Part 6: Tool-Specific Notes
Claude Code
- Reads `CLAUDE.md` from the repo root
- Also reads `~/.claude/CLAUDE.md` for user-level context
- Reads recursively up the directory tree (you can have a `CLAUDE.md` in subdirectories)
- Priority: repo > parent directory > user home

For your stack: Create `CLAUDE.md` as a symlink to `AGENTS.md`, or maintain both if conventions differ.
OpenAI Codex
- Reads `AGENTS.md` from the repo root
- Official guidance: describe architecture, workflows, commands
- Works with GitHub Actions for CI integration

For your stack: `AGENTS.md` is the primary file. Codex will use it directly.
GCP Agentic Workloads / Vertex AI Agent Builder
- Can fetch context from external sources at runtime
- Configure agents to pull from central org context
- Supports multiple context files merged at inference
For your stack:
- Host ORG-AGENTS.md on internal HTTP endpoint
- Configure Vertex AI agents to fetch at session start
- Repo AGENTS.md files remain lean and repo-specific
Claude Agent SDK / OpenAI SDK
- Context passed programmatically
- Can inject org context before repo context
- Full control over context assembly
For your stack:
```python
# Pseudocode - SDK call shapes are illustrative
org_context = fetch("https://internal.yourcompany.com/agents/org-context.md")
repo_context = read_file("AGENTS.md")
full_context = org_context + "\n\n" + repo_context
agent.run(prompt, context=full_context)
```

Part 7: Action Plan
Week 1: Audit
- Find all existing AGENTS.md / CLAUDE.md files across repos
- Run the 4-question filter on each
- Identify common sections (candidates for org template)
- Identify contradictions between repos
- Create inventory of what's in each file
Week 2: Create ORG-AGENTS.md
- Draft org-wide template with platform team
- Include: security requirements, performance standards, tool versions, CI pointers
- Host on internal HTTP endpoint (for GCP Agentic Workloads)
- Configure Vertex AI agents to fetch at session start
Week 3: Update Repo AGENTS.md Files
- Prioritize: highest-traffic repos first
- Apply template: org context link + repo-specific sections
- Run through 4-question filter
- Target: 100-200 lines each
Week 4: Add Linting
- Create AGENTS.md linter script
- Add to CI (GitHub Actions / Cloud Build)
- Fail PRs if AGENTS.md doesn't pass
- Add to pre-commit hooks for local dev
Ongoing
- Quarterly audit of all AGENTS.md files
- Update ORG-AGENTS.md when conventions change
- Add guardrails when agent mistakes occur
- Delete instructions that don't prevent failures
Build Queue for Tango
1. AGENTS.md Linter
   - Input: AGENTS.md file path
   - Output: Pass/fail with specific issues
   - Checks: line count, required sections, generic phrases, file references
   - Tech: Golang CLI, runs in CI
2. AGENTS.md Generator
   - Input: Repo URL or local path
   - Output: Draft AGENTS.md extracted from codebase
   - Extract: build commands, test commands, architecture hints
   - Tech: Golang CLI, static analysis
3. ORG-AGENTS.md Server
   - Input: HTTP request from Vertex AI agents
   - Output: Current org-wide context
   - Features: versioning, audit log, update API
   - Tech: Golang HTTP server, GCP Cloud Run
4. AGENTS.md Dashboard
   - Input: GitHub org, scans all repos
   - Output: Inventory of AGENTS.md files, compliance status, drift detection
   - Tech: React frontend, Golang backend
Sources
- Agent READMEs: An Empirical Study (arXiv)
- Augment: Your Agent's Context Is a Junk Drawer
- Vercel: AGENTS.md Outperforms Skills
- Wordman: Agent Instructions
- DEV: Practical Linting for Agent Context Files
- GitHub: How to Write a Great AGENTS.md
- Harness: The Agent-Native Repo
- Medium: The Complete Guide to AI Agent Memory Files
Research by Cash | April 2026