Research · April 5, 2026 · 16 min read

AGENTS.md at Scale: Enterprise Guide for 180+ Engineers and Hundreds of Repos

Why most AGENTS.md files make agents worse, the 4-question filter for context files, and how to manage agent context across hundreds of repos

Tags: agentic-ai, agents-md, enterprise, context-management, claude-code, codex



Context: VP Engineering, 180 devs, React + Golang, GCP/K8s/Spanner/Kafka, 10+ deploys/day
Tools: Claude Code, Codex, OpenAI SDK, Claude Agent SDK, GCP Agentic Workloads


The Headlines

  1. AGENTS.md works - but badly written ones make agents worse. ETH Zurich research: context files reduced task success rates while making agents slower and more expensive - unless they're well-crafted.
  2. Most AGENTS.md files are junk drawers. Generic rules, folder structures the agent can see, linter-enforced style. Delete that noise.
  3. The 4-question filter: For every line, ask: (1) Failure-backed? (2) Tool-enforceable? (3) Decision-encoding? (4) Triggerable? If no to all, delete.
  4. Centralize what's common, specialize what's unique. Organization-wide template for conventions, repo-specific for architecture and commands.
  5. Lint your AGENTS.md files. They drift. They rot. They contradict each other. Automate quality checks.

Part 1: Why AGENTS.md Matters (The Research)

The ETH Zurich Study

In February 2026, ETH Zurich published a study of 2,303 agent context files across 1,925 repositories. The findings were stark:

"Context files reduce task success rates compared to providing no repository context, while increasing inference cost by over 20%."

Let that land. Bad context files make agents perform worse than nothing - and cost more.

But there's nuance. The study found that well-crafted context files improved performance by ~4%. The problem: most aren't well-crafted.

What they found:

| Metric | Finding |
| --- | --- |
| Median length | 335-535 words, depending on tool |
| Readability | "Very difficult" (FRE 16-40, academic/legal level) |
| Update frequency | 59-67% modified multiple times |
| Update interval | 22-70 hours (short bursts) |
| Deletions | Minimal (files grow, never shrink) |

Content analysis:

| Category | % of Files |
| --- | --- |
| Testing | 75.0% |
| Implementation Details | 69.9% |
| Architecture | 67.7% |
| Development Process | 63.3% |
| Build and Run | 62.3% |
| System Overview | 59.0% |
| Security | 14.5% |
| Performance | 14.5% |

The gap is obvious: teams optimize for making agents functional, but few provide guardrails for security or performance.

Source: Agent READMEs: An Empirical Study (arXiv)


The Augment Analysis

Augment analyzed AGENTS.md files across the ecosystem. Their diagnosis:

"Your AGENTS.md has how many instructions? More rules, worse output. Because we don't trust the agent."

The pattern: developers write 200-line AGENTS.md files explaining folder structure because they don't believe the agent can figure it out. But modern agents can see the codebase. They don't need you to explain what's already visible.

What agents can already see:

  • Code structure (file tree, imports, dependencies)
  • Tech stack (package.json, go.mod, Cargo.toml)
  • Existing patterns (by reading the code)
  • Git history
  • Linter configs

What agents can't see:

  • Build and test commands (unless documented)
  • Deploy steps
  • Team conventions that live in heads, not files
  • Why that weird architecture decision was made
  • Known gotchas

The mistake: spending AGENTS.md lines on the first category - explaining what the agent could discover by reading the repo - instead of the second.

Source: Augment: Your Agent's Context Is a Junk Drawer


The Vercel Evals

Vercel ran benchmarks on Next.js 16 API tasks. They compared two approaches:

  1. Skills (on-demand retrieval): Agent has access to docs, retrieves what it needs.
  2. AGENTS.md (passive context): Compressed docs index in a single file.

Result: Skills produced zero improvement. The agent never bothered to look at the docs.

Then they tried the "dumb" approach: compressed the entire docs index into an 8KB AGENTS.md file. Not full documentation - just an index pointing to retrievable files.

100% pass rate across build, lint, and test.

40KB compressed to 8KB. Perfect score. The dumb approach won.

Lesson: Agents are lazy. They won't retrieve unless you put it in their face. A well-structured index beats comprehensive documentation.

Source: Vercel: AGENTS.md Outperforms Skills
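
A compressed index of that sort is just pointers, not prose. It might look something like this (file names and topics are hypothetical, sketching the Vercel approach):

```markdown
## Docs Index
Retrieve these files before writing code:
- Routing (app router, layouts): docs/02-routing.md
- Data fetching and caching: docs/04-data-fetching.md
- Middleware (auth, rewrites): docs/07-middleware.md
```

Each line costs a few tokens but puts the retrieval path directly in the agent's face.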


Part 2: What Belongs in AGENTS.md (The Filter)

The 4-Question Test

From Jan-Niklas Wortmann's analysis - he went from 80+ lines of rules to 30 lines and got "dramatically better behavior":

For each line in your AGENTS.md, ask:

  1. Failure-backed? - Can you point to a specific failure this prevents? If no, delete.
  2. Tool-enforceable? - Could a linter, formatter, or CI check enforce this? If yes, move it there, don't duplicate in AGENTS.md.
  3. Decision-encoding? - Does this encode a team decision that isn't obvious from the code? If no, delete.
  4. Triggerable? - Is this actionable at a specific moment, or is it generic advice? If generic, delete.

If a line fails all four, delete it.

Source: Wordman: Agent Instructions
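
Applied to two typical lines, the filter looks like this (hypothetical examples, not from the analysis):

```markdown
<!-- Fails all four questions: delete -->
- Write clean, well-documented code.

<!-- Failure-backed and decision-encoding: keep -->
- Never call the pricing service directly; go through the cache layer
  (direct calls caused a rate-limit outage).
```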


What to DELETE

These almost never belong:

| Delete | Why |
| --- | --- |
| Folder structure descriptions | The agent can see it by reading the repo |
| Tech stack restatements | It's in package.json, go.mod, etc. |
| Linter-enforced style rules | "Use tabs" when .editorconfig says spaces - the agent sees the config |
| Generic best practices | "Write clean code" - the agent was trained on the internet |
| SOLID principles, DRY, etc. | Trained on these. Redundant. |
| API patterns visible in code | The agent can read existing implementations |

The Augment rule:

"Never send an LLM to do a linter's job."


What to KEEP

These belong in AGENTS.md:

| Keep | Example |
| --- | --- |
| Build/test/lint commands | `make test`, `npm run build:prod` |
| Deploy steps | How to deploy to staging and production |
| Environment setup | Dev environment gotchas, secrets management |
| Team conventions in heads | "We always use Result types for errors in Go services" |
| Architecture decisions | Why the trading engine is separate from the API layer |
| Known gotchas | "Don't touch the legacy pricing module - it's fragile" |
| Security requirements | "All new endpoints must use the auth middleware" |
| Performance constraints | "Trading API must respond in under 50ms p99" |

The Structure That Works

From the ETH Zurich study: successful AGENTS.md files follow a shallow hierarchy:

  • Single H1 heading (treat as unified document)
  • 6-7 H2 sections for major topics
  • Some H3/H4 for detail
  • Rarely deeper

Recommended structure:

```markdown
# Project Name AGENTS.md

## Build & Run
[Commands, scripts, environment setup]

## Test
[How to run tests, coverage requirements]

## Architecture
[High-level design, key components, why decisions were made]

## Conventions
[Team-specific patterns not visible in linter configs]

## Guardrails
[Things not to touch, security requirements, performance constraints]

## Deploy
[Staging, production steps, CI/CD pointers]
```

Keep it under 300 lines. Under 200 is better. Every line costs attention budget.


Part 3: Managing AGENTS.md Across Hundreds of Repos

The Problem

You have 100+ repos. Maybe 200+. Each needs an AGENTS.md. But:

  • Consistency problem: Different teams write different conventions
  • Drift problem: Files rot, contradict each other, reference obsolete commands
  • Maintenance problem: Who updates all 200 files when a convention changes?
  • Discovery problem: How do you know what's in each AGENTS.md?

The Solution: Template Inheritance

Three-layer model:

```
┌─────────────────────────────────────────────────────────┐
│ Layer 1: ORG-AGENTS.md (global template)                │
│ - Organization-wide conventions                         │
│ - Security requirements                                 │
│ - Performance standards                                 │
│ - Tool versions, CI pointers                            │
│ - One file, maintained by platform team                 │
└─────────────────────────────────────────────────────────┘
                           │
                           │ imported by
                           ▼
┌─────────────────────────────────────────────────────────┐
│ Layer 2: AGENTS.md (repo-specific)                      │
│ - Build/test/deploy commands                            │
│ - Architecture overview                                 │
│ - Repo-specific conventions                             │
│ - Known gotchas                                         │
│ - One per repo, maintained by repo owner                │
└─────────────────────────────────────────────────────────┘
                           │
                           │ references
                           ▼
┌─────────────────────────────────────────────────────────┐
│ Layer 3: docs/ (detailed references)                    │
│ - Architecture decision records (ADRs)                  │
│ - API documentation                                     │
│ - Runbooks                                              │
│ - Detailed procedures                                   │
│ - Linked from AGENTS.md, not embedded                   │
└─────────────────────────────────────────────────────────┘
```
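
On disk, the three layers might look like this (repo and file names are illustrative):

```
org-templates/                    # platform-team repo
└── ORG-AGENTS.md                 # Layer 1: org-wide conventions

payments-service/                 # one of the ~200 repos
├── AGENTS.md                     # Layer 2: lean, links up and down
└── docs/
    ├── architecture.md           # Layer 3: detailed references
    └── adr/0001-kafka-events.md
```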

Layer 1: ORG-AGENTS.md (The Global Template)

What goes here:

  • Organization-wide coding standards
  • Security requirements (all endpoints must use auth middleware)
  • Performance standards (all trading APIs must respond under 50ms)
  • Approved tool versions (Go 1.24, Node 22, React 19)
  • CI/CD pointers (all repos use GitHub Actions, here's the workflow)
  • Conventional commit format
  • PR review requirements

What does NOT go here:

  • Repo-specific commands (each repo has different build)
  • Repo-specific architecture (trading engine vs web frontend)
  • Repo-specific gotchas

How it works:

Option A: Monorepo approach - One AGENTS.md at root, per-package sections

  • Works if you already run a monorepo
  • Requires careful organization

Option B: Template inheritance - Each repo imports org template

  • ORG-AGENTS.md lives in a template repo
  • Each repo's AGENTS.md starts with: "See ORG-AGENTS.md for org-wide conventions. Repo-specific below:"
  • Agent reads both

Option C: Centralized context server - Agents fetch org context at runtime

  • ORG-AGENTS.md served via HTTP
  • Agents configured to fetch at session start
  • Works well with GCP Agentic Workloads and Claude/OpenAI SDKs

Recommendation for your stack (GCP Agentic Workloads):

Use Option C. Configure your Vertex AI agents to fetch org context from a central location:

Agent initialization:
1. Fetch https://internal.yourcompany.com/agents/org-context.md
2. Read repo's local AGENTS.md
3. Merge: org context first, repo context overlays

This way:

  • One source of truth for org conventions
  • Repo AGENTS.md files stay lean
  • Updates propagate immediately

Layer 2: AGENTS.md (The Repo-Specific File)

Template for each repo:

```markdown
# [repo-name] AGENTS.md

> Org-wide conventions: See [ORG-AGENTS.md](link). This file is repo-specific.

## Build & Run
[Specific commands for this repo]

## Test
[Specific test commands, coverage targets]

## Architecture
[This repo's architecture, key components]

## Conventions
[Repo-specific conventions that differ from or extend org conventions]

## Guardrails
[Things not to touch in this repo, security gotchas]

## Deploy
[Staging/production steps for this repo]

## Gotchas
[Known issues, legacy code to avoid, etc.]
```

Size target: 100-200 lines. If larger, split into docs/ and link.


Layer 3: docs/ (Detailed References)

What goes here:

  • Architecture Decision Records (ADRs) - why decisions were made
  • API documentation
  • Runbooks for operations
  • Detailed procedures that would bloat AGENTS.md

How to link:

```markdown
## Architecture
See [docs/architecture.md](docs/architecture.md) for full architecture overview.
Key points:
- Trading engine is separate from API layer for latency reasons
- All state lives in Cloud Spanner
- Kafka for event streaming
```

Why this matters: AGENTS.md is the index. docs/ is the library. Agents are lazy - they'll read what's in front of them. Put the index in AGENTS.md, not the full documentation.


Part 4: Maintaining Consistency

The Drift Problem

From the ETH Zurich study:

  • Agent context files evolve through additions, not deletions
  • Median update interval: 22-70 hours (short bursts)
  • Files grow over time, never shrink

Result: AGENTS.md files rot. They accumulate obsolete commands, reference deleted files, contradict newer conventions.

Solution: Lint Your AGENTS.md

What to check:

| Check | How | Why |
| --- | --- | --- |
| Line count | Fail if > 300 lines | Force pruning |
| Word count | Warn if > 1000 words | Context window cost |
| File references | Check that referenced files exist | Commands may reference deleted files |
| Generic phrases | Flag "follow best practices", "be helpful" | Weak instruction, no value |
| Contradictions | Flag "use tabs" + "use spaces" in the same file | Agents silently pick one |
| Required sections | Fail if missing Build, Test, Deploy | Essential context |
| Security section | Warn if missing | Only 14.5% of files have one - should be higher |

Tooling:

Option A: Vale (prose linter)

  • Define patterns for weak phrases
  • Configure severity (suggestion/warning/error)
  • Run in CI

Option B: Custom script

  • Shell script for structural checks
  • Add to pre-commit hooks and CI

Option C: Existing linters

  • markdownlint for structure
  • Custom rules for AGENTS.md-specific checks

Example script:

```bash
#!/bin/bash
# AGENTS.md linter: size, required sections, and weak phrasing
set -u

FILE="AGENTS.md"
MAX_LINES=300

if [ ! -f "$FILE" ]; then
  echo "ERROR: $FILE not found"
  exit 1
fi

# Line count
line_count=$(wc -l < "$FILE" | tr -d ' ')
if [ "$line_count" -gt "$MAX_LINES" ]; then
  echo "ERROR: AGENTS.md is $line_count lines (max $MAX_LINES)"
  echo "Suggestion: move detailed procedures into docs/ and link from AGENTS.md"
  exit 1
fi

# Required sections (anchored to H2 headings)
required=("Build" "Test" "Architecture" "Deploy")
for section in "${required[@]}"; do
  if ! grep -q "^## .*$section" "$FILE"; then
    echo "ERROR: Missing required section: $section"
    exit 1
  fi
done

# Generic phrases (weak instruction; warns but does not fail)
if grep -qiE "(follow best practices|be helpful|be concise|write clean code)" "$FILE"; then
  echo "WARNING: Generic phrases detected. Replace with specific instructions."
fi

echo "PASS: AGENTS.md checks passed"
```
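
The shell script covers size, sections, and phrasing. The file-reference and contradiction checks from the table above are easier in a short Python sketch (regex, function names, and the tab/space heuristic are illustrative, not a published tool):

```python
# Complements the shell linter: stale links and tab/space contradictions.
import re
from pathlib import Path

def broken_references(text: str, repo_root: str = ".") -> list[str]:
    """Return markdown-linked local paths that no longer exist."""
    links = re.findall(r"\]\(([^)#\s]+)\)", text)
    local = [l for l in links if not l.startswith(("http://", "https://"))]
    return [l for l in local if not (Path(repo_root) / l).exists()]

def has_indent_contradiction(text: str) -> bool:
    """Flag a file that demands both tabs and spaces."""
    lowered = text.lower()
    return "use tabs" in lowered and "use spaces" in lowered
```

Run it over every repo's AGENTS.md in the same CI job as the shell checks.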

Run in CI:

  • Add to GitHub Actions / Cloud Build
  • Fail PR if AGENTS.md doesn't pass
  • Prevents drift from entering

Source: DEV: Practical Linting for Agent Context Files


The Update Cadence

When to update AGENTS.md:

| Trigger | Who | What |
| --- | --- | --- |
| New build command | Dev making the change | Add to Build section |
| Architecture decision | Tech lead | Add to Architecture, create an ADR |
| Incident caused by an agent mistake | Anyone | Add a guardrail to prevent recurrence |
| Security requirement | Security team | Update ORG-AGENTS.md |
| Quarterly audit | Platform team | Review all AGENTS.md files for drift |

The rule: Every time an agent makes a mistake that a better instruction would have prevented, update AGENTS.md. If an instruction didn't prevent the mistake, delete or rewrite it.


Part 5: What to Generalize vs. Specialize

Generalize (ORG-AGENTS.md)

| Category | Example |
| --- | --- |
| Security requirements | All endpoints must use the auth middleware |
| Performance standards | All trading APIs must respond under 50ms p99 |
| Coding conventions | Use conventional commits (feat/fix/refactor) |
| Tool versions | Go 1.24, Node 22, React 19 |
| CI/CD pointers | All repos use GitHub Actions |
| Review requirements | All PRs require 2 approvals |
| Testing standards | All repos must have >80% coverage |
| Documentation standards | All repos must have an AGENTS.md |

Specialize (repo AGENTS.md)

| Category | Example |
| --- | --- |
| Build commands | `make build`, `npm run build:prod` |
| Test commands | `make test`, `go test ./...`, `npm test` |
| Architecture | Trading engine is separate from the API layer |
| Repo-specific conventions | Use Result types for errors in this repo |
| Known gotchas | Don't touch the legacy pricing module |
| Deploy steps | `kubectl apply -f staging.yaml` |
| Environment setup | Run `scripts/setup-env.sh` first |

The Test

Ask: "Does this apply to every repo in the org?"

  • Yes → ORG-AGENTS.md
  • No → repo AGENTS.md
  • Depends → ORG-AGENTS.md with override capability in repo AGENTS.md

Part 6: Tool-Specific Notes

Claude Code

  • Reads CLAUDE.md from repo root
  • Also reads ~/.claude/CLAUDE.md for user-level context
  • Reads recursively up directory tree (can have CLAUDE.md in subdirectories)
  • Priority: repo > parent directory > user home

For your stack: Create CLAUDE.md as a symlink to AGENTS.md, or maintain both if conventions differ.
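
If the two files should stay identical, the symlink is one command, run from the repo root:

```shell
# CLAUDE.md becomes an alias of AGENTS.md; -f replaces any existing link
ln -sf AGENTS.md CLAUDE.md
```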

OpenAI Codex

  • Reads AGENTS.md from repo root
  • Official guidance: describe architecture, workflows, commands
  • Works with GitHub Actions for CI integration

For your stack: AGENTS.md is the primary file. Codex will use it directly.

GCP Agentic Workloads / Vertex AI Agent Builder

  • Can fetch context from external sources at runtime
  • Configure agents to pull from central org context
  • Supports multiple context files merged at inference

For your stack:

  1. Host ORG-AGENTS.md on internal HTTP endpoint
  2. Configure Vertex AI agents to fetch at session start
  3. Repo AGENTS.md files remain lean and repo-specific

Claude Agent SDK / OpenAI SDK

  • Context passed programmatically
  • Can inject org context before repo context
  • Full control over context assembly

For your stack:

```python
# Sketch: assemble org + repo context before invoking an agent.
# The URL is internal/hypothetical; agent.run stands in for your SDK's call.
import urllib.request
from pathlib import Path

with urllib.request.urlopen("https://internal.yourcompany.com/agents/org-context.md") as resp:
    org_context = resp.read().decode("utf-8")

repo_context = Path("AGENTS.md").read_text(encoding="utf-8")
full_context = org_context + "\n\n" + repo_context

agent.run(prompt, context=full_context)  # placeholder for the actual SDK call
```

Part 7: Action Plan

Week 1: Audit

  • Find all existing AGENTS.md / CLAUDE.md files across repos
  • Run the 4-question filter on each
  • Identify common sections (candidates for org template)
  • Identify contradictions between repos
  • Create inventory of what's in each file

Week 2: Create ORG-AGENTS.md

  • Draft org-wide template with platform team
  • Include: security requirements, performance standards, tool versions, CI pointers
  • Host on internal HTTP endpoint (for GCP Agentic Workloads)
  • Configure Vertex AI agents to fetch at session start

Week 3: Update Repo AGENTS.md Files

  • Prioritize: highest-traffic repos first
  • Apply template: org context link + repo-specific sections
  • Run through 4-question filter
  • Target: 100-200 lines each

Week 4: Add Linting

  • Create AGENTS.md linter script
  • Add to CI (GitHub Actions / Cloud Build)
  • Fail PRs if AGENTS.md doesn't pass
  • Add to pre-commit hooks for local dev
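
Wired into GitHub Actions, the check could look like this (workflow name and script path are assumptions, not an existing setup):

```yaml
# .github/workflows/agents-md-lint.yml
name: agents-md-lint
on:
  pull_request:
    paths: ["AGENTS.md", "docs/**"]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: bash scripts/lint-agents-md.sh
```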

Ongoing

  • Quarterly audit of all AGENTS.md files
  • Update ORG-AGENTS.md when conventions change
  • Add guardrails when agent mistakes occur
  • Delete instructions that don't prevent failures

Build Queue for Tango

  1. AGENTS.md Linter

    • Input: AGENTS.md file path
    • Output: Pass/fail with specific issues
    • Checks: line count, required sections, generic phrases, file references
    • Tech: Golang CLI, runs in CI
  2. AGENTS.md Generator

    • Input: Repo URL or local path
    • Output: Draft AGENTS.md extracted from codebase
    • Extract: build commands, test commands, architecture hints
    • Tech: Golang CLI, static analysis
  3. ORG-AGENTS.md Server

    • Input: HTTP request from Vertex AI agents
    • Output: Current org-wide context
    • Features: versioning, audit log, update API
    • Tech: Golang HTTP server, GCP Cloud Run
  4. AGENTS.md Dashboard

    • Input: GitHub org, scans all repos
    • Output: Inventory of AGENTS.md files, compliance status, drift detection
    • Tech: React frontend, Golang backend



Research by Cash | April 2026