
Documentation Rots. Here's How to Stop It.

A deep dive into why documentation drifts from code, the patterns that cause it, and the automation strategies that actually work to fix it.

Faizan Khan
12 min read

TL;DR

Every codebase I've ever joined has had the same problem: the docs lie. Not maliciously. Nobody sat down and decided to deceive future developers. But somewhere between launch day and now, the code evolved and the docs didn't. The README still references a config file that was renamed six months ago. The API docs describe endpoints that no longer exist. The "Getting Started" guide requires a dependency that was deprecated in Node 18.

This isn't a people problem. It's a systems problem. And I want to talk about how to actually fix it.

The Documentation Drift Problem

Let's be honest about what happens in real engineering teams.

You're shipping a feature. The deadline was yesterday. You've got the PR up, tests are green, and you're about to merge. Somewhere in the back of your mind, you know the docs need updating. But the docs live in a different repo. Or a wiki. Or a Notion page you haven't touched in months.

"I'll update it after the deploy," you tell yourself.

You won't. I won't. Nobody does. And that's okay, because the system is designed to fail.

Here's the thing: documentation and code are two separate artifacts with two separate lifecycles. Code gets reviewed, tested, deployed. Docs... exist. Maybe someone looks at them when onboarding new hires. Maybe.

This separation is the root cause of documentation rot. And it's been the default model for decades.

Measuring the Damage

Before we talk solutions, let's talk about what documentation drift actually costs. Because if you can't measure it, you can't prioritize fixing it.

The Support Channel Heuristic

Go look at your team's Slack. Search for questions that start with "How do I..." or "Where is..." or "Does anyone know...".

Count them. Categorize them. I'd bet that at least 40% of those questions have answers that should be in your docs but aren't, or are answered incorrectly by outdated docs.

Here's a quick script to get a sense of the damage in a public Discord or Slack export:

Python
import json

def analyze_support_questions(messages_json):
    """
    Analyze exported Slack/Discord messages for documentation gaps.
    Looks for question patterns that indicate missing or outdated docs.
    """
    question_patterns = [
        "how do i", "how to", "where is", "where can i find",
        "does anyone know", "is there a way to", "what's the",
        "documentation says", "docs say", "according to the docs"
    ]

    questions = []
    doc_complaints = []

    with open(messages_json, 'r') as f:
        messages = json.load(f)

    for msg in messages:
        text = msg.get('text', '').lower()

        # Find questions
        for pattern in question_patterns[:7]:
            if pattern in text:
                questions.append(msg)
                break

        # Find doc-specific complaints
        for pattern in question_patterns[7:]:
            if pattern in text:
                doc_complaints.append(msg)
                break

    print(f"Total messages analyzed: {len(messages)}")
    print(f"Questions found: {len(questions)}")
    print(f"Doc-related complaints: {len(doc_complaints)}")
    print(f"Question rate: {len(questions)/len(messages)*100:.1f}%")

    return questions, doc_complaints

# Example usage:
# questions, complaints = analyze_support_questions('slack_export.json')

This is crude, but it works. When I ran this against a mid-sized open source project's Discord, 23% of all messages were questions, and about a third of those referenced documentation being wrong or missing.

The Onboarding Time Test

Track how long it takes a new developer to make their first meaningful commit. Not a typo fix. An actual feature or bug fix.

In teams with accurate, current documentation, this is typically 1-3 days. In teams with documentation debt, it's often 1-2 weeks. That's not because new hires are slow. It's because they spend their first week asking questions, reading stale docs, hitting dead ends, and reverse-engineering things from the code.

At a $150k/year salary, a week of lost productivity is roughly $3,000. Per hire. Every time.

The Trust Decay Function

This one's harder to measure but critically important: once developers learn that docs can't be trusted, they stop reading them entirely.

I've seen this in my own behavior. After getting burned a few times by outdated examples, I now default to reading source code instead of documentation. I grep through the codebase, find usage examples, and figure it out myself. That works, but it's slower than docs should be.

And for external developers using your API? They don't have access to your source code. They just... leave.

Why the Traditional Solutions Fail

Teams have been trying to solve documentation drift for years. Here are the common approaches and why they don't work:

"We'll make it part of the PR checklist"

Sure. Add a checkbox that says "Updated documentation if needed."

Developers will check it. They'll check it even when they didn't update the docs. They'll check it because the PR is blocking and they need to ship. The checkbox becomes meaningless within a month.

I've worked at companies with elaborate PR templates, mandatory documentation fields, and automated checks for doc updates. The result? Developers learned to write "N/A" or "No doc changes needed" in the fields. The system was gamed immediately.

"We'll do quarterly documentation audits"

This sounds reasonable until you realize what it means in practice.

Someone (usually a tech writer or a senior engineer who drew the short straw) spends a week going through every doc page and checking it against the current codebase. They find dozens of issues. They file tickets. Some get fixed. Most don't, because there's always something more urgent.

Three months later, you do it again. The same issues are still there, plus new ones.

Audits are reactive. By the time you find the problem, it's already frustrated users and cost support hours.

"We'll use a docs-as-code approach"

This is closer to the right answer, but it's not enough on its own.

Docs-as-code means treating documentation like source code: version control, code review, CI/CD. It's a huge improvement over wikis and Google Docs because it puts docs in the same workflow as code.

But it still requires humans to remember to update the docs. And humans forget. Always.

The Automation Approach That Actually Works

Here's the insight that changed how I think about documentation: documentation should be derived from code, not maintained alongside it.

Think about it. Your code is the source of truth. It defines what your system actually does. Documentation is just a human-readable explanation of that truth.

So why are we writing docs by hand and hoping they stay synchronized? Why aren't we generating them from the source of truth?

What Can Be Automated

Let me be specific about what automation can and can't do today.

Fully automatable:

  • API reference docs (endpoints, parameters, response shapes)
  • Type definitions and interfaces
  • Function signatures and their documentation
  • Configuration file schemas
  • Database schema documentation
  • Dependency lists and versions
  • Code examples extracted from test files

Partially automatable (needs human review):

  • Conceptual explanations ("what is this and why does it exist")
  • Architecture overviews
  • Getting started guides
  • Tutorials and walkthroughs

Not automatable (and that's fine):

  • Strategic decisions ("why we chose X over Y")
  • Best practices and recommendations
  • Troubleshooting guides based on support experience

The key insight is that the fully automatable stuff is exactly the stuff that rots fastest. API parameters change. Config options get added. Types evolve. This is the high-velocity documentation that humans can't keep up with.

Building an Automation Pipeline

Here's a practical architecture for automated documentation that you can implement today, regardless of what tools you use:

YAML
# Example GitHub Action for doc automation
name: Documentation Sync

on:
  push:
    branches: [main]
    paths:
      - 'src/**'
      - 'api/**'
      - 'config/**'

jobs:
  generate-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Extract API Schema
        run: |
          # Generate OpenAPI spec from code annotations
          npx openapi-generator generate \
            -i ./src/api \
            -o ./docs/api-reference \
            -g markdown

      - name: Extract TypeScript Types
        run: |
          # Generate type documentation
          npx typedoc \
            --out ./docs/types \
            --readme none \
            ./src/types

      - name: Extract Config Schema
        run: |
          # Generate config documentation from JSON schema
          npx json-schema-to-markdown \
            ./config/schema.json \
            > ./docs/configuration.md

      - name: Generate Code Examples
        run: |
          # Extract examples from test files
          node ./scripts/extract-examples.js \
            --source ./tests \
            --output ./docs/examples

      - name: Commit Updated Docs
        run: |
          git config user.name "Documentation Bot"
          git config user.email "docs@example.com"
          git add docs/
          git diff --staged --quiet || git commit -m "docs: auto-update from code changes"
          git push
This is a simplified version, but it illustrates the pattern: code changes trigger documentation regeneration, and the updated docs are committed automatically.
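
One step in that workflow, ./scripts/extract-examples.js, is not an off-the-shelf tool; it's a small script you write against your own test conventions. Here's a minimal sketch of what it could look like (written in TypeScript for readability; the marker comments and file layout are assumptions, not an existing library): it pulls snippets out of test files that are tagged with "// docs:example <name>" and "// docs:end" comments.

TypeScript
import * as fs from 'fs';
import * as path from 'path';

// Pull tagged snippets out of test files so that every documented example is,
// by construction, code that actually runs in CI.
// Assumed convention: a snippet starts at "// docs:example <name>" and ends at "// docs:end".
function extractExamples(testDir: string, outputDir: string): void {
  fs.mkdirSync(outputDir, { recursive: true });

  // Non-recursive for brevity; a real version would walk subdirectories.
  for (const file of fs.readdirSync(testDir)) {
    if (!file.endsWith('.test.ts')) continue;

    const lines = fs.readFileSync(path.join(testDir, file), 'utf-8').split('\n');
    let current: { name: string; body: string[] } | null = null;

    for (const line of lines) {
      const start = line.match(/\/\/ docs:example (\S+)/);
      if (start) {
        current = { name: start[1], body: [] };
      } else if (current && line.includes('// docs:end')) {
        // Emit each snippet as a markdown fragment the docs site can include
        fs.writeFileSync(
          path.join(outputDir, `${current.name}.md`),
          '```typescript\n' + current.body.join('\n') + '\n```\n'
        );
        current = null;
      } else if (current) {
        current.body.push(line);
      }
    }
  }
}

// Matches the flags used in the workflow step above
function flag(name: string, fallback: string): string {
  const i = process.argv.indexOf(name);
  return i !== -1 && process.argv[i + 1] ? process.argv[i + 1] : fallback;
}

extractExamples(flag('--source', './tests'), flag('--output', './docs/examples'));

Because the snippets come from tests, an example that stops compiling breaks CI instead of quietly rotting in the docs.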

The AST Parsing Approach

For more sophisticated documentation extraction, you need to understand the code's structure, not just its surface syntax. This means working with Abstract Syntax Trees (ASTs).

Here's a Node.js example that extracts function documentation from TypeScript:

TypeScript
import * as ts from 'typescript';
import * as fs from 'fs';

interface FunctionDoc {
  name: string;
  description: string;
  parameters: { name: string; type: string; description: string }[];
  returnType: string;
  examples: string[];
}

function extractFunctionDocs(filePath: string): FunctionDoc[] {
  const sourceCode = fs.readFileSync(filePath, 'utf-8');
  const sourceFile = ts.createSourceFile(
    filePath,
    sourceCode,
    ts.ScriptTarget.Latest,
    true
  );

  const docs: FunctionDoc[] = [];

  function visit(node: ts.Node) {
    if (ts.isFunctionDeclaration(node) && node.name) {
      const doc = extractDocFromFunction(node, sourceFile);
      if (doc) docs.push(doc);
    }
    ts.forEachChild(node, visit);
  }

  visit(sourceFile);
  return docs;
}

function extractDocFromFunction(
  node: ts.FunctionDeclaration,
  sourceFile: ts.SourceFile
): FunctionDoc | null {
  const name = node.name?.getText(sourceFile) || 'anonymous';

  // Extract JSDoc comments
  const jsDocTags = ts.getJSDocTags(node);
  const description = ts.getJSDocCommentsAndTags(node)
    .filter(ts.isJSDoc)
    .map(doc => doc.comment)
    .filter(Boolean)
    .join('\n');

  // Extract parameters
  const parameters = node.parameters.map(param => ({
    name: param.name.getText(sourceFile),
    type: param.type?.getText(sourceFile) || 'any',
    description: getParamDescription(param, jsDocTags)
  }));

  // Extract return type
  const returnType = node.type?.getText(sourceFile) || 'void';

  // Extract @example tags
  const examples = jsDocTags
    .filter(tag => tag.tagName.getText(sourceFile) === 'example')
    .map(tag => tag.comment?.toString() || '');

  return { name, description, parameters, returnType, examples };
}

function getParamDescription(
  param: ts.ParameterDeclaration,
  tags: readonly ts.JSDocTag[]
): string {
  const paramTag = tags.find(
    tag => tag.tagName.getText() === 'param' &&
      tag.comment?.toString().startsWith(param.name.getText())
  );
  return paramTag?.comment?.toString().split(' ').slice(1).join(' ') || '';
}

// Generate markdown from extracted docs
function generateMarkdown(docs: FunctionDoc[]): string {
  return docs.map(doc => `
## \`${doc.name}\`

${doc.description}

### Parameters

${doc.parameters.length ? doc.parameters.map(p =>
  `- \`${p.name}\` (${p.type}): ${p.description}`
).join('\n') : 'None'}

### Returns

\`${doc.returnType}\`

${doc.examples.length ? `### Examples\n\n\`\`\`typescript\n${doc.examples.join('\n')}\n\`\`\`` : ''}
`).join('\n---\n');
}

// Usage
const docs = extractFunctionDocs('./src/api/handlers.ts');
const markdown = generateMarkdown(docs);
fs.writeFileSync('./docs/api-reference.md', markdown);

This gives you complete control over what gets documented and how. The extraction logic runs on every commit, ensuring docs stay synchronized with code.

Handling Documentation That Can't Be Automated

For conceptual documentation that requires human input, the best approach is a hybrid model:

  1. Generate scaffolding automatically. When a new file or module is created, generate a documentation stub with the basics filled in (function signatures, types, etc.) and TODO markers for the parts that need human explanation; a sketch of this follows the list.
  2. Track documentation coverage. Just like code coverage, measure what percentage of your public APIs have human-written descriptions. Set a threshold and fail CI if it drops too low.
  3. Use LLMs for first drafts. Modern language models can generate decent first-draft documentation from code context. They're not perfect, but they're a lot faster than writing from scratch. Have humans review and refine.
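
To make the first point concrete, here's a rough sketch of a scaffolding generator for a TypeScript codebase: it pulls exported function signatures out of a module and emits a markdown stub with TODO markers where a human needs to explain intent. The file paths, output layout, and TODO convention are illustrative, not a specific tool.

TypeScript
import * as ts from 'typescript';
import * as fs from 'fs';
import * as path from 'path';

// Build a documentation stub for a module: signatures come from the code,
// TODO markers flag the parts only a human can write.
function generateDocStub(modulePath: string): string {
  const source = ts.createSourceFile(
    modulePath,
    fs.readFileSync(modulePath, 'utf-8'),
    ts.ScriptTarget.Latest,
    true
  );

  const signatures: string[] = [];
  source.forEachChild(node => {
    if (!ts.isFunctionDeclaration(node) || !node.name) return;
    const exported = ts
      .getModifiers(node)
      ?.some(m => m.kind === ts.SyntaxKind.ExportKeyword);
    if (!exported) return;

    // Keep only the signature: everything before the function body
    const text = node.getText(source);
    const bodyStart = text.indexOf('{');
    signatures.push((bodyStart === -1 ? text : text.slice(0, bodyStart)).trim());
  });

  const moduleName = path.basename(modulePath, path.extname(modulePath));
  return [
    `# ${moduleName}`,
    '',
    '<!-- TODO(human): what is this module for, and why does it exist? -->',
    '',
    '## Exported functions',
    '',
    ...signatures.map(
      sig => `### ${sig}\n\n<!-- TODO(human): when should someone reach for this? -->\n`
    ),
  ].join('\n');
}

// Example: generate a stub for a module that has no doc page yet
// fs.writeFileSync('./docs/payments.md', generateDocStub('./src/payments.ts'));

Run it from whatever creates the module (a code generator, or a CI check that notices new files) so the stub exists before anyone has to remember to write it.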

Here's a simple documentation coverage script:

Python
#!/usr/bin/env python3
"""
Check documentation coverage for public APIs.
Fails if coverage drops below threshold.
"""

import ast
import sys
from pathlib import Path

def get_public_functions(filepath: Path) -> list[str]:
    """Extract public function names (not starting with _)."""
    with open(filepath) as f:
        tree = ast.parse(f.read())

    functions = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            if not node.name.startswith('_'):
                functions.append(node.name)
    return functions

def has_docstring(filepath: Path, func_name: str) -> bool:
    """Check if function has a non-empty docstring."""
    with open(filepath) as f:
        tree = ast.parse(f.read())

    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            docstring = ast.get_docstring(node)
            return bool(docstring and len(docstring.strip()) > 10)
    return False

def check_coverage(src_dir: str, threshold: float = 0.8) -> bool:
    """Check if documentation coverage meets threshold."""
    src_path = Path(src_dir)

    total_functions = 0
    documented_functions = 0

    for py_file in src_path.rglob('*.py'):
        if '__pycache__' in str(py_file):
            continue

        functions = get_public_functions(py_file)
        for func in functions:
            total_functions += 1
            if has_docstring(py_file, func):
                documented_functions += 1
            else:
                print(f"Missing docstring: {py_file}:{func}")

    if total_functions == 0:
        print("No public functions found")
        return True

    coverage = documented_functions / total_functions
    print(f"\nDocumentation coverage: {coverage:.1%}")
    print(f"Threshold: {threshold:.1%}")

    if coverage < threshold:
        print("FAIL: Coverage below threshold")
        return False

    print("PASS: Coverage meets threshold")
    return True

if __name__ == '__main__':
    src_dir = sys.argv[1] if len(sys.argv) > 1 else './src'
    threshold = float(sys.argv[2]) if len(sys.argv) > 2 else 0.8

    success = check_coverage(src_dir, threshold)
    sys.exit(0 if success else 1)

The Practical Playbook

Okay, enough theory. Here's what you should actually do, in order:

Week 1: Measure the Problem

Before you fix anything, quantify the damage. Run the support channel analysis. Track onboarding times. Survey your team about documentation pain points.

You need this data to justify the investment in automation, and to measure improvement later.

Week 2: Set Up CI for Reference Docs

Start with the easy wins: API references, type documentation, and config schemas. These can be fully automated with existing tools:

  • OpenAPI/Swagger for REST APIs
  • TypeDoc for TypeScript
  • Sphinx autodoc for Python
  • Javadoc for Java
  • rustdoc for Rust

Wire these into your CI pipeline so they run on every merge to main.

Week 3: Implement Documentation Coverage Checks

Add a coverage check that fails CI if public APIs lack documentation. Start with a low threshold (50%) and ratchet it up over time.

This creates gentle pressure to document new code without blocking existing work.
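
If you want the ratchet to be automatic rather than a number someone edits by hand, one option is to store the threshold in a small file and raise it whenever measured coverage beats it, so the bar can rise but never fall. Here's a sketch, assuming your coverage check writes its result to a coverage.json file (the file name and shape are assumptions):

TypeScript
import * as fs from 'fs';

// Ratcheting doc-coverage gate: the threshold can rise, but never fall.
// Assumes the coverage check wrote {"coverage": 0.73} to coverage.json (hypothetical file).
const THRESHOLD_FILE = 'docs-coverage-threshold.json';

const { coverage } = JSON.parse(fs.readFileSync('coverage.json', 'utf-8')) as { coverage: number };

const threshold: number = fs.existsSync(THRESHOLD_FILE)
  ? JSON.parse(fs.readFileSync(THRESHOLD_FILE, 'utf-8')).threshold
  : 0.5; // starting point: 50%

if (coverage < threshold) {
  console.error(`FAIL: coverage ${(coverage * 100).toFixed(1)}% is below the ratcheted threshold ${(threshold * 100).toFixed(1)}%`);
  process.exit(1);
}

if (coverage > threshold) {
  // The bar only moves up: store the improved value and commit it with the PR.
  fs.writeFileSync(THRESHOLD_FILE, JSON.stringify({ threshold: coverage }, null, 2) + '\n');
  console.log(`Coverage improved to ${(coverage * 100).toFixed(1)}%; threshold ratcheted up.`);
}

Commit the threshold file alongside the code so the new floor applies to everyone's next PR.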

Week 4: Add Documentation Staleness Detection

Write a script that compares the last-modified date of documentation files against the code they document. Flag any docs that haven't been updated in 90+ days when their corresponding code has changed.

Bash
#!/bin/bash
# Find stale documentation

for doc in docs/*.md; do
  # Extract the source file this doc covers
  # (Assumes a naming convention like docs/api.md -> src/api/)
  source_dir="src/$(basename "$doc" .md)"

  if [ -d "$source_dir" ]; then
    # Note: stat -f %m is macOS/BSD syntax; on GNU/Linux use stat -c %Y instead
    doc_modified=$(stat -f %m "$doc")
    source_modified=$(find "$source_dir" -type f -name "*.ts" -exec stat -f %m {} \; | sort -rn | head -1)

    if [ "$source_modified" -gt "$doc_modified" ]; then
      days_stale=$(( (source_modified - doc_modified) / 86400 ))
      if [ "$days_stale" -gt 90 ]; then
        echo "STALE ($days_stale days): $doc"
      fi
    fi
  fi
done

Ongoing: Review and Improve

Automation isn't set-and-forget. Review the generated docs periodically. Improve your extraction logic. Add new automation as you find patterns in what rots fastest.

What I Got Wrong

I want to be honest about the limitations here.

Early on, I thought automation could replace human documentation entirely. It can't. Automated docs are accurate, but they're often not good. They lack context, narrative, and the kind of insight that comes from understanding why something was built, not just what it does.

The goal isn't to eliminate human documentation work. It's to eliminate the maintenance work, the endless treadmill of keeping things synchronized, so humans can focus on the high-value explanatory writing that automation can't do.

I also underestimated how much organizational buy-in matters. The best automation pipeline in the world won't help if developers don't trust it. Roll things out gradually. Start with low-stakes documentation. Build confidence before automating your most critical docs.

The Path Forward

Documentation drift is a systems problem, and it requires a systems solution. Manual processes, no matter how well-intentioned, will always break down under deadline pressure.

The good news is that the tools for automation are mature and accessible. AST parsing, OpenAPI generation, CI pipelines... none of this is cutting-edge technology. It's just underutilized.

If you take one thing from this post, let it be this: stop trying to discipline humans into updating docs, and start building systems that make outdated docs impossible.

Your future self, reading accurate documentation six months from now, will thank you.


If you're building documentation automation and want to share what's working (or not working), I'd love to hear about it. The techniques in this post are things I've learned from shipping real systems, and I'm always looking to learn more.