# llms.txt Isn't Enough
llms.txt solves discovery. Content negotiation solves consumption. One of these matters 27x more than the other.
## TL;DR
Someone pastes a docs link into Claude Code, Cursor, or even ChatGPT. What happens next?
The agent fetches the page. It gets HTML—divs, React hydration markers, navigation sidebars, cookie consent banners, and somewhere in there, the actual content. The agent parses it. Works fine. But you just burned 14,000 tokens on a page that's 500 tokens of useful information.
llms.txt doesn't fix this.
llms.txt is a discovery mechanism. It tells agents what pages exist. But most AI interactions don't start with "give me an index of all docs." They start with a URL. And when agents fetch that URL, they get HTML.
Here's the token cost for a typical docs page:
| Format | Size | Tokens |
|---|---|---|
| HTML | 58KB | ~14,500 |
| Markdown | 2.1KB | ~525 |
A 27x difference. At scale, that gap decides whether an AI integration is viable at all.
## Content Negotiation
Use basic headers.
```http
GET /docs/authentication HTTP/1.1
Accept: text/markdown
```

The server sees `Accept: text/markdown` and returns markdown. Same URL, different representation. Browsers send `Accept: text/html` and get HTML. AI agents send `Accept: text/markdown` and get markdown.
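A matching response might look like this (the body shown is illustrative, not from a real server):

```http
HTTP/1.1 200 OK
Content-Type: text/markdown; charset=utf-8

# Authentication

Pass your API key in the Authorization header.
```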
```shell
# Browser
curl https://docs.example.com/quickstart
# → HTML

# AI agent
curl -H "Accept: text/markdown" https://docs.example.com/quickstart
# → Markdown
```

No special URLs. No `.md` suffix. Standard HTTP.
## Implementation
Middleware checks the Accept header:
```ts
const acceptHeader = req.headers.get('accept') || '';
if (acceptHeader.includes('text/markdown')) {
  // markdownApiUrl is the internal route that renders this page as markdown
  return NextResponse.rewrite(markdownApiUrl);
}
```

If the request wants markdown, serve markdown. If not, HTML.
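The `includes` check is deliberately coarse. If you want to honor q-values (so `text/markdown;q=0` doesn't trigger markdown), a small hand-rolled parser along these lines works; `wantsMarkdown` is a hypothetical helper, not a framework API:

```typescript
// Decide whether a request prefers markdown, honoring Accept q-values.
// Hypothetical helper — not part of Next.js or any library.
function wantsMarkdown(accept: string): boolean {
  // Parse "type/subtype;q=0.8" entries into { type, q } pairs.
  const entries = accept.split(',').map((part) => {
    const [type, ...params] = part.trim().split(';');
    const qParam = params.find((p) => p.trim().startsWith('q='));
    const q = qParam ? parseFloat(qParam.trim().slice(2)) : 1;
    return { type: type.trim().toLowerCase(), q };
  });
  const md = entries.find((e) => e.type === 'text/markdown');
  const html = entries.find((e) => e.type === 'text/html');
  if (!md || md.q === 0) return false;
  // Serve markdown only when it ranks at least as high as HTML.
  return !html || md.q >= html.q;
}
```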
## Gotchas
HEAD requests. Agents send HEAD to check headers before downloading:
```ts
if (req.method !== 'GET' && req.method !== 'HEAD') {
  return res.status(405).json({ error: 'Method not allowed' });
}
```

Next.js middleware rewrites don't preserve query params. Pass data via headers:
```ts
response.headers.set('x-subdomain', subdomain);
response.headers.set('x-markdown-path', pathname);
```

Cache headers matter. AI agents respect caching:
```http
Cache-Control: public, s-maxage=3600, stale-while-revalidate=86400
```

And because one URL now serves two representations, send `Vary: Accept` as well, so shared caches don't hand a cached HTML page to an agent that asked for markdown.

## Discovery Headers
Content negotiation needs discovery. Add these to every response:
```http
Link: <https://docs.example.com/llms.txt>; rel="llms-txt"
X-Llms-Txt: https://docs.example.com/llms.txt
```

Agents can `HEAD` any page and find your llms.txt without downloading content.
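On the agent side, pulling the URL back out of that `Link` header takes a few lines. A hypothetical parser, loosely following RFC 8288 syntax (not a library API):

```typescript
// Extract the llms-txt URL from an HTTP Link header, if present.
// Hypothetical helper; assumes URLs contain no literal commas.
function findLlmsTxt(linkHeader: string): string | null {
  for (const part of linkHeader.split(',')) {
    // Match: <url>; rel="llms-txt" (quotes around the rel value optional)
    const match = part.trim().match(/^<([^>]+)>\s*;\s*rel="?llms-txt"?/);
    if (match) return match[1] ?? null;
  }
  return null;
}
```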
## llms.txt Still Matters
llms.txt isn't useless—it's just not the whole picture. We wrote a full guide on making your docs AI-readable a few weeks ago. The short version:
```markdown
# Project Docs

- [Quickstart](https://docs.example.com/quickstart.md): Get started in 5 minutes
- [Auth](https://docs.example.com/auth.md): API authentication
```

Good for agents that need to explore. llms-full.txt concatenates everything for agents that want the whole picture.
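If your pages already live in a structured list, generating that file is trivial. A sketch, assuming a hypothetical `DocPage` shape:

```typescript
// Hypothetical page metadata — adapt to whatever your docs pipeline stores.
interface DocPage {
  title: string;
  url: string;
  summary: string;
}

// Render an llms.txt body: an H1 title, then one link-plus-summary line per page.
function renderLlmsTxt(project: string, pages: DocPage[]): string {
  const lines = pages.map((p) => `- [${p.title}](${p.url}): ${p.summary}`);
  return `# ${project}\n\n${lines.join('\n')}\n`;
}
```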
But neither helps when someone pastes a single URL into an AI chat. Content negotiation does.
## Test Your Docs
```shell
# llms.txt exists?
curl https://your-docs.com/llms.txt

# Content negotiation works?
curl -H "Accept: text/markdown" https://your-docs.com/some-page

# Discovery headers present?
curl -I https://your-docs.com/some-page | grep -i llms
```

Most sites fail all three.
## The Point
The AI-readable web is being rebuilt. We obsess over SEO for Google. We optimize for crawlers that haven't changed in 20 years. But the new crawlers—the ones that answer questions and write code—are getting HTML soup.
llms.txt helps agents find pages. Content negotiation makes reading them 27x cheaper.
We rolled this out across all Docsalot sites this week: `/llms.txt`, `/llms-full.txt`, discovery headers, content negotiation. If you're curious, try `curl -H "Accept: text/markdown"` against any page on solid-docs.docsalot.dev.
The llms.txt spec is at llmstxt.org. It's short. Read it.