What Is llms.txt? A Practical Guide for 2026

llms.txt is a plain-text file that gives AI assistants a clean map of your most important pages. Here's what it is, the spec, why it helps, and how to make one.

Kaustav Basak · June 17, 2026 · 6 min read

llms.txt is a plain-text Markdown file you place at the root of your site, at /llms.txt, that gives AI assistants a curated, easy-to-read map of your most important content. Think of it as a table of contents written for language models instead of crawlers: a short summary of what your site is, followed by clean links to the pages you most want an AI to read and cite.

The format was proposed in 2024 by Jeremy Howard of Answer.AI. It is a community convention, not an official standard backed by Google or OpenAI, but adoption has grown quickly among documentation sites, SaaS products, and developer tools. This guide covers what the file actually contains, why it helps, and how to build and validate one.

What problem does llms.txt solve?

When an AI assistant tries to understand your site, it hits the same walls a human skimmer does, only worse. On a typical HTML page, the content is buried under navigation, cookie banners, ads, related-post widgets, and JavaScript that the model may never execute. Pulling a clean answer out of that mess wastes the model's limited context window on noise.

llms.txt cuts through this. Instead of guessing which of your 4,000 URLs matter, the model reads one short file that says, in effect, here is who we are and here are the twelve pages worth your time. It is the difference between handing someone a labeled index and pointing them at a filing cabinet.

It is worth being clear about what llms.txt is not. It is not robots.txt, which tells crawlers what they may not access. It is not a sitemap.xml, which lists every URL for completeness. llms.txt is the opposite of completeness: it is a deliberate shortlist, written in prose a model can read directly.

What does the llms.txt spec look like?

The format is intentionally simple. It is standard Markdown with a loose structure, so any model that can read text can parse it. A valid file has these parts, in order:

| Section | Required | What it is |
| --- | --- | --- |
| H1 title | Yes | A single `#` heading with your site or project name |
| Blockquote summary | Recommended | A short `>` line describing what the site is about |
| Free text | Optional | A paragraph or two of extra context, no headings |
| Link sections | Expected | `##` headings grouping curated links as a Markdown list (the spec strictly requires only the H1, but a file without links has little value) |
| Optional section | Optional | An `## Optional` heading for links a model can skip |

Each link is a normal Markdown list item: `- [Page name](https://example.com/page): one-line description`. The description after the colon is what makes the file useful, since it tells the model what it will find before it spends a request fetching the page.

Here is a minimal but complete example:

```markdown
# Acme Analytics

> Acme is a privacy-first web analytics tool for small teams.

## Docs

- [Quickstart](https://acme.com/docs/quickstart): Install and see your first dashboard in five minutes.
- [API reference](https://acme.com/docs/api): Full REST endpoints with auth and rate limits.

## Guides

- [Self-hosting](https://acme.com/guides/self-host): Run Acme on your own server with Docker.

## Optional

- [Changelog](https://acme.com/changelog): Release notes, useful but not essential.
```
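
To see how little machinery the format needs, here is a minimal sketch of a parser in Python. It assumes the layout shown in the example above (one H1, one blockquote, `##` sections holding link lists). The function name `parse_llms_txt` and the shape of the returned dictionary are illustrative choices, not part of the spec.

```python
import re

# One curated link: "- [name](url): optional description"
LINK_RE = re.compile(r"^-\s*\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$")


def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt document into title, summary, and link sections."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the ## section currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                name, url, desc = match.groups()
                doc["sections"][current].append((name, url, desc or ""))
    return doc


sample = """# Acme Analytics

> Acme is a privacy-first web analytics tool for small teams.

## Docs

- [Quickstart](https://acme.com/docs/quickstart): Install and see your first dashboard in five minutes.

## Optional

- [Changelog](https://acme.com/changelog): Release notes, useful but not essential.
"""

parsed = parse_llms_txt(sample)
print(parsed["title"], list(parsed["sections"]))
```

A consumer with a tight budget could read every section except `Optional` and still get the pages you care about most.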

There is also a companion convention, llms-full.txt, which inlines the full Markdown content of those pages into a single file so a model can read everything in one fetch. That file gets large fast, so most sites publish the lean llms.txt and let models follow links as needed.
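
As a rough illustration of why llms-full.txt grows quickly, the sketch below concatenates page bodies into one document. The convention does not prescribe a layout, so the `## name` plus `Source:` structure here is an assumption, as are the function name `build_llms_full` and the sample page text.

```python
def build_llms_full(site_title: str, pages: list) -> str:
    """Concatenate Markdown page bodies into a single document.

    pages is a list of (name, url, markdown_body) tuples. The layout is an
    illustrative assumption, not something the convention specifies.
    """
    parts = [f"# {site_title}", ""]
    for name, url, body in pages:
        parts += [f"## {name}", "", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(parts)


# Hypothetical page content, already converted to Markdown.
pages = [
    ("Quickstart", "https://acme.com/docs/quickstart",
     "Install the snippet, then open your dashboard.\n"),
    ("Self-hosting", "https://acme.com/guides/self-host",
     "Run the Docker image on your own server.\n"),
]

full = build_llms_full("Acme Analytics", pages)
print(len(full.encode("utf-8")), "bytes")
```

The output size is roughly the sum of every page body, which is why the link-only file is usually the one to publish first.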

Why does llms.txt help AI assistants?

The honest answer is that it helps in a few concrete ways, and you should be skeptical of anyone promising it as a ranking trick.

First, it saves context. Models work inside a fixed token budget. A file that points straight at clean, relevant pages means less of that budget is spent parsing layout cruft and more is spent on your actual content.
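
To make the budget point concrete, here is a small sketch that compares a raw HTML page with the text a reader actually needs. It assumes roughly four characters per token, which is only a rule of thumb (real tokenizers vary), and the sample HTML is invented for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script and style contents."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def rough_tokens(text: str) -> int:
    # Assumption: about 4 characters per token; real tokenizers differ.
    return max(1, len(text) // 4)


html = (
    "<html><head><style>.nav{color:red}</style>"
    "<script>window.track=function(){};</script></head>"
    "<body><nav><a href='/'>Home</a><a href='/pricing'>Pricing</a></nav>"
    "<main><p>Install the snippet and open your dashboard.</p></main>"
    "</body></html>"
)

extractor = TextExtractor()
extractor.feed(html)
visible = " ".join(extractor.chunks)
print(rough_tokens(html), rough_tokens(visible))
```

Even on this tiny page, most of the characters are markup and script rather than content; real pages tend to be far more lopsided.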

Second, it improves accuracy. When you write the one-line descriptions yourself, you control how your pages are framed. A model is far less likely to misrepresent your product if you have told it plainly what each page covers.

Third, it signals intent. Publishing the file is a clear statement that you want AI tools to read and reference your content, which is increasingly the point. Being referenced by assistants like ChatGPT and Perplexity is becoming its own channel, and llms.txt is one piece of making your site easy to quote. It pairs naturally with the on-page work covered in "How to get cited by ChatGPT, Perplexity and AI Overviews."

Be realistic about the ceiling, though. As of mid-2026 the major AI crawlers have not committed to reading llms.txt the way they honor robots.txt, and some products that fetch pages live may ignore it entirely. The file is low-effort insurance, not a guarantee. It belongs in the same toolkit as structured data and clean semantic HTML, all of which fall under generative engine optimization.

How do I create an llms.txt file?

You can write one by hand in a few minutes. The steps:

  1. Pick your highest-value pages. Documentation, pricing, key product pages, your best guides. Aim for quality over coverage, usually 10 to 30 links, not your entire sitemap.
  2. Write the title and summary. One H1 with your name, one blockquote that explains what you do in a sentence.
  3. Group the links under `##` headings. Cluster by purpose: Docs, Guides, Products, About. Give each link a short, factual description after the colon.
  4. Move nice-to-have links under `## Optional`. This tells a model what it can safely skip if it is short on budget.
  5. Save it as `llms.txt` and upload to your site root so it resolves at https://yourdomain.com/llms.txt. Serve it as text/plain or text/markdown.
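
If you keep your page list in code or a CMS export, the steps above can be sketched as a small Python function. The name `render_llms_txt` and its arguments are hypothetical, and the page data simply reuses the Acme example from earlier.

```python
def render_llms_txt(title, summary, sections, optional=None):
    """Render an llms.txt document from plain Python data.

    sections maps a heading to a list of (name, url, description) tuples.
    optional is a list of the same tuples, placed under "## Optional".
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    groups = list(sections.items())
    if optional:
        groups.append(("Optional", optional))  # skippable links go last
    for heading, links in groups:
        lines += [f"## {heading}", ""]
        lines += [f"- [{name}]({url}): {desc}" for name, url, desc in links]
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


content = render_llms_txt(
    title="Acme Analytics",
    summary="Acme is a privacy-first web analytics tool for small teams.",
    sections={
        "Docs": [
            ("Quickstart", "https://acme.com/docs/quickstart",
             "Install and see your first dashboard in five minutes."),
        ],
        "Guides": [
            ("Self-hosting", "https://acme.com/guides/self-host",
             "Run Acme on your own server with Docker."),
        ],
    },
    optional=[
        ("Changelog", "https://acme.com/changelog",
         "Release notes, useful but not essential."),
    ],
)
print(content)
```

Writing `content` to a file named `llms.txt` in your web root covers the last step. Generating the file from data also makes it easier to regenerate when pages move.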

If you would rather not start from a blank file, the free llms.txt generator builds a spec-compliant file from your URL, picking sensible pages and writing first-draft descriptions you can edit.

How do I validate my llms.txt?

A file that exists but is malformed helps no one, so check two things. First, the structure: the file needs an H1 title and at least one link section, links must be valid Markdown, and the URLs must actually resolve. Second, the delivery: the file must be reachable at the root path and returned as plain text, not wrapped in your site's HTML template.
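
Both checks can be approximated in a few lines of Python. This is a minimal sketch, not a full validator: the function names are made up for illustration, `check_delivery` expects the status code, `Content-Type` header, and body from whatever HTTP client you already use, and confirming that each URL resolves is left out because it needs network requests.

```python
import re

LINK_RE = re.compile(r"^-\s*\[[^\]]+\]\([^)\s]+\)")


def check_structure(text: str) -> list:
    """Return a list of structural problems found in an llms.txt body."""
    problems = []
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("first non-blank line is not an H1 title")
    if not any(line.startswith("## ") for line in lines):
        problems.append("no ## link section found")
    for line in lines:
        if line.startswith("- ") and LINK_RE.match(line) is None:
            problems.append(f"list item is not a Markdown link: {line}")
    return problems


def check_delivery(status: int, content_type: str, body: str) -> list:
    """Return a list of delivery problems for an already-fetched response."""
    problems = []
    if status != 200:
        problems.append(f"unexpected HTTP status {status}")
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ("text/plain", "text/markdown"):
        problems.append(f"unexpected content type {media_type}")
    if body.lstrip().lower().startswith(("<!doctype", "<html")):
        problems.append("body looks like an HTML page, not plain text")
    return problems


good = (
    "# Acme Analytics\n\n> Privacy-first analytics.\n\n## Docs\n\n"
    "- [Quickstart](https://acme.com/docs/quickstart): Start here.\n"
)
print(check_structure(good))
print(check_delivery(200, "text/plain; charset=utf-8", good))
```

An empty list from both functions means the file passed these basic checks; anything else is a short description of what to fix.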

The quickest way to confirm both is the llms.txt validator, which fetches your file, checks it against the spec, and flags broken links, missing sections, or wrong content types. Re-run it whenever you restructure your site, since a stale file pointing at dead URLs is worse than none.

Should every site have one?

If you have content you want AI assistants to understand and cite, yes, and the cost is close to zero. Documentation sites and developer tools get the most obvious benefit, but any business that wants accurate AI representation gains from telling models exactly where to look. The risk is minimal: a well-formed file can only clarify, never penalize.

Keep it lean, keep the descriptions honest, and revisit it when your site changes. A short, accurate file beats a sprawling one that no longer matches reality.

Want to see how AI-ready your whole site is, not just this one file? A free SEO audit checks your structure, machine-readable signals, and content clarity in about a minute, no signup required.

Written by Kaustav Basak

Kaustav Basak is the creator of SEO AI Audits, a free AI-powered SEO toolkit. He writes about technical SEO, Core Web Vitals, and how search is changing in the age of AI assistants.
