
llms.txt guide — what it is and whether it works

llms.txt is the most-debated emerging AI search standard. Some practitioners treat it as essential infrastructure; others dismiss it as performative. The honest assessment sits in the middle: it is worth publishing, but the leverage is forward compatibility plus immediate Claude/Perplexity citation, not a guaranteed AIO or ChatGPT lever.

Updated May 2026

TL;DR

  1. llms.txt is a plain-text file at yoursite.com/llms.txt that gives AI systems a curated summary of your site, its key topics, and its priority URLs.
  2. Anthropic (Claude) and Perplexity explicitly read it. Google and OpenAI consume it only partially. Adoption is trending upward.
  3. It takes about 30 minutes, costs close to nothing, and is a forward-compatibility bet. Publish it, but don't expect immediate AIO/ChatGPT citation lift.
  4. Concise wins. 10-20 curated priority URLs outperform 200-URL exhaustive lists. Factual structured info beats marketing copy.

What llms.txt is

llms.txt is a plain-text file at the root of your website (/llms.txt) that provides AI systems with a curated summary of your site, its key topics, and the pages you most want AI models to attend to.

The format is structured markdown with a defined section pattern:

  • An H1 with the site or organization name
  • A brief description (1-2 sentences) of what the site is
  • Sectioned lists of relevant URLs grouped by purpose (Docs, Research, Optional)

The standard was proposed by Jeremy Howard (Answer.ai) in mid-2024 and gained meaningful adoption through 2025. It's inspired by — but conceptually distinct from — robots.txt and sitemap.xml. The intent: give AI systems a "what should I know about this site" file analogous to how robots.txt gives crawlers a "what can I access" file. It's metadata for AI ingestion specifically, not a crawl-control mechanism.

What llms.txt is NOT

Not a robots.txt replacement

robots.txt controls crawl access. llms.txt provides metadata. Different purposes. You need both.
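
Because the two files do different jobs, a robots.txt rule can silently block the crawlers you want reading llms.txt. The sketch below uses Python's standard `urllib.robotparser` to check whether a given robots.txt allows /llms.txt to be fetched. The robots.txt content and the list of user-agent names are illustrative assumptions; substitute your own file and the crawlers you care about.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /
"""

# Assumed list of AI crawler user agents; adjust to the bots you track.
AI_USER_AGENTS = ["ClaudeBot", "PerplexityBot", "GPTBot"]


def llms_txt_access(robots_txt: str, site: str, agents: list[str]) -> dict[str, bool]:
    """Return, per user agent, whether robots.txt allows fetching /llms.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    url = f"{site.rstrip('/')}/llms.txt"
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    results = llms_txt_access(ROBOTS_TXT, "https://yoursite.com", AI_USER_AGENTS)
    for agent, allowed in results.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

In this example the blanket `Disallow: /` for one crawler also blocks its access to llms.txt, which is the kind of conflict worth catching before you publish.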

Not a sitemap replacement

sitemap.xml is comprehensive (every URL + lastmod). llms.txt is curated (priority URLs only). Different scopes.

Not universally consumed

Adoption is real but uneven. Treating it as a guaranteed citation channel misreads where the standard is in 2026.

Adoption status — who reads it

Reads it explicitly

  • Anthropic (Claude) — documented support; Claude grounding references llms.txt
  • Perplexity — incorporates llms.txt content into source ranking
  • Several smaller AI products — emerging adoption across newer LLM-powered tools

Partial / inconsistent

  • Google (AI Overview + Gemini) — variable evidence; no official endorsement but indirect signals suggest partial ingestion
  • OpenAI (ChatGPT) — limited evidence of consumption; not officially endorsed

Not yet consuming

  • Various legacy and smaller AI-search products

Trajectory is toward broader adoption. Current state is patchy. Publishing llms.txt today is a forward-compatibility bet, not an immediate-citation play for AIO or ChatGPT — but immediate citation lift for Claude and Perplexity audiences is real.

The format — sample llms.txt

A complete llms.txt has three to five sections. The convention emerged organically through 2025 and is now reasonably stable.

# Citare

AI search visibility platform measuring brand presence across ChatGPT,
Gemini, Claude, Perplexity, and Google AI Overview. We help B2B brands
track their surface rate, identify citation gaps, and benchmark against
named competitors.

## Docs

- [What is GEO?](https://citare.ai/geo)
- [The four-index reality](https://citare.ai/four-index-reality)
- [AI search vs Google](https://citare.ai/ai-search-vs-google)
- [Rank in Google AI Overview](https://citare.ai/rank-in-google-aio)
- [Rank in ChatGPT](https://citare.ai/rank-in-chatgpt)
- [Rank in Perplexity](https://citare.ai/rank-in-perplexity)
- [Structured data for AI](https://citare.ai/structured-data-for-ai)
- [AI bot crawlers](https://citare.ai/ai-bot-crawlers)

## Research + case studies

- [GEO for Indian businesses](https://citare.ai/india/geo-for-indian-businesses)

## Free tools

- [JSON-LD generator](https://citare.ai/tools/json-ld-generator)
- [AI robots.txt checker](https://citare.ai/tools/ai-robots-checker)
- [llms.txt generator](https://citare.ai/tools/llms-txt-generator)
- [Citation scorer](https://citare.ai/tools/citation-scorer)

## Optional

- [Blog](https://citare.ai/blog)
- [Pricing](https://citare.ai/pricing)

H1 + description

The brand entity and what it is. Factual structured info, not marketing copy.

Docs / Guides

Primary reference content you want AI to attend to. 5-10 priority URLs typically.

Research / Examples

Distinctive content — original data, case studies, benchmarks.

Optional

Content you want available but de-prioritized. Blog, About, supplementary pages.
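
To make the section pattern concrete, here is a minimal parsing sketch in Python. It reads the H1, the description text before the first H2, and the link lists under each H2. The `LlmsTxt` class and `parse_llms_txt` function are hypothetical names for this illustration, not part of any official tooling, and the parser only handles the simple layout shown in the sample above.

```python
import re
from dataclasses import dataclass, field

# Matches a markdown list item of the form "- [Title](https://url)".
LINK_RE = re.compile(r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)")


@dataclass
class LlmsTxt:
    title: str = ""
    description: str = ""
    # Section name -> list of (link title, url), in file order.
    sections: dict[str, list[tuple[str, str]]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    doc = LlmsTxt()
    description_lines: list[str] = []
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif current is None:
            # Text between the H1 and the first H2 is the description.
            # A leading ">" is tolerated in case it is written as a blockquote.
            if line and doc.title:
                description_lines.append(line.lstrip("> ").strip())
        else:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append((match["title"], match["url"]))
    doc.description = " ".join(description_lines)
    return doc


SAMPLE = """\
# Citare

AI search visibility platform measuring brand presence across ChatGPT,
Gemini, Claude, Perplexity, and Google AI Overview.

## Docs

- [What is GEO?](https://citare.ai/geo)
- [AI bot crawlers](https://citare.ai/ai-bot-crawlers)

## Optional

- [Blog](https://citare.ai/blog)
"""

if __name__ == "__main__":
    parsed = parse_llms_txt(SAMPLE)
    print(parsed.title, list(parsed.sections))
```

A parser like this is mostly useful as a sanity check: if your own file does not round-trip into a title, a description, and named sections, an AI system reading it will likely struggle too.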

Where to host: https://yoursite.com/llms.txt. Plain text. UTF-8 encoded. Accessible without authentication. The path is hard-coded by the convention — don't relocate it.
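
The hosting requirements can be checked mechanically. The sketch below is a hypothetical helper, `check_llms_txt_response`, that takes the status code, Content-Type header, and raw body you got back from fetching /llms.txt with any HTTP client and returns a list of problems. Accepting `text/markdown` alongside `text/plain` is an assumption of this sketch, not something the convention mandates.

```python
def check_llms_txt_response(status: int, content_type: str, body: bytes) -> list[str]:
    """Return a list of hosting problems; an empty list means the basics look fine."""
    problems = []
    # A 401/403 or a redirect to a login page means the file is not publicly readable.
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    # Assumption: plain text or markdown content types are both acceptable.
    if not content_type.lower().startswith(("text/plain", "text/markdown")):
        problems.append(f"unexpected Content-Type: {content_type!r}")
    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("body is not valid UTF-8")
    else:
        if not text.lstrip().startswith("# "):
            problems.append("file does not start with an H1 line")
    return problems


if __name__ == "__main__":
    print(check_llms_txt_response(200, "text/plain; charset=utf-8", b"# Citare\n"))
    print(check_llms_txt_response(401, "text/html", b"\xff\xfe"))
```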

Should you publish one? The honest take

Three reasons yes

  • Cost is near-zero. A single .txt file. 30 minutes of work. No engineering required for most sites.
  • Claude + Perplexity citation lift is immediate. For B2B/technical brands these audiences matter — and llms.txt content feeds their grounding.
  • Forward-compatibility. Platforms are trending toward broader support. Publishing now positions you ahead.

Three caveats

  • Don't expect immediate AIO/ChatGPT lift. Google and OpenAI consumption is partial. Plan accordingly.
  • Don't substitute for higher-leverage actions. If choosing between llms.txt and FAQ schema, choose FAQ schema. The latter has measurable near-term effect.
  • Don't treat as a citation hack. Stuffing competitor names won't get you cited for their queries.

Four common mistakes

1. Treating it as a citation hack

Stuffing competitor names, generic high-volume keywords, or inflated descriptions. AI systems weight credibility; manipulative llms.txt content earns less attention, not more. The standard is metadata, not adversarial keyword stuffing.

2. Marketing copy instead of factual structured info

'Citare is the world's most trusted AI search visibility platform' is marketing copy. 'Citare measures brand presence across ChatGPT, Gemini, Perplexity, and Google AI Overview' is factual structured info. llms.txt rewards the latter.

3. Forgetting to update it

Like sitemap.xml and structured data, llms.txt goes stale. Update when key pages change, when you publish significant new content, or quarterly minimum. Add an audit task to your content workflow.
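
One way to automate part of that audit is to compare the URLs in llms.txt against your sitemap and flag any that no longer appear there. The sketch below assumes a standard sitemap.xml using the sitemaps.org namespace; the function names and the example URLs are hypothetical. A URL missing from the sitemap is only a hint that a page moved or was removed, not proof.

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Captures the URL part of markdown links such as "[Title](https://...)".
LINK_URL_RE = re.compile(r"\]\((https?://[^)\s]+)\)")


def sitemap_urls(sitemap_xml: str) -> set[str]:
    """Collect every <loc> value from a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def stale_links(llms_txt: str, sitemap_xml: str) -> list[str]:
    """Return llms.txt URLs that do not appear in the sitemap, in file order."""
    known = sitemap_urls(sitemap_xml)
    return [url for url in LINK_URL_RE.findall(llms_txt) if url not in known]


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    sitemap = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://citare.ai/geo</loc></url>"
        "</urlset>"
    )
    llms = "- [What is GEO?](https://citare.ai/geo)\n- [Old page](https://citare.ai/old-page)\n"
    print(stale_links(llms, sitemap))
```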

4. Making it longer than it needs to be

Concise wins. A well-curated llms.txt with 10-20 priority URLs outperforms a 200-URL exhaustive list. AI systems prefer quality signals over quantity. Don't pad to look thorough.
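
A simple guard against padding is to count the links in the file and warn when the total exceeds your target. The sketch below uses 20 as the limit, taken from the upper end of the 10-20 range above; the function name and the limit are assumptions you can adjust.

```python
import re
from typing import Optional

# Matches markdown list items of the form "- [Title](url)", one per line.
LINK_RE = re.compile(r"^\s*-\s*\[[^\]]+\]\([^)\s]+\)", re.MULTILINE)
MAX_PRIORITY_LINKS = 20  # assumed limit: upper end of the 10-20 range


def link_count_warning(llms_txt: str, limit: int = MAX_PRIORITY_LINKS) -> Optional[str]:
    """Return a warning string when the file lists more links than the limit."""
    count = len(LINK_RE.findall(llms_txt))
    if count > limit:
        return f"{count} links listed; consider trimming to {limit} or fewer"
    return None


if __name__ == "__main__":
    padded = "\n".join(f"- [Page {i}](https://example.com/{i})" for i in range(200))
    print(link_count_warning(padded))
```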

Frequently asked questions

Do AI platforms actually read llms.txt?

Some yes, some partially, some not yet. Anthropic and Perplexity explicitly read it. Google and OpenAI are partial, with variable evidence of consumption. Several smaller AI products read it. The honest answer is 'partial adoption today, broader adoption likely in 12-18 months.' For Google AIO and ChatGPT, publishing now is a forward-compatibility bet rather than an immediate citation lift; for Claude and Perplexity, the lift is immediate.

Is llms.txt worth publishing in 2026?

Yes. Three reasons: (1) cost is near-zero — a single .txt file, 30 minutes of work; (2) Claude and Perplexity already cite based on llms.txt content (immediate value for B2B brands); (3) AI platform adoption is trending upward, so publishing now positions you ahead of competitors when broader adoption lands. Don't expect AIO or ChatGPT citation lift specifically.

Will llms.txt replace robots.txt?

No. They serve different purposes. robots.txt controls crawl access (what bots can fetch). llms.txt provides metadata (what the site is and what content matters). Both will coexist, as robots.txt and sitemap.xml have for two decades. Publish both.

How is llms.txt different from sitemap.xml?

sitemap.xml is comprehensive (every crawlable URL with lastmod). llms.txt is curated (only the priority pages you want AI to focus on). sitemap.xml is for search-engine crawlers; llms.txt is for AI ingestion. Use both — they complement each other.

How does llms.txt relate to structured data (JSON-LD)?

Different layer. JSON-LD is page-level structured data embedded in HTML for AI parsers to extract specific facts (Organization name, Article author, Product price). llms.txt is site-level metadata telling AI what the whole site is and what content matters. Use both — they don't substitute for each other.
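
To show the page-level side of that split, here is a minimal sketch that builds an Organization JSON-LD object with Python's `json` module. The helper name and the field values are illustrative assumptions; the output would be embedded in a page inside a `<script type="application/ld+json">` tag, whereas llms.txt lives as a separate file at the site root.

```python
import json


def organization_jsonld(name: str, url: str, description: str) -> str:
    """Serialize a schema.org Organization object as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
    }
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    # Illustrative values only.
    print(organization_jsonld(
        "Citare",
        "https://citare.ai",
        "AI search visibility platform measuring brand presence across AI search products.",
    ))
```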

Should I publish llms.txt even if my site is small?

Yes. Cost is near-zero. There's no scale threshold below which llms.txt doesn't make sense. Even a 10-page brochure site benefits from a curated metadata file as adoption grows. For small sites, manual creation in 30 minutes works fine — no tooling required.

Where do I check the llms.txt specification?

The original proposal and current specification are at llmstxt.org. The standard is open and lightweight; the spec fits on one page. It was proposed by Jeremy Howard (Answer.ai) in mid-2024 and gained meaningful adoption through 2025.

Is there a tool to generate llms.txt automatically?

Yes — Citare's free llms.txt generator at /tools/llms-txt-generator takes your URL and produces a starter llms.txt with curated section structure. For sites with structured content (Sanity, WordPress, Webflow), you can also generate llms.txt programmatically from your CMS at build time.
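
For the programmatic route, a build-time step could look roughly like the sketch below. It assumes your CMS export can be mapped to a list of pages with a title, a URL, and a section label; the `Page` class and `render_llms_txt` function are hypothetical, and no specific CMS API is used. Keeping "Optional" last is a layout choice that mirrors the sample file above.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    section: str  # e.g. "Docs", "Free tools", "Optional"


def render_llms_txt(site_name: str, description: str, pages: list[Page]) -> str:
    """Render an llms.txt document from a curated list of pages."""
    # Group pages by section, preserving first-seen order.
    sections: dict[str, list[Page]] = {}
    for page in pages:
        sections.setdefault(page.section, []).append(page)

    # Keep the "Optional" section last, as in the sample file.
    ordered = [name for name in sections if name != "Optional"]
    if "Optional" in sections:
        ordered.append("Optional")

    lines = [f"# {site_name}", "", description.strip(), ""]
    for name in ordered:
        lines.append(f"## {name}")
        lines.append("")
        lines.extend(f"- [{p.title}]({p.url})" for p in sections[name])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    pages = [
        Page("Blog", "https://citare.ai/blog", "Optional"),
        Page("What is GEO?", "https://citare.ai/geo", "Docs"),
    ]
    print(render_llms_txt("Citare", "AI search visibility platform.", pages))
```

Generating the file is the easy part; the list of pages passed in should still be curated by hand, since an automatic dump of every CMS entry recreates the exhaustive-list problem described earlier.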

Generate your llms.txt in 30 seconds

Citare's free llms.txt generator builds a starter file from your URL with curated section structure. Customize the priority URLs and ship.
