citare


Sitemap Health Checker

Paste your site URL. We discover the sitemap (the default location plus any Sitemap: directive in your robots.txt), parse it, follow sitemap-index nesting, and report URL count, <lastmod> coverage, and AI-search-readiness flags.

Free. No signup. Cached 24h.

Why <lastmod> coverage matters for AI search

AI crawlers use <lastmod> tags to decide what’s fresh and worth re-fetching. A sitemap with no <lastmod> tags forces crawlers to re-evaluate every URL on every visit, which wastes crawl budget and slows the discovery of recently changed content.

Sites with strong <lastmod> coverage (90%+) get faster AI-citation refresh cycles, which matters when you publish new content meant to be cited.

Frequently asked

What does the Sitemap Health Checker actually check?

Paste your URL into the form above. The tool first discovers your sitemap by checking any Sitemap: directive declared in your robots.txt and the default location (/sitemap.xml). It then fetches the file, parses the XML, follows sitemap-index nesting up to three levels deep, and reports: total URL count, lastmod coverage percentage (the share of URLs that have a <lastmod> tag), the unique URL count after deduplication, malformed entries, and AI-search-readiness flags (e.g. lastmod coverage below 80%, mixed http/https origins, broken sitemap-index links). Results are cached for 24 hours per URL.
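As a rough illustration of the per-file checks listed above, here is a minimal Python sketch. It is not the Citare implementation: the function name summarize_urlset, the result keys, and the flag wording are hypothetical, and only the 80% threshold and the mixed http/https check come from the description above. It takes already-fetched XML text, so it runs without network access.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# XML namespace defined by the sitemap protocol (sitemaps.org).
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def summarize_urlset(xml_text: str) -> dict:
    """Summarize one <urlset> document (hypothetical helper, not Citare's code)."""
    root = ET.fromstring(xml_text)
    seen = set()       # unique <loc> values
    schemes = set()    # URL schemes observed across unique entries
    with_lastmod = 0
    duplicates = 0
    malformed = 0

    for entry in root.iter(NS + "url"):
        loc = (entry.findtext(NS + "loc") or "").strip()
        if not loc:
            malformed += 1      # a <url> entry without a usable <loc>
            continue
        if loc in seen:
            duplicates += 1     # count repeats once, keep the first entry
            continue
        seen.add(loc)
        schemes.add(urlsplit(loc).scheme.lower())
        if (entry.findtext(NS + "lastmod") or "").strip():
            with_lastmod += 1

    total = len(seen)
    coverage = round(100.0 * with_lastmod / total, 1) if total else 0.0

    flags = []
    if coverage < 80.0:
        flags.append("lastmod coverage below 80%")
    if {"http", "https"} <= schemes:
        flags.append("mixed http/https origins")

    return {
        "unique_urls": total,
        "duplicates": duplicates,
        "malformed": malformed,
        "lastmod_coverage_pct": coverage,
        "flags": flags,
    }


# Made-up sample data: one duplicate, one entry without <lastmod>, one http URL.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>http://example.com/old</loc><lastmod>2023-01-10</lastmod></url>
</urlset>"""

if __name__ == "__main__":
    print(summarize_urlset(SAMPLE))
```

On the sample, two of the three unique URLs carry a <lastmod> tag, so both example flags are raised. Broken sitemap-index links are not covered here because they require fetching the child files.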

What is lastmod coverage and why does it matter for AI search?

<lastmod> is the optional XML field in a sitemap entry that tells crawlers when a URL was last modified. Coverage is the percentage of your URLs that have a <lastmod> tag. AI crawlers (and Googlebot) use <lastmod> to decide which URLs are fresh and worth re-fetching. A sitemap with no <lastmod> tags forces every crawler to re-evaluate every URL on every visit, which wastes crawl budget and slows the discovery of newly published content. Sites with 90%+ <lastmod> coverage get noticeably faster AI-citation refresh cycles, which matters when you publish new content meant to be cited by ChatGPT, Gemini, or Perplexity.

How does the tool find my sitemap if I haven't told it where to look?

Two-step discovery. First, the tool fetches your robots.txt and looks for a Sitemap: directive — that's the canonical place to declare a sitemap location, and Googlebot/AI crawlers check there before guessing. If no Sitemap: directive is found, the tool falls back to the default location at /sitemap.xml on your domain. If both fail, the result is reported as "sitemap not found," which itself is a finding worth fixing — it means your robots.txt doesn't declare a sitemap and you don't have one at the conventional default path.
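The two-step discovery described above can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the tool's actual code: the function names and the injected fetch callable (which returns the response body, or None on failure) are hypothetical, chosen so the logic can be tried without network access.

```python
from typing import Callable, List, Optional
from urllib.parse import urljoin

# Hypothetical fetcher type: returns the body as text, or None if the fetch failed.
Fetch = Callable[[str], Optional[str]]


def sitemaps_from_robots(robots_txt: str) -> List[str]:
    """Return every URL declared with a Sitemap: directive."""
    found = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()    # drop trailing comments
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def discover_sitemaps(site_url: str, fetch: Fetch) -> List[str]:
    """Step 1: robots.txt directives. Step 2: fall back to /sitemap.xml."""
    robots = fetch(urljoin(site_url, "/robots.txt"))
    if robots:
        declared = sitemaps_from_robots(robots)
        if declared:
            return declared

    default = urljoin(site_url, "/sitemap.xml")
    if fetch(default) is not None:
        return [default]

    return []  # corresponds to the "sitemap not found" result
```

Passing the fetcher in as a parameter keeps the discovery logic separate from HTTP details such as timeouts, redirects, and caching, which this sketch does not attempt to model.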

What is a sitemap index and why does it nest?

A sitemap index (<sitemapindex>) is a sitemap-of-sitemaps. Sites with more than 50,000 URLs (the per-file limit set by the sitemap protocol) split their content across multiple sitemap files and reference them all from a top-level index. The Citare tool detects sitemap-index files, follows each child sitemap up to three levels of nesting, aggregates URL counts across the entire tree, and reports coverage at the consolidated level. Most large sites — e-commerce stores, news publishers, content platforms — use sitemap indexes; the tool handles them transparently.
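Following a sitemap index is essentially a bounded recursive walk. The Python sketch below shows one way to do it, with some assumptions made for illustration: the name collect_urls and the injected fetch callable are hypothetical, and the top-level file is counted as level 1, so files deeper than level 3 are ignored. How the actual tool counts levels is not stated above.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional, Set

# XML namespace defined by the sitemap protocol (sitemaps.org).
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_LEVELS = 3  # assumption: the top-level sitemap counts as level 1


def collect_urls(
    sitemap_url: str,
    fetch: Callable[[str], Optional[str]],  # hypothetical: body text, or None on failure
    level: int = 1,
    visited: Optional[Set[str]] = None,
) -> Set[str]:
    """Aggregate unique page URLs across a sitemap tree."""
    visited = set() if visited is None else visited
    if level > MAX_LEVELS or sitemap_url in visited:
        return set()                 # too deep, or already processed (loop guard)
    visited.add(sitemap_url)

    body = fetch(sitemap_url)
    if body is None:
        return set()                 # broken sitemap-index link
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return set()                 # malformed XML

    if root.tag == NS + "sitemapindex":
        urls: Set[str] = set()
        for child in root.iter(NS + "sitemap"):
            loc = (child.findtext(NS + "loc") or "").strip()
            if loc:
                urls |= collect_urls(loc, fetch, level + 1, visited)
        return urls

    # Otherwise treat the document as a <urlset> and gather its <loc> values.
    return {
        loc
        for entry in root.iter(NS + "url")
        for loc in [(entry.findtext(NS + "loc") or "").strip()]
        if loc
    }
```

Returning a set means a URL listed in several child sitemaps is counted once in the consolidated total, and the visited set keeps an index that references itself from looping.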

Does my sitemap matter for ChatGPT, Gemini, and Perplexity grounding?

Yes, indirectly. AI search platforms ground answers from search-engine indexes (ChatGPT uses Bing's index for live grounding; Perplexity uses its own index built atop standard sitemap signals; Gemini uses Google's index). A sitemap that's clean, complete, and has good <lastmod> coverage feeds those upstream indexes faster, which means newly published content shows up in AI grounding sooner. The opposite is also true: if your sitemap is malformed, is missing URLs, or has stale <lastmod> dates, the indexes are slower to refresh and AI citations lag. Sitemap health is upstream of AI visibility.
