Sitemap & robots checker · /tools/sitemap-checker

Validate your sitemap and robots.txt

Paste any domain to fetch /robots.txt, probe /sitemap.xml (and sitemap indexes), parse every URL, and surface parser errors before search engines see them.

How it works

  1. Enter a domain

    Type the root domain, with or without https://. We try /robots.txt, /sitemap.xml, /sitemap_index.xml, /sitemap-index.xml, and any sitemap declared in robots.txt.

  2. We fetch and parse

    Our server fetches each sitemap, detects whether it is a sitemap index or a URL set, and recursively follows child sitemaps, stopping at a safety limit of 10 files or 1000 URLs.

  3. See the audit

    You get URL counts, lastmod coverage, changefreq distribution, parser errors, and a parsed view of your robots.txt groups.
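The fetch-and-parse step can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function names `classify` and `walk` are hypothetical, and the `fetch` callable is an assumption standing in for whatever HTTP client the server uses. Only the 10-file and 1000-URL limits come from the description above.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List, Tuple


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def classify(xml_text: str) -> Tuple[str, List[str]]:
    """Return ('index', child_sitemap_urls) or ('urlset', page_urls)."""
    root = ET.fromstring(xml_text)
    name = local_name(root.tag)
    if name not in ("sitemapindex", "urlset"):
        raise ValueError(f"unexpected root element: {name}")
    locs = []
    # Only read <loc> elements that are direct children of <sitemap>/<url>,
    # so nested extension tags (e.g. image:loc) are not counted as pages.
    for entry in root:
        for child in entry:
            if local_name(child.tag) == "loc" and child.text and child.text.strip():
                locs.append(child.text.strip())
    return ("index" if name == "sitemapindex" else "urlset"), locs


def walk(
    start_url: str,
    fetch: Callable[[str], str],  # hypothetical: returns the body for a URL
    max_files: int = 10,
    max_urls: int = 1000,
) -> Tuple[List[str], List[str]]:
    """Follow sitemap indexes breadth-first until a limit is reached."""
    queue, seen = [start_url], set()
    urls: List[str] = []
    errors: List[str] = []
    files = 0
    while queue and files < max_files and len(urls) < max_urls:
        sitemap_url = queue.pop(0)
        if sitemap_url in seen:
            continue  # avoid loops when an index references itself
        seen.add(sitemap_url)
        files += 1
        try:
            kind, locs = classify(fetch(sitemap_url))
        except (ET.ParseError, ValueError) as exc:
            errors.append(f"{sitemap_url}: {exc}")
            continue
        if kind == "index":
            queue.extend(locs)
        else:
            urls.extend(locs[: max_urls - len(urls)])
    return urls, errors
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library. Parse failures are collected rather than raised, which mirrors the idea of reporting parser errors alongside the URLs that were read successfully.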

FAQ

Why do you cap at 1000 URLs?
Most sites have well under 1000 URLs per sitemap. The sitemap protocol, which Google follows, limits each sitemap file to 50,000 URLs and 50 MB uncompressed. If you have more than 1000 URLs, consider using a sitemap index file that points to multiple smaller sitemaps; our tool follows those.
My sitemap is at /sitemap_index.xml. Will you find it?
Yes. We probe /sitemap.xml, /sitemap_index.xml, /sitemap-index.xml, and any URL declared in a Sitemap: line in robots.txt. If your index points to other sitemaps, we follow them, up to a limit of 10 files.
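As a rough illustration of that discovery order, the sketch below builds the list of sitemap URLs to try for a domain. The helper names `normalize_origin` and `candidate_sitemaps` are hypothetical, and trying declared sitemaps before the probed paths is an assumption; only the three probe paths and the Sitemap: lookup come from the answer above.

```python
from typing import List

# Paths probed in addition to anything declared in robots.txt.
PROBE_PATHS = ["/sitemap.xml", "/sitemap_index.xml", "/sitemap-index.xml"]


def normalize_origin(domain: str) -> str:
    """Accept 'example.com' or 'https://example.com/' and return an origin."""
    domain = domain.strip().rstrip("/")
    if not domain.startswith(("http://", "https://")):
        domain = "https://" + domain
    return domain


def candidate_sitemaps(domain: str, robots_txt: str) -> List[str]:
    """Return declared sitemaps first, then the probed paths, without duplicates."""
    origin = normalize_origin(domain)
    declared = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        # Directive names are matched case-insensitively.
        if sep and key.strip().lower() == "sitemap" and value.strip():
            declared.append(value.strip())
    candidates = declared + [origin + path for path in PROBE_PATHS]
    return list(dict.fromkeys(candidates))  # de-duplicate, keep order
```

For example, `candidate_sitemaps("example.com", "Sitemap: https://example.com/news.xml")` would list the declared news sitemap first, followed by the three probed paths on https://example.com.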
What does 'unknown lines' mean in robots.txt?
Lines that aren't recognized directives (User-agent, Allow, Disallow, Sitemap, Host, Crawl-delay). Some crawlers accept extras: Yandex supports Clean-param, and Request-rate and Visit-time are non-standard extensions that only a few crawlers honor. Google itself supports only User-agent, Allow, Disallow, and Sitemap, and most crawlers ignore anything they don't recognize. Unknown lines aren't necessarily wrong; they may be valid for a specific crawler.
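One way to picture how lines end up in groups or in the "unknown" bucket is the sketch below. It is an assumption-laden illustration, not the tool's parser: `parse_robots` and the returned dictionary shape are hypothetical, and reporting a rule that appears before any User-agent line as unknown is a simplification chosen for this example. The set of recognized directives is the one listed above.

```python
from typing import Dict, List

KNOWN = {"user-agent", "allow", "disallow", "sitemap", "host", "crawl-delay"}


def parse_robots(text: str) -> Dict[str, list]:
    """Split robots.txt into user-agent groups, sitemaps, and unknown lines."""
    groups: List[dict] = []
    sitemaps: List[str] = []
    unknown: List[str] = []
    current = None          # group currently being filled
    last_was_agent = False  # consecutive User-agent lines share one group
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        key, sep, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if not sep or key not in KNOWN:
            unknown.append(raw.strip())
            continue
        if key == "sitemap":
            sitemaps.append(value)  # sitemaps apply to the whole file
            continue
        if key == "user-agent":
            if current is None or not last_was_agent:
                current = {"agents": [], "rules": []}
                groups.append(current)
            current["agents"].append(value)
            last_was_agent = True
            continue
        if current is None:
            # Simplification: a rule with no preceding User-agent line
            # is reported together with the unknown lines.
            unknown.append(raw.strip())
        else:
            current["rules"].append((key, value))
        last_was_agent = False
    return {"groups": groups, "sitemaps": sitemaps, "unknown": unknown}
```

With this shape, a line such as `Clean-param: ref /shop` lands in `unknown` even though it is valid for Yandex, which is the behavior the answer above describes.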