SitemapScan Blog

Blocked by robots.txt but Listed in a Sitemap: Why the Conflict Matters

When a URL is listed in a sitemap but blocked by robots.txt, the site is telling crawlers two different things at once. Here is why that conflict matters and how to audit it correctly.

Why this conflict matters

A sitemap tells crawlers a URL is important enough to be discovered and crawled. A robots.txt Disallow tells them not to fetch that path. When both apply to the same URL, search engines can discover the address but never read its content, so it may be indexed as a bare URL with no snippet (Google Search Console reports these as "Indexed, though blocked by robots.txt"). That is an avoidable contradiction in the crawl and indexation layer.
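To make the contradiction concrete, here is a hypothetical pair of signals a crawler could receive from one site (the example.com domain and the /members/ path are placeholders, not taken from any real configuration):

    # robots.txt
    User-agent: *
    Disallow: /members/

    <!-- sitemap.xml -->
    <url>
      <loc>https://example.com/members/pricing</loc>
    </url>

The sitemap nominates the URL for crawling; the robots.txt forbids fetching anything under /members/. A crawler honoring both will discover the URL and then refuse to read it.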

How this usually happens

The conflict usually appears after site migrations, leftover staging rules, inherited Disallow patterns, or sitemap generators that do not know about the robots.txt policy applied elsewhere in the stack, as in the example below.
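One common variant, sketched here as a hypothetical file, is a staging robots.txt that was copied to production during a migration and never updated, while the sitemap generator keeps listing every live URL:

    # robots.txt left over from staging
    User-agent: *
    Disallow: /

A single leftover line like this puts every sitemap entry into conflict at once.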

How to audit the conflict

First, check whether the robots.txt block is intentional. If it is, remove the URL from the sitemap; if the URL should be crawlable, lift the Disallow rule instead. Then measure the scope: a few stray URLs usually point to a sitemap generator bug, while a whole blocked site section points to a robots.txt rule that needs review. The sketch below shows one way to automate the check.
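As a minimal sketch of that audit, the following Python script loads a site's robots.txt with urllib.robotparser, reads the <loc> entries from a urlset-style sitemap, and reports every URL the Googlebot user agent may not fetch. The site address is a placeholder, and the script assumes a single sitemap at /sitemap.xml rather than a sitemap index:

    import urllib.request
    import urllib.robotparser
    import xml.etree.ElementTree as ET

    SITE = "https://example.com"  # placeholder; set to the site under audit
    SITEMAP_URL = f"{SITE}/sitemap.xml"  # assumes a single urlset sitemap

    # Load and parse the live robots.txt.
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(f"{SITE}/robots.txt")
    rp.read()

    # Fetch the sitemap and collect its <loc> entries.
    with urllib.request.urlopen(SITEMAP_URL) as resp:
        tree = ET.fromstring(resp.read())

    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    urls = [loc.text.strip() for loc in tree.findall(".//sm:loc", ns) if loc.text]

    # Flag every sitemap URL that robots.txt blocks for Googlebot.
    blocked = [u for u in urls if not rp.can_fetch("Googlebot", u)]

    print(f"{len(blocked)} of {len(urls)} sitemap URLs are blocked by robots.txt")
    for u in blocked:
        print("BLOCKED:", u)

If the count is a handful, inspect the individual URLs; if it covers a whole section, review the matching Disallow rule first.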

About this article

This article is part of the SitemapScan blog, which covers XML sitemaps, robots.txt, crawlability, and related technical SEO topics.

FAQ

What is this article about?

Blocked by robots.txt but Listed in a Sitemap: Why the Conflict Matters explains why a URL should not be both listed in an XML sitemap and blocked by robots.txt, how that conflict typically arises, and how to audit and resolve it.

How should this article be used?

Use it as a practical audit guide: decide for each blocked sitemap URL whether the robots.txt rule or the sitemap entry is the mistake, fix that side, and then validate the live site with SitemapScan, comparing against recent public checks when helpful.
