SitemapScan

Robots Signals

Robots Signals is a public archive of user-agent declarations found in robots.txt files across recent sitemap checks. It separates search crawlers, AI bots, assistants, ads, publishers, commerce, security, and the long tail of crawler families within the current 7-day snapshot.

Snapshot window: 7 days.

What robots signals tell you

These pages do not show access-log traffic. Instead, they show which crawler families site owners explicitly mention in robots.txt. That makes the dataset useful for understanding intent: search indexing intent, AI access policy, syndication posture, monitoring behavior, platform verification, and the long tail of operational bot handling.
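To make the distinction concrete, here is a minimal sketch of what "declaration, not traffic" means in practice: reading the User-agent lines a site owner wrote into robots.txt. This is an illustrative example, not SitemapScan's actual pipeline, and the helper name `declared_agents` is hypothetical.

```python
# Hypothetical sketch: list the user-agent tokens a site declares in
# robots.txt. This extracts declarations only -- it says nothing about
# how often any of these bots actually visited the site.
def declared_agents(robots_txt: str) -> list[str]:
    agents = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if line.lower().startswith("user-agent:"):
            agent = line.split(":", 1)[1].strip()
            if agent and agent not in agents:
                agents.append(agent)
    return agents

sample = """
User-agent: Googlebot
Disallow: /private/
User-agent: GPTBot
Disallow: /
"""
print(declared_agents(sample))  # ['Googlebot', 'GPTBot']
```

A site that blocks GPTBot everywhere still "declares" it, which is exactly the intent signal the archive aggregates.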

How to use this archive

Use the overview page when you want to see the top families and the top raw user-agent lines. Use subgroup pages when you want a cleaner long-tail landing page around one crawler family, such as search crawlers, AI crawlers, assistant bots, or security bots.

Why the 7-day view matters

The 7-day window is useful when you want the freshest visible robot-family declarations in the public archive.

FAQ

What are Robots Signals on SitemapScan?

Robots Signals is a public aggregation view of which user-agent families appear in robots.txt across recent sitemap checks. It groups raw user-agent lines into search, AI, assistants, ads, publishers, monitoring, security, commerce, and other bot families.
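As a rough illustration of how raw user-agent lines might be grouped into families, the sketch below matches tokens against keyword lists. The keyword sets here are assumptions chosen for the example, not SitemapScan's actual classification rules.

```python
# Hypothetical sketch: bucket raw robots.txt user-agent tokens into
# coarse families. The keyword lists are illustrative assumptions only.
FAMILY_KEYWORDS = {
    "search": ["googlebot", "bingbot", "duckduckbot"],
    "ai": ["gptbot", "ccbot", "claudebot"],
    "assistants": ["perplexitybot"],
    "ads": ["adsbot"],
}

def classify(agent: str) -> str:
    token = agent.lower()
    for family, keywords in FAMILY_KEYWORDS.items():
        if any(k in token for k in keywords):
            return family
    return "other"  # the long tail of crawler families

print(classify("GPTBot"))         # ai
print(classify("AdsBot-Google"))  # ads
print(classify("SomeRandomBot"))  # other
```

Anything that matches no known family falls into the long-tail "other" bucket, which mirrors how the archive surfaces less common crawler families.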

Does Robots Signals measure real crawler traffic?

No. It measures robots.txt declarations found during public checks. It tells you which bots are mentioned, not how often they actually visited a site.

What does the 7-day window change?

It changes the public archive window used for aggregation. A shorter window shows fresher behavior, while a longer one shows more stable patterns and a broader long tail of agents.

Open the live interactive Robots Signals view