Must Read: Google Robots.txt Parsing Weirdness

By Angsuman Chakraborty, Gaea News Network
Monday, July 7, 2008

Google's search bot (Googlebot) parses the robots.txt file to find excluded sections of a site, like any well-behaved web bot. Unlike other bots, however, Googlebot behaves unexpectedly when you have a section for all bots (*) as well as a section specifically for Googlebot.

Let’s say you have a section in robots.txt for all bots, beginning with:
User-Agent: *
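A complete section of this kind might look like the following (the /private/ path is purely illustrative, not from the original post):

User-Agent: *
Disallow: /private/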

Let’s also assume that you do not have any section specifically targeting Googlebot. In this case Googlebot complies with the global directives (those applicable to all bots). Now add a section specifically for Googlebot, like this:
User-Agent: Googlebot
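Such a section carries its own directives, for instance (again, the /tmp/ path is purely illustrative):

User-Agent: Googlebot
Disallow: /tmp/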

I assumed that Googlebot would comply with both the directives for all bots (*) and the directives specifically for Googlebot. According to the robots.txt checker in Google Webmaster Tools, however, in such a case Googlebot complies only with the directives specifically targeted at Googlebot and ignores the global directives for all bots (*), even when the two sets of directives do not overlap.
Other bots, including Google's other crawlers, do not demonstrate this idiosyncrasy.
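You can observe the same resolution rule locally with Python's standard-library urllib.robotparser, which likewise obeys only the most specific matching User-Agent group. Below is a minimal sketch; the robots.txt content and the example.com URLs are hypothetical:

from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt combining a global section and a Googlebot section.
robots_txt = """\
User-Agent: *
Disallow: /private/

User-Agent: Googlebot
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Googlebot matches its own section, so the global /private/ rule is ignored.
print(parser.can_fetch("Googlebot", "http://example.com/private/page"))  # True
print(parser.can_fetch("Googlebot", "http://example.com/tmp/page"))      # False

# A bot with no dedicated section falls back to the global (*) section.
print(parser.can_fetch("SomeOtherBot", "http://example.com/private/page"))  # False

The output (True, False, False) mirrors what the Webmaster Tools checker reports: the Googlebot-specific section replaces the global section entirely instead of supplementing it.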

Discussion
July 27, 2010: 7:27 am

I really enjoy this blog. I find it to be refreshing and very informative. I wish there were more blogs like it.

July 7, 2008: 11:44 am

For anyone interested, here is the user guide for robots.txt:

https://www.robotstxt.org/robotstxt.html
