Must Read: Google Robots.txt Parsing Weirdness
By Angsuman Chakraborty, Gaea News Network. Monday, July 7, 2008
Google's search bot (Googlebot) parses the robots.txt file to find excluded sections of a site, like any well-behaved web bot. However, Googlebot behaves differently from other bots when robots.txt contains both a section for all bots (*) and a section specifically for Googlebot.
Let’s say you have a section in robots.txt for all bots beginning with:
User-Agent: *
Let’s also assume that you do not have any section specifically targeting Googlebot. In this case Googlebot complies with the global directives (those applicable to all bots). Now, however, add a section specifically for Googlebot, like this:
User-Agent: Googlebot
I assumed that it would comply with both the directives for all bots (*) and the directives specifically for Googlebot. But according to the robots.txt checker in Google Webmaster Tools, in such a case Googlebot complies only with the directives specifically targeted at Googlebot and ignores the global directives for all bots (*), even when the two sets of directives do not overlap.
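To make the pitfall concrete, here is a minimal robots.txt sketch; the /private/ and /archives/ paths are hypothetical, chosen purely for illustration:

User-Agent: *
Disallow: /private/     # global rule, intended for every bot

User-Agent: Googlebot
Disallow: /archives/    # Googlebot-specific rule

Based on the behavior reported by the Webmaster Tools checker, Googlebot would obey only Disallow: /archives/ and would still crawl /private/, even though the two rules never conflict.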
Other bots, including Google's other crawlers, do not exhibit this idiosyncrasy.
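Given this behavior, the safe workaround is to duplicate every global directive inside the Googlebot section, so that nothing is silently dropped when the specific section takes precedence (same hypothetical paths as above):

User-Agent: *
Disallow: /private/

User-Agent: Googlebot
Disallow: /private/     # repeated from the global section
Disallow: /archives/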
July 27, 2010: 7:27 am
I really enjoy this blog. I find it to be refreshing and very informative. I wish there were more blogs like it.
July 22, 2008: 2:54 am
One more reference, from Google Webmaster Tools: https://www.google.com/support/webmasters/bin/answer.py?answer=40360&query=robot&topic=&type=
Nino Natividad