Must Read: Google Robots.txt Parsing Weirdness
By Angsuman Chakraborty, Gaea News Network. Monday, July 7, 2008
Google's search bot (Googlebot) parses the robots.txt file to find excluded sections of a site, like any well-behaved web bot. However, Googlebot behaves differently from other bots when robots.txt contains both a section for all bots (*) and a section specifically for Googlebot.
Let’s say you have a section in robots.txt for all bots beginning with:
User-Agent: *
Let’s also assume that you do not have any section specifically targeting Googlebot. In this case Googlebot complies with the global directives (those applicable to all bots). Now, however, add a section specifically for Googlebot, like this:
User-Agent: Googlebot
I assumed that it would comply with both the directives for all bots (*) and the directives specifically for Googlebot. But according to the robots.txt checker in Google Webmaster Tools, in such a case Googlebot complies only with the directives specifically targeted at Googlebot and ignores the global directives for all bots (*), even when the two sets of directives do not overlap.
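To make the pitfall concrete, here is a minimal robots.txt sketch; the /private/ and /archives/ paths are hypothetical, chosen purely for illustration:

User-Agent: *
Disallow: /private/     # global rule, intended for every bot

User-Agent: Googlebot
Disallow: /archives/    # Googlebot-specific rule

Based on the behavior reported by the Webmaster Tools checker, Googlebot would obey only Disallow: /archives/ and would still crawl /private/, even though the two rules never conflict.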
Other bots, including Google's other crawlers, do not exhibit this idiosyncrasy.
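Given this behavior, the safe workaround is to duplicate every global directive inside the Googlebot section, so that nothing is silently dropped when the specific section takes precedence (same hypothetical paths as above):

User-Agent: *
Disallow: /private/

User-Agent: Googlebot
Disallow: /private/     # repeated from the global section
Disallow: /archives/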
July 27, 2010: 7:27 am
I really enjoy this blog. I find it to be refreshing and very informative. I wish there were more blogs like it.
July 22, 2008: 2:54 am
One more reference, from Google Webmaster Tools: https://www.google.com/support/webmasters/bin/answer.py?answer=40360&query=robot&topic=&type=
Nino Natividad