[thelist] Identify a Web Crawler's request
J Nicholas Tolson
jtnt at mindspring.com
Tue Jul 6 12:39:09 CDT 2004
On 7/6/04 8:55 AM, "David Travis" <dwork at macam.ac.il> wrote:
> Now to the question: How do I identify a request from a web-crawler? Is
> there a standard header in the HTTP Request to check? I am particularly
> interested in Google's headers since it is most popular.
>
> Thanks in advance,
> David.
>
Sorry for the previous incomplete and incorrectly formatted post; my fingers
hit the wrong key combo.
There is no dedicated header for crawlers; they identify themselves in the
User-Agent request header. The Googlebot user-agent string seems to be in
flux, but looking for the text "Googlebot" in it should single out their
bot.
See here for more info on the topic:
http://www.markcarey.com/googleguy-says/archives/googlebot-useragent-change.html
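
As a rough sketch of the "Googlebot" check (Python, purely for illustration;
the function name is made up, and how you pull the User-Agent value out of
the request depends on your server setup):

```python
def is_googlebot(user_agent):
    """Return True if the User-Agent header value mentions Googlebot.

    `user_agent` is the raw header value, or None if the client sent none.
    """
    # Case-insensitive substring match, so the check keeps working
    # even if the rest of the user-agent string changes.
    return "googlebot" in (user_agent or "").lower()


# Sample strings, used here only for illustration.
print(is_googlebot("Googlebot/2.1 (+http://www.google.com/bot.html)"))
print(is_googlebot("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))
```

Keep in mind that any client can send whatever User-Agent it likes, so this
only tells you what the request claims to be.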
Nicholas