[thelist] Identify a Web Crawler's request

Manuel González Noriega manuel at simplelogica.net
Tue Jul 6 09:57:47 CDT 2004


On Tue, 06-07-2004 at 12:56, David Travis wrote:
> Hi All,
> 
> Interesting question.
> 
> I am working on a site that requires IE6. To keep users of other browsers
> from accessing the site, I wrote a filter that checks the user agent string
> and redirects the user to an upgrade-your-browser page. This redirection
> also sends requests from web crawlers (search engines) to that page.

Just a little correction: if I visited your site with Firefox 0.9, I'd
be redirected to a downgrade-your-browser page :-)

> 
> Now to the question: How do I identify a request from a web-crawler? Is
> there a standard header in the HTTP Request to check? I am particularly
> interested in Google's headers since it is the most popular.

This will help:

http://www.robotstxt.org/wc/active/html/googlebot.html
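
As far as I know there is no dedicated HTTP header for crawlers; the usual
approach is to look for a known token in the User-Agent string before
applying the browser filter. A rough sketch in Python, assuming your filter
can read that header. The function names and the token list are made up for
illustration; only "Googlebot" is the token Google's crawler is known to send.

```python
# Substrings to look for in the User-Agent header (matched case-insensitively).
# "googlebot" identifies Google's crawler; the other entries are illustrative
# assumptions and should be checked against each crawler's documentation.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")


def is_crawler(user_agent):
    """Return True if the User-Agent string contains a known crawler token."""
    if not user_agent:
        return False
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def should_redirect(user_agent):
    """Hypothetical filter: redirect non-IE6 browsers, but let crawlers through."""
    if is_crawler(user_agent):
        return False
    return "MSIE 6" not in (user_agent or "")
```

Keep in mind that the User-Agent string is sent by the client and can be
spoofed, so this is a convenience check, not an access control.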

-- 
   Manuel works for Simplelógica, web development
(+34) 985 22 12 65            http://simplelogica.net
writes at Logicola http://simplelogica.net/logicola/


