[thelist] Identify a Web Crawler's request
Manuel González Noriega
manuel at simplelogica.net
Tue Jul 6 09:57:47 CDT 2004
On Tue, 06-07-2004 at 12:56, David Travis wrote:
> Hi All,
>
> Interesting question.
>
> I am working on a site that requires IE6. To keep users of other browsers
> from accessing the site, I wrote a filter that checks the user agent string
> and redirects the user to an upgrade-your-browser page. This redirection
> also sends requests from web crawlers (search engines) to that page.
Just a little correction: if I visited your site with Firefox 0.9, I'd
be redirected to a downgrade-your-browser page :-)
>
> Now to the question: How do I identify a request from a web crawler? Is
> there a standard header in the HTTP request to check? I am particularly
> interested in Google's headers, since it is the most popular.
This should help:
http://www.robotstxt.org/wc/active/html/googlebot.html
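
As far as I know there is no dedicated HTTP header that says "I am a
crawler"; robots normally identify themselves in the User-Agent header,
so the same filter that checks for IE6 can let them through. Below is a
minimal sketch in Python, assuming the filter receives the raw User-Agent
string. The function names and the token list are my own illustration and
not part of any library, and since the header can be spoofed this is only
a hint, not proof.

```python
# Hypothetical list of lowercase substrings that crawlers put in their
# User-Agent header. "googlebot" is Google's; the rest are assumptions
# to extend or trim as needed.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")


def is_crawler(user_agent):
    """Return True if the User-Agent string contains a known crawler token."""
    if not user_agent:
        return False
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def should_redirect(user_agent):
    """Decide whether to send the client to the upgrade-your-browser page.

    Crawlers are never redirected; everything else is redirected unless
    the User-Agent string mentions MSIE 6.
    """
    if is_crawler(user_agent):
        return False
    return "msie 6" not in (user_agent or "").lower()
```

With this, a request whose User-Agent contains "Googlebot" reaches the
real pages, an IE6 request passes as before, and anything else still gets
the redirect.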
--
Manuel works for Simplelógica, web development
(+34) 985 22 12 65 http://simplelogica.net
writes at Logicola http://simplelogica.net/logicola/