[thelist] Re: Identify a Web Crawler's request

David Travis dwork at macam.ac.il
Wed Jul 7 02:53:35 CDT 2004

Hi Michael,

Thanks for your reply. In my case it is simply useless to browse the site
unless your browser meets the minimum requirements, since the site relies
heavily on the client's browser.


-----Original Message-----
From: thelist-bounces at lists.evolt.org
[mailto:thelist-bounces at lists.evolt.org] On Behalf Of Michael Harrington
Sent: Wednesday, July 07, 2004 7:57 AM
To: thelist at lists.evolt.org
Subject: [thelist] Re: Identify a Web Crawler's request

What about putting <meta NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> on the
browser upgrade page?

This works for me with all the search engines. I use browser detection
for IE below 5.0 and Netscape below 6.0. Visitors whose browser doesn't meet
this get a page telling them they should upgrade their browser. But I do
allow them to go ahead and try to view my site with the browser they are

thelist-request at lists.evolt.org wrote:

> Date: Tue, 6 Jul 2004 14:55:40 +0200
> From: "David Travis" <dwork at macam.ac.il>
> To: <thelist at lists.evolt.org>
> Subject: [thelist] Identify a Web Crawler's request
> Message: 3
> Hi All,
> Interesting question.
> I am working on a site that requires IE6. In order to prevent users who
> work with other browsers from accessing the site I wrote some kind of
> to check the user agent string, and redirect the user to an
> upgrade-your-browser page. This redirection also causes requests from
> web-crawlers (search engines) to be redirected to this page.
> The site contains a lot of content that I want added to the search
> engines' indexes.
> Now to the question: How do I identify a request from a web-crawler? Is
> there a standard header in the HTTP Request to check? I am particularly
> interested in Google's headers, since it is the most popular.
> Thanks in advance,
> David.
> ------------------------------
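
For reference, there is no dedicated HTTP header that marks a request as coming from a crawler; the usual approach is to look at the same User-Agent string the redirect already checks. The following is only a minimal sketch of that idea in Python, not the code used on either site. The function names and the token list are hypothetical and far from exhaustive, the "MSIE 6" test merely stands in for the IE6 requirement described above, and a User-Agent string can be spoofed, so this should not be treated as a security check.

```python
# Illustrative, non-exhaustive list of substrings found in crawler
# User-Agent strings (assumption: extend it for the engines you care about).
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")


def is_crawler(user_agent):
    """Return True if the User-Agent contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def needs_upgrade_page(user_agent):
    """Decide whether to redirect to the upgrade-your-browser page."""
    if is_crawler(user_agent):
        # Let search engines through so the real content gets indexed.
        return False
    # Stand-in for the site's IE6 requirement (assumption).
    return "MSIE 6" not in (user_agent or "")
```

With this arrangement a crawler is never sent to the upgrade page, while a browser that fails the check still is.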
