[thelist] Identify a Web Crawler's request

David Travis dwork at macam.ac.il
Tue Jul 6 15:41:53 CDT 2004


Hi Josh,

Thanks for your reply. Since this site will serve a specific set of users,
I can afford to limit their browser choice and, in return, offer them
services that are available only to IE6 users.

Thank you,
David.

-----Original Message-----
From: thelist-bounces at lists.evolt.org
[mailto:thelist-bounces at lists.evolt.org] On Behalf Of Feingold Josh S
Sent: Tuesday, July 06, 2004 7:35 PM
To: 'thelist at lists.evolt.org'
Subject: RE: [thelist] Identify a Web Crawler's request

David -

You will want to check the "User-Agent" header in the HTTP request for
something like *google*; Google's crawler identifies itself as Googlebot.
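
As a rough illustration of that check, here is a minimal Python sketch. The
function name is_crawler and the CRAWLER_TOKENS tuple are made up for this
example, and the sample strings in the usage note are typical values rather
than anything taken from this thread. Keep in mind that any client can send
any User-Agent it likes, so this is a hint, not proof of who is asking.

```python
# Substrings to look for in the User-Agent header. "google" follows the
# suggestion above; the tuple is a hypothetical starting point to extend.
CRAWLER_TOKENS = ("google",)


def is_crawler(user_agent):
    """Return True if the User-Agent value contains a known crawler token."""
    if not user_agent:
        # Header missing or empty: nothing to match against.
        return False
    ua = user_agent.lower()  # compare case-insensitively
    return any(token in ua for token in CRAWLER_TOKENS)
```

For example, is_crawler("Googlebot/2.1 (+http://www.google.com/bot.html)")
is true, while an IE6 string such as
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" is not matched.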

On a side note, why are you requiring IE6? Unless you have a business need,
I personally don't think it is best practice to limit your users' browser
selection.

Josh



-----Original Message-----
From: thelist-bounces at lists.evolt.org
[mailto:thelist-bounces at lists.evolt.org] On Behalf Of David Travis
Sent: Tuesday, July 06, 2004 8:56 AM
To: thelist at lists.evolt.org
Subject: [thelist] Identify a Web Crawler's request


Hi All,

Interesting question.

I am working on a site that requires IE6. To keep users of other browsers
from accessing the site, I wrote a filter that checks the user agent string
and redirects the user to an upgrade-your-browser page. This redirection
also causes requests from web crawlers (search engines) to be redirected to
this page.
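
The actual filter is not shown here, so the following Python sketch is only
a guess at its general shape, meant to show why crawlers end up on the
upgrade page. The names filter_request and UPGRADE_PAGE, the path, and the
"MSIE 6" substring test are all assumptions made for illustration.

```python
# Hypothetical location of the upgrade-your-browser page.
UPGRADE_PAGE = "/upgrade-your-browser"


def filter_request(user_agent, requested_path):
    """Return the path to serve for a request with this User-Agent value."""
    # Assumption: IE6 is recognised by the "MSIE 6" token in its User-Agent.
    if "MSIE 6" in (user_agent or ""):
        return requested_path
    # Everything else, crawlers included, is sent to the upgrade page.
    return UPGRADE_PAGE
```

A crawler's User-Agent does not contain the IE6 token, so a call such as
filter_request("Googlebot/2.1 (+http://www.google.com/bot.html)", "/docs")
falls through to the upgrade page and the real content is never indexed.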

The site contains a lot of content that I want added to the search
engines' indexes.

Now to the question: How do I identify a request from a web crawler? Is
there a standard header in the HTTP request to check? I am particularly
interested in Google's headers, since it is the most popular search engine.

Thanks in advance,
David.


-- 
* * Please support the community that supports you.  * *
http://evolt.org/help_support_evolt/

For unsubscribe and other options, including the Tip Harvester 
and archives of thelist go to: http://lists.evolt.org 
Workers of the Web, evolt ! 