[thelist] Identify a Web Crawler's request

David Travis dwork at macam.ac.il
Tue Jul 6 15:51:55 CDT 2004


Hi Sarah,

This site is for a specific academic use, and a lot of MS JavaScript is
used there. I am afraid people have to accept that some users browse
with Internet Explorer, and that some developers use MS proprietary
objects in their sites.

Why don't we use other solutions? Well... start with the fact that I am VERY
disappointed with Mozilla's Hebrew support (what can I do?)...

Good day,
David.


-----Original Message-----
From: thelist-bounces at lists.evolt.org
[mailto:thelist-bounces at lists.evolt.org] On Behalf Of Sarah Sweeney
Sent: Tuesday, July 06, 2004 7:37 PM
To: thelist at lists.evolt.org
Subject: Re: [thelist] Identify a Web Crawler's request

David Travis wrote:
> Hi All,
> 
> Interesting question.
> 
> I am working on a site, which requires IE6. In order to prevent users who
> work with other browsers from accessing the site I wrote some kind of
> filter to check the user agent string, and redirect the user to an
> upgrade-your-browser page. This redirection also causes requests from
> web-crawlers (search engines) to be redirected to this page.
> 
> The site contains a lot of content, which I want to be added to the search
> engines' indexes.
> 
> Now to the question: How do I identify a request from a web-crawler? Is
> there a standard header in the HTTP Request to check? I am particularly
> interested in Google's headers since it is most popular.
> 
> Thanks in advance,
> David.

Your question about identifying search engine spiders is a good one but 
AFAIK there is no cut-and-dried way to identify spiders, as they each have 
a unique user agent string. But I fear you will not be happy with my or 
other replies to your question, as the reason for the question is 
somewhat baffling (and will probably annoy the heck out of lots of 
listers). It raises the question: why is IE6 required for your site? I'm 
guessing it is not a corporate intranet, or you would likely want to 
avoid having it crawled. I'm giving you the benefit of the doubt in hoping 
that perhaps you have a good reason for shutting out all other browsers 
from the site (though I can't think of any good reasons myself), but 
don't be surprised if you get flamed about this :)

Also, out of curiosity: what do non-IE6 visitors to the site see?
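
For what it's worth, here is a rough sketch of the user-agent approach
discussed above. It assumes Python on the server (the thread does not say
what the original filter is written in), and the function names and the
token list are hypothetical and far from complete. Googlebot, Yahoo's Slurp
and msnbot do identify themselves in the User-Agent header, but the header
is trivially spoofed, so treat this as a convenience and not as a security
check.

```python
# Illustrative, incomplete list of substrings that some well-known crawlers
# include in their User-Agent header (assumption: extend it as needed).
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")


def is_crawler(user_agent):
    """Return True if the User-Agent string contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def should_redirect(user_agent):
    """Hypothetical filter: redirect non-IE6 browsers, but let crawlers through."""
    if is_crawler(user_agent):
        return False
    # IE6 sends "MSIE 6.0" in its User-Agent string.
    return "msie 6" not in (user_agent or "").lower()
```

The crawler check runs before the browser check, so a spider is never sent
to the upgrade-your-browser page even though it is not IE6.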

-- 
Sarah Sweeney
Web Developer & Programmer
Portfolio :: http://sarah.designshift.com
Blog, etc :: http://hardedge.ca