[thelist] Identify a Web Crawler's request
Feingold Josh S
Josh.S.Feingold at irs.gov
Tue Jul 6 12:34:41 CDT 2004
David -
You will want to check the "User-Agent" header in the HTTP request for
something like *google*. Google's crawler identifies itself as "Googlebot".
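As a rough illustration of that check, here is a minimal sketch in Python. I don't know what your filter is written in, so treat it as pseudocode for whatever you use. The function names and the token list are made up for this example, and the "MSIE 6" test is only my guess at what your existing filter looks for. Keep in mind that anyone can fake a User-Agent string, so this only identifies well-behaved crawlers.

```python
# Hypothetical token list: lowercase substrings that mark a crawler.
# "googlebot" appears in Google's crawler User-Agent; add others as needed.
CRAWLER_TOKENS = ("googlebot",)


def is_crawler(user_agent):
    """Return True if the User-Agent header contains a known crawler token."""
    ua = (user_agent or "").lower()  # the header may be missing entirely
    return any(token in ua for token in CRAWLER_TOKENS)


def needs_upgrade_page(user_agent):
    """Return True if the request should go to the upgrade-your-browser page."""
    if is_crawler(user_agent):
        return False  # let crawlers through so the content gets indexed
    # Assumed browser test: IE6 sends "MSIE 6.0" in its User-Agent string.
    return "msie 6" not in (user_agent or "").lower()
```

With this in place, a request from "Googlebot/2.1 (+http://www.google.com/bot.html)" skips the redirect, while a non-IE6 browser is still sent to the upgrade page.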
On a side note, why are you requiring IE6? Unless you have a business need,
I personally don't think it is best practice to limit your users' browser
selection.
Josh
-----Original Message-----
From: thelist-bounces at lists.evolt.org
[mailto:thelist-bounces at lists.evolt.org] On Behalf Of David Travis
Sent: Tuesday, July 06, 2004 8:56 AM
To: thelist at lists.evolt.org
Subject: [thelist] Identify a Web Crawler's request
Hi All,
Interesting question.
I am working on a site that requires IE6. To keep users of other browsers
from accessing the site, I wrote a filter that checks the user agent string
and redirects them to an upgrade-your-browser page. This redirection also
catches requests from web crawlers (search engines), which get sent to the
same page.
The site contains a lot of content, which I want to be added to the search
engines' indexes.
Now to the question: How do I identify a request from a web crawler? Is
there a standard header in the HTTP request to check? I am particularly
interested in Google's headers, since it is the most popular.
Thanks in advance,
David.