[thelist] spiders spoofing MSIE
Darren Beale
bealers_lists at exponetic.com
Wed Apr 14 09:04:39 CDT 2004
Hi
I'm working on a project where one subtask is to filter out automated
agents by their USER_AGENT string. I've done a fair bit of research and
have come up with a few good lists of bot USER_AGENT strings, but I want
to ensure that the data is as accurate as possible. I particularly want
to make sure that I'm not missing any agents that pretend to be MSIE, as
these are much harder to pick up using regexps.
The following is what I've been able to glean from Googling. Does anyone
know of any more?
-----------------8<-----------------
Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; Girafabot; girafabot at
girafa dot com; http://www.girafa.com)
Mozilla/4.0 (compatible; MSIE 5.0; Windows 95) VoilaBot BETA 1.2
(http://www.voila.com/)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; Girafabot; girafabot
at girafa dot com; http://www.girafa.com)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; ODP links test;
http://tuezilla.de/test-odp-links-agent.html)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Link Checker Pro
3.1.52, http://www.Link-Checker-Pro.com)
Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) WebWasher 3.3
Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) / StripIt 0.4
Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) RPT-HTTPClient/0.3-3E
Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt; Find LinkChecker
Web Crawler Spider Gatherer)
Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt; DTS Agent
Mozilla/4.0 (compatible; MSIE 5.5; AOL 7.0; Windows 95; sureseeker.com)
----------------->8-----------------
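To illustrate the kind of check I have in mind, here is a rough Python sketch. It assumes the filter works from a hand-maintained list of tokens; BOT_TOKENS and is_spoofing_bot are made-up names for illustration, and the tokens are simply lifted from the strings above, so the list is certainly not complete.

```python
import re

# Hypothetical token list taken from the user agent strings quoted above.
# It is an assumption for illustration, not a complete or vetted list.
BOT_TOKENS = [
    "Girafabot",
    "VoilaBot",
    "ODP links test",
    "Link Checker Pro",
    "WebWasher",
    "StripIt",
    "RPT-HTTPClient",
    "LinkChecker",
    "DTS Agent",
    "sureseeker.com",
]

# One case-insensitive alternation; re.escape keeps dots and hyphens literal.
BOT_RE = re.compile("|".join(re.escape(t) for t in BOT_TOKENS), re.IGNORECASE)


def is_spoofing_bot(user_agent: str) -> bool:
    """Return True if the string claims MSIE and also carries a known token."""
    return "MSIE" in user_agent and BOT_RE.search(user_agent) is not None
```

A string such as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" carries none of the tokens and so passes as a browser, which is exactly why the token list needs to be as complete as possible.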
Many thanks in advance,
Darren Beale