[thelist] Spider Attack

Robert Gormley robert at pennyonthesidewalk.com
Tue Mar 15 23:20:34 CST 2005

BMP wrote:

>I need someone to advise me about preventing robot spiders from using
>excess bandwidth on my website. I know something about HTML, JS but
>Perl, etc. are not known. I have a robots.txt file, but I am not sure
>if the spiders searching my site are friendly or not, so I am
>hesitating about excluding them until I can get more information about
>them. The biggest abuser seems to be Mozilla Gecko. Any help would be
>greatly appreciated.

If you can read the log files, check whether each user agent/IP
combination requests /robots.txt as its first GET. That is a simple way
to tell whether it's a) a robot, and probably a well-behaved one, or
b) a human user or an ill-behaved robot.
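
A rough sketch of that check in Python, in case it helps. It assumes the
server writes the common "combined" access log format and that lines are in
chronological order; the names LOG_LINE, first_requests and classify are
made up for this example, and the pattern will need adjusting if your logs
look different.

```python
import re

# Assumed layout (combined log format):
# IP ident user [time] "METHOD path protocol" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def first_requests(lines):
    """Map each (ip, user agent) pair to the path of its first GET."""
    first = {}
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None or m.group("method") != "GET":
            continue  # skip non-GET requests and lines that do not fit
        key = (m.group("ip"), m.group("agent"))
        # setdefault keeps the earliest path seen for this client
        first.setdefault(key, m.group("path").split("?")[0])
    return first


def classify(lines):
    """Split clients by whether their first GET was /robots.txt."""
    asked_first, did_not = [], []
    for key, path in first_requests(lines).items():
        (asked_first if path == "/robots.txt" else did_not).append(key)
    return asked_first, did_not
```

To try it on a real log, something like
`with open("access.log") as f: print(classify(f))` should do (the file name
is a placeholder). Anything in the second list is either a person with a
browser or a robot that skipped robots.txt, so it is a starting point for a
closer look rather than a verdict.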

