[thelist] Spider Attack

Robert Gormley robert at pennyonthesidewalk.com
Tue Mar 15 23:20:34 CST 2005


BMP wrote:

>Hi
> 
>I need advice on preventing robot spiders from using excess bandwidth on my website. I know some HTML and JS, but I do not know Perl, etc. I have a robots.txt file, but I am not sure whether the spiders crawling my site are friendly, so I am hesitant to exclude them until I have more information about them. The biggest abuser seems to be Mozilla Gecko. Any help would be greatly appreciated.
>
If you can read the log files, check whether each user agent/IP 
combination requests /robots.txt as its first GET. That is a simple way 
to tell whether it is a) a robot, probably well-behaved, or b) a human 
user or an ill-behaved robot.
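
For anyone who would rather script that check than read the log by eye, 
here is a rough sketch in Python. It assumes the server writes the 
Apache/NCSA "combined" log format; that is a guess about the setup, so 
the pattern will need adjusting for other formats. The names LOG_LINE, 
first_requests and classify are made up for this illustration.

```python
import re

# Assumption: Apache/NCSA "combined" log format, e.g.
# 192.0.2.1 - - [15/Mar/2005:10:00:00 -0600] "GET / HTTP/1.0" 200 68 "-" "Agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def first_requests(lines):
    """Map each (IP, user agent) pair to its first (method, path) in the log."""
    first = {}
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not fit the assumed format
        key = (m.group("ip"), m.group("agent"))
        # setdefault keeps only the earliest request seen for this pair
        first.setdefault(key, (m.group("method"), m.group("path")))
    return first


def classify(lines):
    """Label each (IP, user agent) pair by whether it asked for /robots.txt first."""
    labels = {}
    for key, (method, path) in first_requests(lines).items():
        polite = method == "GET" and path.split("?")[0] == "/robots.txt"
        labels[key] = (
            "probably a well-behaved robot" if polite
            else "user or ill-behaved robot"
        )
    return labels
```

To try it, open the access log (its location depends on the host) and 
pass the file object to classify(), then print the pairs that come back. 
Note that this only looks at the first request per pair within the file 
given, so a robot that fetched /robots.txt before the log was rotated 
will be mislabelled.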

Robert


