[thelist] Spider Attack
Robert Gormley
robert at pennyonthesidewalk.com
Tue Mar 15 23:20:34 CST 2005
BMP wrote:
>Hi
>
>I need advice on preventing robot spiders from using excess bandwidth on my website. I know some HTML and JS, but not Perl or similar languages. I have a robots.txt file, but I am not sure whether the spiders crawling my site are friendly, so I am hesitant to exclude them until I can find out more about them. The biggest abuser seems to be Mozilla Gecko. Any help would be greatly appreciated.
>
If you can read the log files, check whether each user agent/IP
combination requests /robots.txt as its first GET. That is a simple way
to tell whether it is a) a robot, and probably a well-behaved one, or
b) a human user or an ill-behaved robot.
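
The log check described above could be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the server writes the Apache/NCSA "combined" log format, the lines are in chronological order, and the names LOG_LINE, first_requests and classify are made up for this example. Adjust the pattern if your logs look different.

```python
import re

# Assumption: Apache/NCSA "combined" log format, for example
# 1.2.3.4 - - [15/Mar/2005:23:00:00 -0600] "GET /robots.txt HTTP/1.0" 200 120 "-" "ExampleBot/1.0"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def first_requests(lines):
    """Map each (ip, user agent) pair to the path of its first logged request."""
    first = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        key = (match["ip"], match["agent"])
        # setdefault keeps only the earliest path seen for this client
        first.setdefault(key, match["path"])
    return first


def classify(lines):
    """Label each client by whether its first request was for /robots.txt."""
    labels = {}
    for key, path in first_requests(lines).items():
        if path == "/robots.txt":
            labels[key] = "robot, probably well-behaved"
        else:
            labels[key] = "user or ill-behaved robot"
    return labels
```

Passing an open log file (or any iterable of log lines) to classify() returns a dictionary keyed by (IP, user agent). The result is only a first guess: a browser can fetch /robots.txt, and a polite robot may already have it cached from an earlier visit that is not in the log you are reading.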
Robert