[thelist] robots.txt Fwd: 404 recorded at WorldFolk.com

Joshua Olson joshua at waetech.com
Mon Feb 2 07:23:13 CST 2004

> -----Original Message-----
> From: John C Bullas (soton.ac.uk relay)
> Sent: Monday, February 02, 2004 7:45 AM
> Q1: Does robots.txt provide a list of links to follow where they do not
> explicitly exist in the html on the site
> or just allow or disallow directories or file extensions?


No, it provides no list of links to follow.  It only SUGGESTS which
directories and files robots should not enter.  Robots may or may not follow
this suggestion.

> Q2: are there any advantages in having robots.txt when all my
> pages can be found by following hyperlinks

If you'd prefer not to have robots index an area (perhaps it's cyclic or has
a massive number of pages because of its dynamic nature) then you put a
Disallow entry for it in the robots.txt file.
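
For illustration, here is a rough sketch of such an entry and of how a
well-behaved robot would read it, using Python's standard urllib.robotparser
module.  The /calendar/ path and the "ExampleBot" agent name are made up for
the example; they are not from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /calendar/ stands in for a cyclic or dynamic
# area that robots are asked to stay out of.
ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar/
"""


def allowed(path, agent="ExampleBot"):
    """Return True if a polite robot named `agent` may fetch `path`."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


print(allowed("/index.html"))         # outside the disallowed area
print(allowed("/calendar/2004/02/"))  # inside the disallowed area
```

Note that the check happens entirely on the robot's side.  Nothing on the
server enforces it, which is why the file is only a suggestion.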

Side Note: I had a client once who wanted to keep spammers off their email
addresses (on the "contact us" page) by using the robots.txt file only.  I
made numerous other suggestions, but the end result was that I did just what
the client wanted.  That was a year ago.  Up until MyDoom hit they didn't
get any spam on the accounts.  Considering that unscrupulous spambots would
most likely NOT follow the commands in robots.txt, this is remarkable, IMO.

Joshua Olson
Web Application Engineer
WAE Tech Inc.
