[thelist] Robots.txt, the robots meta tag, and copyright references needed.
aardvark
roselli at earthlink.net
Mon Jan 14 16:55:28 CST 2002
> From: April <april at farstrider.org>
[...]
> Here is their chain of thought, where they got it I don't know:
> 1. Having a robots.txt prevents people from copying our information
false... robots.txt files are there to guide *good* spiders who pay
attention to them... *anyone* can ignore a robots.txt file, and
everyone who steals a site or harvests email addresses does just
that...
> on our websites. 2. If we don't have a robots.txt disallowing all
> access, we are giving people a legal right to take our information. 3.
false... lack of a robots.txt file in no way supersedes your copyright
over your own content... it has no bearing in copyright dispute...
> Besides that, the robots.txt physically prevents all web spiders from
> accessing our site. 4. We should contact search engines and tell them
false... all spiders can ignore them... it's easy to prove, too... they
are, after all, just text files...
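
a rough sketch of that proof, in Python, using the standard library's
urllib.robotparser... the robots.txt contents, the "GoodBot" name, the
example.com URL and both helper functions are made up for illustration...
the point is only that the check lives in the spider's own code, so a
spider that skips the check is stopped by nothing:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows everything for every agent.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def polite_may_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """What a well-behaved spider does: parse the file and obey it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def rude_may_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """What a harvester does: never look at the file at all."""
    return True


url = "http://example.com/index.html"
print("polite spider fetches:", polite_may_fetch(ROBOTS_TXT, "GoodBot", url))
print("rude spider fetches:  ", rude_may_fetch(ROBOTS_TXT, "BadBot", url))
```

the server plays no part in either decision... it would hand the page to
both spiders if asked...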
> our keywords... It might take a bit of following up, but that's what
> I'm for. (Gods, I can see that email now... Dear Google...) 5.
huh? that's a new one... IOW, nope, it'll never work... search
engines that rely on spiders will *only* index a site, not an email or
a phone call... ranking is based on more than keywords anyway...
> Since I'm so difficult, they have found a way to add a NOFOLLOW robots
> meta tag to the front page, so search engines can read that... no, we
> can't take down the robots.txt and put robots meta tags on other
> pages.
ok, that meta tag has now told the *good* search engines not to
follow any links off the front page, so they won't index the rest of
the site, but the email spam harvesters and other things will just
ignore it...
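
same story as robots.txt: the tag is only a request sitting in the
markup... a minimal sketch of how a cooperating spider might read it,
using Python's html.parser... the class name, the helper and the sample
front page are all hypothetical, not anyone's real crawler:

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated and case insensitive.
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical front page carrying the NOFOLLOW tag described above.
FRONT_PAGE = (
    '<html><head><meta name="ROBOTS" content="NOFOLLOW"></head>'
    '<body><a href="/about.html">About</a></body></html>'
)

directives = robots_directives(FRONT_PAGE)
# A cooperating spider would stop at this page; a harvester never asks.
print("follow links?", "nofollow" not in directives)
```

a harvester just pulls every href and address out of the same markup
without ever calling anything like robots_directives()...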
where did these guys get their info?
> I don't know how they decided that robots.txt's are a legal issue, but
> I don't think I can convince them otherwise without the name of an
> important person behind it. Can anyone point me to articles which
> -don't- refer to robots.txt as a security measure, and explain why
> not? And if anyone has ever seen anything about legal issues
> involving robots.txt, if such even exist, I would really love those
> links. Also, I'm looking for an article on those email harvesters
> which will use a robots.txt to choose where to index first.
[...]
never even occurred to me to archive this kinda stuff... it's just
common sense... and experience... and reading the docs...