[thelist] spammers

Warden, Matt mwarden at odyssey-design.com
Sat Nov 11 12:58:35 CST 2000


from their site:


"The first problem was the potentially bad effects that Wpoison might have on
legitimate web crawlers, such as those used by the major web search engine
companies, and the related secondary negative effects which those primary
negative effects might have on any Wpoison user site which had high hopes of
being well and truly cataloged by the major search engine companies.

This problem was trivially eliminated by including code in Wpoison which
causes each Wpoison-generated randomized web page to carry a clear indication
(for the benefit of legitimate web crawlers) that the page in question should
not be cataloged in any way. Basically, Wpoison now merely makes proper use of
the (pre-existing) Robot Exclusion Protocol. Use of this protocol (and the
associated ``off limits'' markers) within all web pages generated by Wpoison
serves to ensure both that (a) legitimate[1] web crawlers will not get all
caught up in repeatedly reading thousands (or millions) of randomized garbage
pages generated by Wpoison and that (b) the legitimate search engine companies
will still be able to successfully add your web site to their databases.

....

[footnotes]

[1] ... Legitimate web crawlers, such as those used by the major search engine
companies, do always obey the standardized and widely accepted Robot Exclusion
Protocol, and they take its use, on any given web page, as a clear and
unambiguous ``keep out'' sign. Spammers who are trawling for e-mail addresses,
on the other hand, have no incentive whatsoever to skip any web pages that
might contain valuable fresh e-mail addresses ... In fact it is this reckless
behavior that Wpoison relies upon...

It should be noted however that since the development of the first
publicly released version of Wpoison, spammers have been starting to catch
on to the fact that their own stupidity and greediness in reading all web
pages, even when they have been warned off, was in fact causing them more harm
than good. Because of this the author of Wpoison now believes that many (and
perhaps even a majority) of the spammers' address harvesting web crawlers have
now been reprogrammed so that they now do obey the standard Robot Exclusion
Protocol...

The author of Wpoison nowadays strongly advises (to all who will listen) that
all web pages containing real e-mail addresses should in fact be marked as
being ``off limits'' via the standard Robot Exclusion Protocol, both now and
into the foreseeable future."
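
To make the "off limits" idea a bit more concrete, here is a rough Python
sketch of the two checks a well-behaved crawler would typically make: the
per-page robots meta tag and the site-wide robots.txt. This is only an
illustration, not Wpoison's code. I am assuming the per-page marker is a
robots meta tag (the quote above does not say exactly which marker Wpoison
emits), and the sample page, the robots.txt rules, the "ExampleBot" name and
the example.com URLs are all made up.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                self.directives.add(item.strip().lower())


def page_allows_indexing(html):
    """Return False if the page carries a noindex (or none) robots directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return not ({"noindex", "none"} & parser.directives)


def site_allows_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


# Hypothetical inputs, made up for this example.
POISON_PAGE = (
    '<html><head><meta name="robots" content="noindex,nofollow"></head>'
    "<body>fake addresses here</body></html>"
)
ROBOTS_TXT = "User-agent: *\nDisallow: /contact/\n"

print(page_allows_indexing(POISON_PAGE))
print(site_allows_fetch(ROBOTS_TXT, "ExampleBot",
                        "http://example.com/contact/staff.html"))
print(site_allows_fetch(ROBOTS_TXT, "ExampleBot",
                        "http://example.com/index.html"))
```

A crawler that skips a page whenever either check says "no" would stay out of
the generated pages, and out of any directory where you keep real addresses,
while still cataloging the rest of the site. A harvester that ignores both
checks is the one that ends up reading the garbage.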

hth,

--
mattwarden
mattwarden.com

----- Original Message -----
From: Peter Van Dijck <peter at vardus.com>
To: <thelist at lists.evolt.org>
Sent: Saturday, November 11, 2000 10:59 AM
Subject: [thelist] spammers


> http://www.monkeys.com/wpoison/why.html
> anyone ever used this? It generates fake pages full of fake emails for the
> spam harvest robots. I just wonder will it trap search engine robots and if
> so will I be punished for that?
> Peter




