[thelist] Preventing spam spiders from reading mailto links.

Daniel Schlyder daniel at bitblaze.com
Mon Dec 8 18:13:48 CST 2003


[07.12.2003 06:13:38] Philip Weller:
> Recently, while admiring the HTML on a site I was visiting (as Web geeks
> often do) I noticed that they were hiding their e-mail address from spam
> bots by rendering it in escaped character entities.  I've since implemented
> this technique in my just launched personal site (www.philipweller.com).

> I was wondering if those of you much more experienced than myself might have
> some real-world feedback as to the effectiveness of this technique.  Are
> there any techniques you use and can recommend?  (I'm aware of Daniel
> Benjamin's Hiveware Enkoder (http://hiveware.com/enkoder_form.php) and think
> he's providing a wonderful service, but I've chosen not to use it because it
> relies on JavaScript and won't work for users who've disabled it.)

Well, I won't say that I'm experienced, but I tested five email harvesters
earlier this year. I'm pretty sure there are better ones out there, but these
are the ones I found easily when googling and searching a few download indexes.

1) Advanced Email Extractor Pro (http://www.mailutilities.com/aee/)
2) AtomPark Email Hunter 1.51 (http://www.massmailsoftware.com/extractweb/)
3) E-Mail Seeker 3.0 (http://www.e-mailseeker.com/)
4) Fast Email Extractor 4.3 (http://www.lencom.com/FEE.html)
5) Web Data Extractor 3.62 (http://www.webextractor.com/)

Here are my results (Y means the program harvested the address):

                                                    program:
                                                    1   2   3   4   5
method:
in plain text                                       Y   Y           Y
with <span>s around @                               Y
with @ as &#64;                                     Y
in mailto link                                      Y   Y   Y   Y   Y
in mailto link created through PHP redirection      Y
in mailto link created by JavaScript
in mailto link created by form processing page


As you can see from these results, Advanced Email Extractor Pro may be able to
harvest your address. Then again, perhaps it only understands the entity for
'@' and would miss a fully encoded address. If you test this, please let me
know your results.
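
To illustrate why entity encoding alone may not help much, here is a rough
sketch of what a harvester could do before searching for addresses. This is
hypothetical: I don't know how any of the programs above work internally, and
the function names and the address pattern are made up for the example.

```javascript
// Hypothetical harvester step: decode numeric character references
// (decimal and hexadecimal), then search for address-shaped strings.
function decodeNumericEntities(html) {
  return html
    .replace(/&#x([0-9a-f]+);/gi, (_, hex) => String.fromCodePoint(parseInt(hex, 16)))
    .replace(/&#(\d+);/g, (_, dec) => String.fromCodePoint(parseInt(dec, 10)));
}

// Deliberately simple pattern; real addresses can be more varied.
function findAddresses(html) {
  const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  return decodeNumericEntities(html).match(pattern) || [];
}

// Both the "@ only" and the fully encoded form decode to the same address.
console.log(findAddresses('foo&#64;bar.baz'));
console.log(findAddresses('&#102;&#111;&#111;&#64;&#98;&#97;&#114;&#46;&#98;&#97;&#122;'));
```

A harvester that decodes like this treats an entity-encoded address the same
as plain text, whether one character or the whole address is encoded.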

If you don't want to use JavaScript, you might want to use the last method I
tested. Then again, perhaps not. I think it's extremely ugly. :) Anyway,
here's the code, since I'm too tired to explain.


In your page:

<div><form method="post" action="mailto.php" style="display: inline">
   <input type="hidden" name="u" value="username" />
   <input type="hidden" name="d" value="domain" />
   <input type="image"
      src="email-image.php?u=username&amp;d=domain&amp;font=4"
      alt="username at domain"
      style="vertical-align: middle" />
</form></div>


mailto.php:

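<?php

// Redirect to a mailto: URL built from the posted user and domain parts.
if (isset($_POST['u'], $_POST['d']))
{
    // Strip CR/LF so the values cannot inject extra response headers.
    $user   = str_replace(array("\r", "\n"), '', $_POST['u']);
    $domain = str_replace(array("\r", "\n"), '', $_POST['d']);

    header('Location: mailto:' . $user . '@' . $domain);
    exit;
}

?>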


email-image.php:

<?php

// Render the address as a PNG so it never appears as text in the page.
$address = $_GET['u'] . '@' . $_GET['d'];
$font = (int) $_GET['font'];   // built-in GD font, 1 to 5

$width = imagefontwidth($font) * strlen($address);
$height = imagefontheight($font);

$im = imagecreate($width, $height);
$white = imagecolorallocate($im, 255, 255, 255);
$black = imagecolorallocate($im, 0, 0, 0);

imagefill($im, 0, 0, $white);
imagestring($im, $font, 0, 0, $address, $black);

header('Content-Type: image/png');
imagepng($im);

imagedestroy($im);

?>


Personally, I plan to write e-mail addresses like this on my new site:

<span id="mailto">foo at bar.baz</span>

and then replace them with proper mailto links using JavaScript to manipulate
the DOM on page load.
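
Something along these lines is what I have in mind. It is only a minimal
sketch: the function names deobfuscate and linkify are made up, and it assumes
a single span with id="mailto" whose text is exactly "user at domain". With
several addresses on one page you would need a class instead, since ids must
be unique.

```javascript
// Turn "foo at bar.baz" into "foo@bar.baz". Returns null when the text
// does not have the expected "user at domain" shape.
function deobfuscate(text) {
  const match = /^\s*(\S+)\s+at\s+(\S+)\s*$/.exec(text);
  return match ? match[1] + '@' + match[2] : null;
}

// Replace the placeholder span's text with a real mailto link.
function linkify(span) {
  const address = deobfuscate(span.textContent);
  if (address === null) return; // leave the readable fallback untouched
  const link = span.ownerDocument.createElement('a');
  link.href = 'mailto:' + address;
  link.textContent = address;
  span.textContent = '';
  span.appendChild(link);
}

// Only register the handler when running in a browser.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  window.addEventListener('load', () => {
    const span = document.getElementById('mailto');
    if (span) linkify(span);
  });
}
```

Users without JavaScript still see the readable "foo at bar.baz" text, so
nothing breaks for them; they just don't get a clickable link.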

Regards,
Daniel Schlyder


