[thelist] spammers/spambots

Nan Harbison nan at nanharbison.com
Mon Jul 27 10:46:02 CDT 2009


For a while, I just watched spam coming from the contact us form on a
website I maintain, and I noticed certain characters are always present in
spam posts, so I do this:

		// strpos() returns 0 when the match is at the very start of the
		// string, and 0 == false in PHP, so compare with === false.
		$findspam1 = strpos($_POST['message'], "[");
		$findspam2 = strpos($_POST['message'], "]");
		$findspam3 = strpos($_POST['message'], "http://");
		$findspam4 = strpos($_POST['message'], "link=");
		if ($findspam1 === false && $findspam2 === false &&
		    $findspam3 === false && $findspam4 === false)
		{
			// process the form
		}
		// otherwise say thank you but do nothing. (I don't want the
		// spammers to know they were rejected, but maybe this doesn't
		// matter?)

This has the potential to kill a legitimate submission I suppose, so I am
thinking about using Barry's suggestion! But NO spam has gotten through.
Note that any one of these strings being present is enough for the form to
be rejected.
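
For anyone who wants to reuse the check, here is a minimal sketch of the
same idea wrapped in a function. The name looks_like_spam() and the
$markers list are only illustrative; the four strings are the ones from
the snippet above.

```php
<?php
// Hypothetical helper: returns true if the message contains any of the
// strings that have shown up in spam posts.
function looks_like_spam($message)
{
    $markers = array("[", "]", "http://", "link=");
    foreach ($markers as $marker) {
        // strpos() returns 0 for a match at the very start of the string,
        // so test with !== false rather than a loose comparison.
        if (strpos($message, $marker) !== false) {
            return true;
        }
    }
    return false;
}

$message = isset($_POST['message']) ? $_POST['message'] : '';
if (!looks_like_spam($message)) {
    // process the form
}
// either way, show the same thank-you page
```

Keeping the strings in one array makes it easier to add or drop a marker
later without touching the if statement.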

Nan

-----Original Message-----

Although it's generalising to an extent, I believe bots will harvest your
form's details and then just start blind posting common field names and
values to the form's action.

Based on this assumption we've had a fair bit of success with a text field
named 'url' (or something similarly juicy) hidden with CSS, a label of 'Not
for public use' (for people with CSS disabled), and a value of 'blank'. Then
our form processor checks that $_POST['url'] is set and has the value 'blank'.
Anything else is spam or a rather dense form filler, who will be shown
the form again. I can't remember if this was originally suggested here or on
A List Apart, but I've yet to see a spambot get around it.
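
A rough sketch of what that check might look like on the PHP side. The
function name passes_honeypot() is made up for illustration; the field
name 'url' and the value 'blank' are the ones described above.

```php
<?php
// Hypothetical helper: the hidden 'url' field must come back exactly as
// it was sent out, i.e. present and still holding the value 'blank'.
function passes_honeypot($post)
{
    return isset($post['url']) && $post['url'] === 'blank';
}

if (passes_honeypot($_POST)) {
    // process the form
} else {
    // spam, or someone who edited the hidden field: show the form again
}
```

A bot that blind-posts a link into every field named 'url', or that
leaves the field out altogether, fails the check either way.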

For what it's worth, I don't think blacklists are useful, as spammers will
always find a way around them, or you'll spend ages tweaking and tweaking.

Hope that helps.

Barry

-----Original Message-----


Just curious: I am finishing up a little program, a preprocessor, which
will grab $_POST or $_REQUEST content and, if it meets certain criteria,
reject any further processing.

So the first question: do automated spambots attempt to fill in content in
any/all fields, even if the field is bogus/contrived?

And the second question: much of the spam content I see is posted in
languages other than English. If I knew where to start I could probably
include some of this "stuff" in a reject list, but I'm not sure how to get
or convert these odd-looking characters into something my forms can
handle.  Suggestions?
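
One possible starting point, offered as a sketch only: rather than listing
the characters themselves, measure how much of a submission falls outside
plain 7-bit ASCII and reject anything above some cutoff. The function name
non_ascii_ratio() and the 0.3 cutoff are assumptions for illustration, and
this would also reject genuine messages written in other languages.

```php
<?php
// Hypothetical helper: fraction of the bytes in $text that fall outside
// 7-bit ASCII. Non-Latin text in UTF-8 scores close to 1.
function non_ascii_ratio($text)
{
    $length = strlen($text);
    if ($length === 0) {
        return 0.0;
    }
    // Without the /u modifier the pattern is matched byte by byte;
    // preg_match_all() returns the number of matches found.
    $count = preg_match_all('/[^\x00-\x7F]/', $text, $matches);
    return $count / $length;
}

$message = isset($_POST['message']) ? $_POST['message'] : '';
if (non_ascii_ratio($message) > 0.3) {   // 0.3 is an arbitrary cutoff
    // treat as spam: reject any further processing
}
```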

-- 
Bob





