From bobm at dottedi.biz  Mon Jul 27 10:04:57 2009
From: bobm at dottedi.biz (Bob Meetin)
Date: Mon, 27 Jul 2009 09:04:57 -0600
Subject: [thelist] spammers/spambots
Message-ID: <4A6DC219.2000404@dottedi.biz>

Just curious, I am finishing up a little program, the preprocessor,
which will be used to grab $_POST or $_REQUEST content, and if it meets
certain criteria, reject any further processing.

So the first question: automated spambots, do they attempt to fill in
content in any/all fields even if the field is bogus/contrived?

And the second question: much of the spam content I see is posted in
non-English languages, way not English. If I knew where to start I could
probably include some of this "stuff" in a reject list, but I'm not
sure how to get or convert these odd-looking characters into something
my forms can handle. Suggestions?

--
Bob

From barry at burnthebook.co.uk  Mon Jul 27 10:22:11 2009
From: barry at burnthebook.co.uk (Barry Woolgar)
Date: Mon, 27 Jul 2009 16:22:11 +0100
Subject: [thelist] spammers/spambots
In-Reply-To: <4A6DC219.2000404@dottedi.biz>
References: <4A6DC219.2000404@dottedi.biz>
Message-ID: <003d01ca0ece$054302d0$0fc90870$@co.uk>

Hello

Although it's generalising to an extent, I believe bots will harvest
your form's details and then just start blind posting common field
names and values to the form's action.

Based on this assumption we've had a fair bit of success with a text
field named 'url' (or something similarly juicy) hidden with CSS, a
label of 'Not for public use' (for people with CSS disabled), and a
value of 'blank'. Then our form processor checks that $_POST['url'] is
set and has the value 'blank'. Anything else is spam, or a rather dense
form filler who will be shown the form again.

I can't remember if this was originally suggested here or on A List
Apart, but I've yet to see a spambot get around it.
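The hidden-field check described above could be sketched in PHP roughly
as follows. This is only an illustration under assumptions: the helper
name honeypot_is_clean() is made up for the example, and only the field
name 'url' and the value 'blank' come from the description.

```php
<?php
// Sketch only: honeypot_is_clean() is a hypothetical helper name.
// The field name 'url' and the sentinel value 'blank' follow the
// description above; everything else is an assumption.
function honeypot_is_clean(array $post, string $field = 'url', string $sentinel = 'blank'): bool
{
    // A blind post that never loaded the form may omit the field entirely.
    if (!isset($post[$field]) || !is_string($post[$field])) {
        return false;
    }
    // A bot that fills in every field will have overwritten the sentinel.
    // Strict comparison avoids PHP's loose type juggling.
    return $post[$field] === $sentinel;
}

// Assumed use in a form processor:
// if (!honeypot_is_clean($_POST)) { /* show the form again */ }
```

The form itself would carry a text input named 'url' with the value
'blank', hidden with CSS and labelled 'Not for public use'.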

For what it's worth, I don't think blacklists are useful, as spammers
will always find a way around them, or you'll spend ages tweaking and
tweaking.

Hope that helps.

Barry

--

* * Please support the community that supports you.  * *
http://evolt.org/help_support_evolt/

For unsubscribe and other options, including the Tip Harvester
and archives of thelist go to: http://lists.evolt.org
Workers of the Web, evolt !

From nan at nanharbison.com  Mon Jul 27 10:46:02 2009
From: nan at nanharbison.com (Nan Harbison)
Date: Mon, 27 Jul 2009 11:46:02 -0400
Subject: [thelist] spammers/spambots
In-Reply-To: <003d01ca0ece$054302d0$0fc90870$@co.uk>
References: <4A6DC219.2000404@dottedi.biz>
	<003d01ca0ece$054302d0$0fc90870$@co.uk>
Message-ID:

For a while, I just watched spam coming from the contact us form on a
website I maintain, and I noticed certain characters are always present
in spam posts, so I do this:

    // strpos() returns 0 for a match at the very start of the string,
    // so compare against false with === rather than ==.
    $message   = isset($_POST['message']) ? $_POST['message'] : '';
    $findspam1 = strpos($message, "[");
    $findspam2 = strpos($message, "]");
    $findspam3 = strpos($message, "http://");
    $findspam4 = strpos($message, "link=");
    if ($findspam1 === false && $findspam2 === false
        && $findspam3 === false && $findspam4 === false) {
        // process the form
    }

Otherwise I say thank you but do nothing. (I don't want the spammers to
know they were rejected, but maybe this doesn't matter?)

This has the potential to kill a legitimate submission, I suppose, so I
am thinking about using Barry's suggestion! But NO spam has gotten
through. (As written, any one of these strings being present is enough
for the form to be rejected.)

Nan

-----Original Message-----

Although it's generalising to an extent, I believe bots will harvest
your form's details and then just start blind posting common field
names and values to the form's action.

Based on this assumption we've had a fair bit of success with a text
field named 'url' (or something similarly juicy) hidden with CSS, a
label of 'Not for public use' (for people with CSS disabled), and a
value of 'blank'. Then our form processor checks that $_POST['url'] is
set and has the value 'blank'. Anything else is spam, or a rather dense
form filler who will be shown the form again.

I can't remember if this was originally suggested here or on A List
Apart, but I've yet to see a spambot get around it.

For what it's worth, I don't think blacklists are useful, as spammers
will always find a way around them, or you'll spend ages tweaking and
tweaking.

Hope that helps.

Barry

From rjmolesa at consoltec.net  Mon Jul 27 12:31:08 2009
From: rjmolesa at consoltec.net (Jon Molesa)
Date: Mon, 27 Jul 2009 13:31:08 -0400
Subject: [thelist] spammers/spambots
In-Reply-To: <4A6DC219.2000404@dottedi.biz>
References: <4A6DC219.2000404@dottedi.biz>
Message-ID: <20090727173108.GE18244@jenna.rjmolesa.homelinux.net>

*On Mon, Jul 27, 2009 at 09:04:57AM -0600 Bob Meetin wrote:
> Date: Mon, 27 Jul 2009 09:04:57 -0600
> From: Bob Meetin
> Subject: [thelist] spammers/spambots
> To: "thelist at lists.evolt.org"
>
> Just curious, I am finishing up a little program, the preprocessor,
> which will be used to grab $_POST or $_REQUEST content, and if it meets
> certain criteria, reject any further processing.
>
> So the first question, automated spambots, do they attempt to fill in
> content in any/all fields even if the field is bogus/contrived?

Not sure, but it'd be trivial to grab a form, parse out the fields and
the submit URL, and just fill the fields and submit. Over and over. I
suspect that is what most bots do.

>
> And the second question, much of the spam content I see is posted in
> non-English dialects, way not English.
> If I knew where to start I can
> probably include some of this "stuff" in a reject list, but I'm not
> sure how to get or convert these odd looking characters into something my
> forms can handle. Suggestions?
>
> --
> Bob

I believe some of the bots grab the form once and then submit to it
over and over. You can detect this by issuing a hidden form MD5 value
when the form is requested and checking that it is present and matches
upon submission. You'd save it in the session when the form is
requested and verify that the value submitted matches what's stored in
the session. It works well, but will only catch those submissions that
are a direct POST without a GET first. It should at least filter out
bogus posts.

Another approach would be to add http://recaptcha.net support to the
form. Go to http://recaptcha.net/resources.html for various libraries.
The result would be the same, but it is more work for real people,
whereas the first approach handles it transparently.

If the bots do make a GET request before the POST, then you have some
other things to consider.

1) Ban IPs that make excessive requests to the specific URL.
2) Validate that the data being submitted makes sense for what was
   requested.
3) Sanitize the data if it's headed for a database. Escape the string.
4) Integrate http://akismet.com/ into your application to catch known
   and suspected spam automatically. There are several libraries for
   integrating with apps other than WordPress.
   http://akismet.com/development/
5) Finally, have a human review false positives and negatives to train
   Akismet better.
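
The hidden session value described above might look something like this
in PHP. It is a rough sketch under assumptions, not a finished
implementation: the helper names and the 'form_token' key are made up
for the example, and it swaps in random_bytes() for the MD5 value and
hash_equals() for a plain comparison. The session array is passed in so
the helpers can be exercised without a live session.

```php
<?php
// Sketch only: issue_form_token() and form_token_is_valid() are
// hypothetical helper names, and 'form_token' is an assumed key.

// Call when the form is requested: stores a fresh token in the session
// and returns it for embedding in a hidden input.
function issue_form_token(array &$session, string $key = 'form_token'): string
{
    $token = bin2hex(random_bytes(16)); // 32 hex characters
    $session[$key] = $token;
    return $token;
}

// Call when the form is submitted: true only if the posted token is
// present and matches the one stored in the session.
function form_token_is_valid(array &$session, array $post, string $key = 'form_token'): bool
{
    $expected = $session[$key] ?? null;
    $given    = $post[$key] ?? null;
    unset($session[$key]); // single use: replaying the same post fails
    if (!is_string($expected) || !is_string($given)) {
        return false; // direct POST without a prior GET, or field missing
    }
    return hash_equals($expected, $given);
}

// Assumed use with a real session:
// session_start();
// $token = issue_form_token($_SESSION);
// echo '<input type="hidden" name="form_token" value="'
//     . htmlspecialchars($token) . '">';
```

Because the token is removed once checked, a bot that grabs the form
once and replays the same post should only get through the first time.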

--
Jon Molesa
rjmolesa at consoltec.net
if you're bored or curious
http://rjmolesa.com

From moseley at hank.org  Mon Jul 27 17:53:03 2009
From: moseley at hank.org (Bill Moseley)
Date: Mon, 27 Jul 2009 15:53:03 -0700
Subject: [thelist] "onchange" event on checkbox and YUI
Message-ID: <16f65d000907271553v60d3603akaf93c7467562af97@mail.gmail.com>

This is the famous IE problem where onchange on a checkbox doesn't fire
until you leave the element.

I have a checkbox and an associated