[thelist] PHP Form Validation

shawn allen shawn at alterior.net
Mon Jan 6 15:57:00 CST 2003


quoth .jeff:
> shawn,
>
> ><><><><><><><><><><><><><><><><><><><><><><><><><><><><><
> > From: shawn allen
> >
> > I'd recommend you not validate email addresses at all unless you can
> > properly account for every possible, *valid* variation according to
> > the RFC[1].
> ><><><><><><><><><><><><><><><><><><><><><><><><><><><><><
>
> oh, you mean this regex?
*snip*
> ouch, now my eyes hurt.
>
> the problem with this regex is it lets common mistakes like luser@aol
> through as a valid address (which it may very well be) when it should
> be stopping the user and telling them it's not a (commonly) valid
> syntax and "add the '.com' you idiot".

Well, I'm of the (obviously snarky) opinion that if users can't properly
copy and paste (or even remember) a well-formed email address, they get
what they deserve. My point was that the original regular expression was
too restrictive. How about: /^[^@\s]+@[\w\.\-]+\.[\w\.]+$/ ?
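
To make the trade-off concrete, here is a minimal sketch of how a
permissive pattern of this shape behaves. It is in Python rather than PHP
only so it runs standalone; for these ASCII inputs the same pattern string
should behave the same way under preg_match(). The "+" quantifiers on the
local part and on the final class are an assumption (so that each matches
one or more characters), and the name is_plausible and the sample
addresses are made up for illustration.

```python
import re

# Permissive shape: anything except "@" or whitespace, an "@", a dotted host.
# Assumption: "+" follows [^@\s] and the final class (one or more characters).
PERMISSIVE = re.compile(r"^[^@\s]+@[\w\.\-]+\.[\w\.]+$", re.ASCII)


def is_plausible(address):
    """Return True if the address has the rough user@host.tld shape."""
    return PERMISSIVE.match(address) is not None


if __name__ == "__main__":
    samples = [
        "user@domain.com",        # plain address
        "user+foo@domain.com",    # plus addressing
        "luser@aol",              # missing the ".com"
        "two words@domain.com",   # whitespace in the local part
    ]
    for sample in samples:
        print(sample, is_plausible(sample))
```

The point of the sketch is that the check only looks for the rough shape
and leaves everything before the "@" alone, so "+foo" addresses pass while
the "luser@aol" mistake is still caught.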

> what percentage of net users *actually* do that though.  i dare say
> the percentage that have a hard time actually inputting their address
> correctly is far greater than the percentage that inputs anything
> other than their basic email address -- i.e., user@domain.com.

I don't see how the prevalence of my method has anything to do with it.
It sounds as though this is a business website. Hence, its owners could
lose business because of an overly restrictive regular expression. As I
mentioned in another message, I equate this to rendering sites
inaccessible to certain UAs.

> what's more, the ability to add +foo to your address is a convenience,
> but *not* a requirement to send you email.

No, it's not a *requirement*, it's a preference, and more importantly, a
privilege. A website that denies me the right to either my preferences
or my privileges loses my business. Period. However unlikely it may be,
what happens to folks whose perfectly valid addresses are rejected? I'd
assume they'd react similarly.

> i avoid all that nonsense by using a vanity domain where the catchall
> account points at my chosen pop account.  so, thelist at jeffhowden.com,
> evolt.org at jeffhowden.com, and foo at jeffhowden.com all end up in my
> inbox (or wherever my rules tell them to go at the time), yet i can
> still track where they came from.

You're right, that's probably a better idea :\

> i'm not convinced i should change from a regex that has worked well
> (in coldfusion):
>
> ^([[:alnum:]][-a-zA-Z0-9_%\.]*)?[[:alnum:]]@[[:alnum:]][-a-zA-Z0-9%\.]*\.[[:alpha:]]{2,}$

A lot of people are still convinced they don't need to update their
websites to work in UAs other than IE/Windows, either... Seriously
though, if you're running a business, would you risk losing any number
of customers just because you didn't feel like changing your regular
expression? Do you *really* need those %'s in there? :P
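
For what it's worth, here is a rough sketch of what that quoted pattern
does with a "+foo" address. Two assumptions: Python's re module has no
POSIX bracket classes, so [[:alnum:]] and [[:alpha:]] are spelled out as
ASCII ranges, and the stray ">" in the quoted pattern is treated as
line-wrap debris rather than part of the regex. The names QUOTED_STYLE and
passes_quoted_style are hypothetical.

```python
import re

# ASCII translation of the quoted ColdFusion pattern (see assumptions above).
QUOTED_STYLE = re.compile(
    r"^([A-Za-z0-9][-a-zA-Z0-9_%.]*)?[A-Za-z0-9]"   # local part
    r"@[A-Za-z0-9][-a-zA-Z0-9%.]*\.[A-Za-z]{2,}$"   # host and TLD
)


def passes_quoted_style(address):
    """Return True if the address matches the translated pattern."""
    return QUOTED_STYLE.match(address) is not None


if __name__ == "__main__":
    for sample in ["user@domain.com", "user%foo@domain.com",
                   "user+foo@domain.com", "luser@aol"]:
        print(sample, passes_quoted_style(sample))
```

Under those assumptions the "%" form gets through and the "+" form does
not, which is the rejection I'm complaining about.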

Based on yours, here's a much more succinct PCRE the original post's
author could use with preg_match():

/^\w[^@\s]*\w@\w([\w\.]*\w)?\.\w{2,}$/
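
A quick, hedged check of that pattern exactly as written, again in Python
only so it runs standalone (re.ASCII is used to mirror PCRE's default \w).
The RELAXED variant is not from this thread; it is a hypothetical tweak
that makes the leading local-part group optional and adds "\-" to the host
class, in case single-character local parts or hyphenated hosts matter to
you. All names and sample addresses are illustrative.

```python
import re

# The pattern as written above.
SUCCINCT = re.compile(r"^\w[^@\s]*\w@\w([\w\.]*\w)?\.\w{2,}$", re.ASCII)

# Hypothetical relaxation (an assumption, not from the original message):
# optional leading group in the local part, hyphens allowed in the host.
RELAXED = re.compile(r"^(\w[^@\s]*)?\w@\w([\w\.\-]*\w)?\.\w{2,}$", re.ASCII)


def check(pattern, address):
    """Return True if the whole address matches the given compiled pattern."""
    return pattern.match(address) is not None


if __name__ == "__main__":
    samples = [
        "user@domain.com",
        "user+foo@domain.com",
        "luser@aol",
        "a@domain.com",          # one-character local part
        "user@my-domain.com",    # hyphen in the host
    ]
    for sample in samples:
        print(sample, check(SUCCINCT, sample), check(RELAXED, sample))
```

As written, the pattern accepts "+foo" addresses and stops "luser@aol",
but it also turns away a one-character local part and a hyphenated host;
the relaxed variant is one possible way to let those through.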

--
shawn allen
  mailto://shawn@alterior.net
  phone://415.577.3961
  http://alterior.net
  aim://shawnpallen



