[thelist] JS regex help

J.J.SOLARI jjsolari at pobox.com
Fri Oct 8 02:19:39 CDT 2004


> Date: Thu, 7 Oct 2004 17:41:46 -0700
> From: Courtenay <court3nay at gmail.com>

> Also remember that people use the plus symbol in their emails to
> filter mail into relevant folders client-side, (this is apparently
> valid syntax) i.e.
> 
>   johnsmith+personal at mail.com
> 
> so your     [a-z_\.] 
> becomes    [a-z_\.+] 
> 
> or even (this is scrappy, someone can do better than this) 
> [a-z_\.][+]?[a-z_\.]?
> 
> and the  (\.[a-z]{2,4})(\.[a-z]{2})*$)   (which checks for .aa.bb at
> the end of the string)
> becomes [a-z]+[.][a-z]+  or [a-z]{2,1024}  (assuming sometime in the
> future there will be reeeealy long top-levels.)
> 
> If you thought the discussion wasn't pedantic enough, remember also
> that people may submit an email at an IP address.  Or an IPv6
> Address?? if  your toaster/fridge can receive email, in 2008.
> 
> maryjane at 212.25.35.66

According to RFC 2822, many more characters are allowed in the local
part of an email address (i.e. local_part at domain_part); as I
understand it, these characters are: !#$%&'*+-/=?^_`{|}~0-9a-z

So here is the regex I use (warning: there is a hardwrap after @; the
hyphen and the slash are escaped so that they are taken literally
inside the character class, and the i flag covers uppercase letters):

/^[!#$%&'*+\-\/=?^_`{|}~0-9a-z]+(\.[!#$%&'*+\-\/=?^_`{|}~0-9a-z]+)*@
[0-9a-z]+([\._-][0-9a-z]+)*\.[a-z]{2,6}$/i

(Though this should be correct, the single quote (') seems to cause a
problem in some JavaScript engines, so you may have to drop this
character for the regex to work. Does anybody have a clue about that?)
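
A possible way around the single-quote trouble, offered only as a guess
since I have not reproduced it: if the pattern sits inside a
single-quoted string or HTML attribute, the ' ends that string early.
The sketch below builds the pattern from a string and writes the quote
as \x27. It also escapes the hyphen, because an unescaped hyphen
between + and ~ in a character class is read as a range that lets @ and
many other characters through. The names ATEXT, addressPattern and
looksLikeAddress are mine, chosen for illustration.

```javascript
// Characters allowed in an unquoted local part (RFC 2822 "atext").
// Every character is a literal here: the hyphen and slash are escaped,
// and \x27 stands for the single quote so no quoting context can break it.
const ATEXT = "[!#$%&\\x27*+\\-\\/=?^_`{|}~0-9a-z]";

// Hypothetical name; same shape as the regex above, built from a string.
const addressPattern = new RegExp(
  "^" + ATEXT + "+(\\." + ATEXT + "+)*" + // local part: dot-separated atoms
  "@[0-9a-z]+([._-][0-9a-z]+)*" +         // domain labels
  "\\.[a-z]{2,6}$",                       // top-level domain, letters only
  "i"                                     // accept uppercase as well
);

// Returns true when the string has the shape local_part@domain_part.
function looksLikeAddress(s) {
  return addressPattern.test(s);
}
```

With this, looksLikeAddress("johnsmith+personal@mail.com") and
looksLikeAddress("o'brien@example.org") are accepted, while
"a@b@example.org" and "a..b@example.org" are rejected.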

Still, this regex is far from exhaustive: it does not accept domains
made up only of digits (such as the IP address quoted above), nor the
(as yet unseen) addresses whose local part is enclosed in quotes, like:

"other characters <even dot / and space: .>"@example.org

And, as strange as it seems, this would be a valid address.
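
For anyone who wants to accept those two cases as well, here is a rough
sketch under some assumptions of mine: the quoted local part is a
simplified reading of the RFC 2822 quoted-string (no folding
whitespace), and the IP form is a bare dotted quad as in the quoted
message, whereas the RFC itself writes address literals in square
brackets. The octets are not range-checked. All names are hypothetical.

```javascript
// Unquoted local part: dot-separated runs of RFC 2822 "atext" characters.
const dotAtomLocal =
  /[!#$%&'*+\-\/=?^_`{|}~0-9a-z]+(?:\.[!#$%&'*+\-\/=?^_`{|}~0-9a-z]+)*/;

// Quoted local part (simplified): anything except '"', '\' and line
// breaks, or a backslash followed by one character, between double quotes.
const quotedLocal = /"(?:[^"\\\r\n]|\\.)*"/;

// Host name ending in a 2 to 6 letter top-level domain.
const hostName = /[0-9a-z]+(?:[._-][0-9a-z]+)*\.[a-z]{2,6}/;

// Bare IPv4 dotted quad; does NOT check that each part is 0 to 255.
const ipv4 = /(?:\d{1,3}\.){3}\d{1,3}/;

// Combine the pieces: (atom or quoted) @ (host name or IPv4).
const extendedPattern = new RegExp(
  "^(?:" + dotAtomLocal.source + "|" + quotedLocal.source + ")" +
  "@(?:" + hostName.source + "|" + ipv4.source + ")$",
  "i"
);

// Returns true when the string fits one of the accepted shapes.
function looksLikeExtendedAddress(s) {
  return extendedPattern.test(s);
}
```

Keeping the four pieces separate makes it easy to drop one (say, the
IPv4 branch) if you would rather reject such addresses on your form.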

hih,

JJS


