On Fri, Oct 21, 2005 at 11:49:54AM -0700, Steve Atkins wrote:
> SELECT email FROM customer
> WHERE email !~*
> '^[^@]*@(?:[^@]*\.)?[a-z0-9_-]+\.(?:a[defgilmnoqrstuwz]|b[abdefghijmnorstvwyz]|c[acdfghiklmnoruvxyz]|d[ejkmoz]|e[ceghrst]|f[ijkmorx]|g[abdefhilmnpqrstuwy]|h[kmnrtu]|i[delnoqrst]|j[mop]|k[eghimnprwyz]|l[abcikrstuvy]|m[acdghklmnopqrstuvwxyz]|n[acefgilopruz]|om|p[aefghklmnrtwy]|qa|r[eouw]|s[abcdeghijklmnortvyz]|t[cdfghjkmnoprtvwz]|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]|edu|com|net|org|gov|mil|info|biz|coop|museum|aero|name|pro|mobi|arpa)$'
>
> ...should be closer. It fixes one typo in the range, uses valid pg-format regex rather
> than perl regex, and has a couple of pedantic fixes to the supported TLDs.
>
> It's syntactically correct, and appears to do the right thing on my production
> DB here (which coincidentally has a customer table with an email field :))

The backslashes should be escaped, or the regular expression should
be quoted with dollar quotes (8.0 and later) -- otherwise the string
parser converts "\." to ".", which matches any single character. For
example, the above regular expression considers the following address
valid:

    foo@example?com

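To illustrate the quoting issue, here is a minimal sketch using a
shortened, hypothetical pattern that keeps only the ".com" branch. It
assumes backslashes in ordinary string literals are processed as
escapes (standard_conforming_strings off); the expected results are
noted in the comments.

```sql
-- Unescaped: the string parser turns "\." into ".", so "?" is accepted.
SELECT 'foo@example?com' ~* '^[^@]*@[a-z0-9_-]+\.com$';        -- expected: true

-- Doubled backslash: the regex engine receives "\." and rejects the address.
SELECT 'foo@example?com' ~* '^[^@]*@[a-z0-9_-]+\\.com$';       -- expected: false

-- Dollar quoting (8.0 and later): no backslash processing at all.
SELECT 'foo@example?com' ~* $re$^[^@]*@[a-z0-9_-]+\.com$$re$;  -- expected: false
```

The dollar-quoted form uses a tagged delimiter ($re$, an arbitrary
name) because the pattern itself ends in "$"; with a bare $$ delimiter
the string would be closed one character early.
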
Even with that correction the regular expression is still wrong,
especially the ^[^@]*@ part at the beginning. See this group's
archives and numerous other sources for further discussion on this
topic.
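
As one illustration of why the ^[^@]*@ prefix is questionable (my own
reading, and probably not the only problem), the sketch below again
uses a shortened, hypothetical pattern limited to ".com", with dollar
quoting so that the backslashes reach the regex engine intact:

```sql
-- Empty local part: "[^@]*" matches zero characters, so this is accepted.
SELECT '@example.com' ~* $re$^[^@]*@(?:[^@]*\.)?[a-z0-9_-]+\.com$$re$;       -- expected: true

-- A quoted local part containing "@" is allowed by RFC 2822,
-- but "[^@]*" cannot match across it, so this is rejected.
SELECT '"a@b"@example.com' ~* $re$^[^@]*@(?:[^@]*\.)?[a-z0-9_-]+\.com$$re$;  -- expected: false
```

So the prefix accepts at least one address that is not valid and
rejects at least one that is.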
--
Michael Fuhr