Re: Select all invalid e-mail addresses - Mailing list pgsql-general

From Michael Fuhr
Subject Re: Select all invalid e-mail addresses
Date
Msg-id 20051021201210.GA98037@winnie.fuhr.org
In response to Re: Select all invalid e-mail addresses  (Steve Atkins <steve@blighty.com>)
List pgsql-general
On Fri, Oct 21, 2005 at 11:49:54AM -0700, Steve Atkins wrote:
> SELECT   email  FROM customer
>    WHERE  email !~*
> '^[^@]*@(?:[^@]*\.)?[a-z0-9_-]+\.(?:a[defgilmnoqrstuwz]|b[abdefghijmnorstvwyz]|c[acdfghiklmnoruvxyz]|d[ejkmoz]|e[ceghrst]|f[ijkmorx]|g[abdefhilmnpqrstuwy]|h[kmnrtu]|i[delnoqrst]|j[mop]|k[eghimnprwyz]|l[abcikrstuvy]|m[acdghklmnopqrstuvwxyz]|n[acefgilopruz]|om|p[aefghklmnrtwy]|qa|r[eouw]|s[abcdeghijklmnortvyz]|t[cdfghjkmnoprtvwz]|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]|edu|com|net|org|gov|mil|info|biz|coop|museum|aero|name|pro|mobi|arpa)$'
>
> ...should be closer. Fixes one typo in the range, uses valid pg format regex, rather
> than perl regex and had a couple of pedant-fixes in the TLDs supported.
>
> It's syntactically correct, and appears to do the right thing on my production
> DB here (which coincidentally has a customer table with an email field :))

The backslashes should be escaped, or the regular expression should
be quoted with dollar quotes (8.0 and later).  Otherwise the string
parser converts "\." to ".", which matches any single character.
For example, the above regular expression considers the following
address valid:

foo@example?com
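
A minimal sketch of the difference, using a shortened pattern of my own
rather than the full TLD list.  It assumes the E'' escape-string syntax
(8.1 and later) so that backslash processing is explicit.  On 9.1 and
later, where standard_conforming_strings defaults to on, a plain '...'
literal no longer processes backslashes and this problem does not arise
there.

```sql
-- Simplified stand-in pattern (an assumption for illustration only).
SELECT
    -- "\." collapses to "." in the string parser, so "?" is accepted
    'foo@example?com' ~* E'^[^@]*@example\.com$'       AS unescaped,      -- true
    -- "\\." reaches the regex engine as "\.", a literal dot
    'foo@example?com' ~* E'^[^@]*@example\\.com$'      AS escaped,        -- false
    -- dollar quoting passes the backslash through untouched
    'foo@example?com' ~* $re$^[^@]*@example\.com$$re$  AS dollar_quoted;  -- false
```

The dollar-quoted form uses a tag ($re$) because the pattern itself ends
in "$".  With a bare $$ delimiter, the trailing "$$$" would close the
literal one character early.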

Even with that correction the regular expression is still wrong,
especially the ^[^@]*@ part at the beginning, which accepts an empty
local part and any character other than "@".  See this group's
archives and numerous other sources for further discussion on this
topic.
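
To illustrate how permissive the leading ^[^@]*@ is, here is a small
sketch.  The test strings are my own examples, not addresses taken from
the thread, and they show only two of the possible objections.

```sql
-- Only the leading fragment of the pattern is tested here.
SELECT
    '@example.com'    ~ '^[^@]*@' AS empty_local_part,     -- true
    'a b@example.com' ~ '^[^@]*@' AS space_in_local_part;  -- true
```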

--
Michael Fuhr
