Thread: Email address column verification for address list
Email address field email has type character(200) It can contain multiple e-mail addresses separated by comma. Applying validation code below removes and does not allow comma separated address list. How to modify this so that comma separated address list is allowed ? How to simplify/improve this validation, probably some newer root domain names are not allowed by this validation ? Andrus. update klient set email=null where email='' OR (trim(trailing from email) !~* '^[^@]*@(?:[^@]*\\.)?[a-z0-9_-]+\\.(?:a[defgilmnoqrstuwz]|b[abdefghijmnorstvwyz]|c[acdfghiklmnoruvxyz]|d[ejkmoz]|e[ceghrst]|f[ijkmorx]|g[abdefhilmnpqrstuwy]|h[kmnrtu]|i[delnoqrst]|j[mop]|k[eghimnprwyz]|l[abcikrstuvy]|m[acdghklmnopqrstuvwxyz]|n[acefgilopruz]|om|p[aefghklmnrtwy]|qa|r[eouw]|s[abcdeghijklmnortvyz]|t[cdfghjkmnoprtvwz]|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]|edu|com|net|org|gov|mil|info|biz|coop|museum|aero|name|pro|mobi|arpa|eu)$');alter tableklient add check (trim(trailing from email) ~*'^[^@]*@(?:[^@]*\\.)?[a-z0-9_-]+\\.(?:a[defgilmnoqrstuwz]|b[abdefghijmnorstvwyz]|c[acdfghiklmnoruvxyz]|d[ejkmoz]|e[ceghrst]|f[ijkmorx]|g[abdefhilmnpqrstuwy]|h[kmnrtu]|i[delnoqrst]|j[mop]|k[eghimnprwyz]|l[abcikrstuvy]|m[acdghklmnopqrstuvwxyz]|n[acefgilopruz]|om|p[aefghklmnrtwy]|qa|r[eouw]|s[abcdeghijklmnortvyz]|t[cdfghjkmnoprtvwz]|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]|edu|com|net|org|gov|mil|info|biz|coop|museum|aero|name|pro|mobi|arpa|eu)$');
however you are going to validate it, create yourself a domain for it (custom type). That way, if it changes, you have to only update it in one place, instead of doing it on column by column basis.
2010/4/13 Andrus <kobruleht2@hot.ee>: > Email address field email has type character(200) > It can contain multiple e-mail addresses separated by comma. > Applying validation code below removes and does not allow comma separated > address list. > > How to modify this so that comma separated address list is allowed ? > How to simplify/improve this validation, probably some newer root domain > names are not allowed by this validation ? > > Andrus. Why don't you just separate them into individual fields (perhaps in a separate table to have arbitrary many addresses)? What you're doing violates 1NF - fields should be atomic (i.e. in their simplest form, so you never have to parse values from them). Here's a reasonable email address domain. Note that it is AS text, not AS character(200). It does not attempt to match the TLD to a list of known TLDs, nor should it (that's very probably impractical, at least with regex): CREATE DOMAIN email_address AS text CONSTRAINT email_address_check CHECK ((VALUE ~ '^(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\\-+)|([A-Za-z0-9]+\\.+)|([A-Za-z0-9]+\\++))*[A-Za-z0-9]+@((\\w+\\-+)|(\\w+\\.))*\\w{1,63}\\.[a-zA-Z]{2,6}$'::text)); If you absolutely must put an arbitrary number of e-mail addresses in one field, I suggest you use the domain in an array. Unfortunately, arrays of domains are not directly supported, last I checked. Could someone weigh in on how to roll this domain into a custom email_address_array domain? Regards, Peter Geoghegan
Peter, thank you. > Why don't you just separate them into individual fields (perhaps in a separate table to have arbitrary many addresses)? What you're doing violates 1NF - fields should be atomic (i.e. in their simplest form, so you never have to parse values from them). This is existing database deployed to many sites and used by many programs. Re-factoring db and software to add this minor feature seems to be not reasonable. For 99% of cases field contains only single address. > Here's a reasonable email address domain. Note that it is AS text, not >AS character(200). It does not attempt to match the TLD to a list of >known TLDs, nor should it (that's very probably impractical, at least >with regex): >CREATE DOMAIN email_address > AS text > CONSTRAINT email_address_check CHECK ((VALUE ~ >'^(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\\-+)|([A-Za-z0-9]+\\.+)|([A-Za-z0-9]+\\++))*[A-Za-z0-9]+@((\\w+\\-+)|(\\w+\\.))*\\w{1,63}\\.[a-zA-Z]{2,6}$'::text)); >If you absolutely must put an arbitrary number of e-mail addresses in one field, I suggest you use the domain in an array. Unfortunately, arrays of domains are not directly supported, last I checked. I don't know how to use arrrays to solve this. How to create constraint for char(200) column for comma separated email list ? Andrus.
Re: Email address column verification foraddress list
From
merlyn@stonehenge.com (Randal L. Schwartz)
Date:
>>>>> "Andrus" == Andrus <kobruleht2@hot.ee> writes: Andrus> This is existing database deployed to many sites and used by many programs. Andrus> Re-factoring db and software to add this minor feature seems to be not Andrus> reasonable. For 99% of cases field contains only single address. So for most applications written against this database, they're probably assuming only one email address in this column. And then you confuse the issue by putting two or more comma-separated addresses, which are not universally usable when a single address is provided. I sense this code will end up on thedailywtf.com[1] when you leave and your successor discovers what you insanely tried to do. [1] which should be mandatory reading for *all* devs, with the goal of "never let my code end up here" -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc. See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
> So for most applications written against this database, they're probably > assuming only one email address in this column. This column content is transparent to application. It is passed to e-mail sender (blat.dll) without any processing. > And then you confuse the issue by putting two or more comma-separated > addresses, which are not universally usable when a single address > is provided. Application code does not process this column content. It passes it to blat.dll which can process list. Andrus.
> So for most applications written against this database, they're probably > assuming only one email address in this column. > > And then you confuse the issue by putting two or more comma-separated > addresses, which are not universally usable when a single address > is provided. Surely not Randal. Client application developers can simply read the regex in the check constraint to see its behaviour :-) Regards, Peter Geoghegan