Re: Email data type - Mailing list pgsql-hackers

From Steve Crawford
Subject Re: Email data type
Date
Msg-id 200405170921.54981.scrawford@pinpointresearch.com
Whole thread Raw
In response to Re: Email data type  (Steve Atkins <steve@blighty.com>)
Responses Re: Email data type  (Steve Atkins <steve@blighty.com>)
List pgsql-hackers
On Monday 17 May 2004 8:45 am, Steve Atkins wrote:
> Also, a@10.11.12.13 is a syntactically valid email address, in the
> .13 TLD. It does not deliver to 10.11.12.13, or anywhere else, as
> of today, unless the MTA or local recursive resolver is broken (a
> common case). a@[10.11.12.13] is a whole other thing. As is
> a@[::10.11.12.13] and various other IPv6 variants.

Along those lines a@foo.bar.com and a@[foo.bar.com] are valid but they 
don't necessarily refer to the same mailbox (depends on the mx for 
foo.bar.com).

> Parsing email addresses is a significant part of my day job, and
> email address validation is a lot harder than it looks at first
> sight.

Yes, indeed.

> Don't forget quoting, whitespace, escaping and nesting
> parenthetical comments....

The "just looking for an @" breaks pretty quickly, say with 
"joe@"some.dom or a myriad of other variations.

> In some contexts the empty string is a valid email address. In some
> contexts "Postmaster" is a valid email address.

As are postmaster and pOsTmaSTeR and POSTmaster and they are all the 
same address.

> I'm not entirely convinced that an email address is a simple and
> well-defined enough datatype to handle comprehensively within the
> DB. The validation decisions are complex and vary from application
> to application.
>
> (I use two text columns - localpart and domainpart, with an index
>  on reverse(lower(domainpart)) and leave validation to the
> application, myself).

Indeed. A problem with the "email address" datatype is that it hinders 
normalization:

"Joe User" <joe@user.dom> is valid but a database designer would 
probably prefer columns for name and email, or if the addresses were 
all people, firstname, middlename, lastname, email.

As you mentioned, the email can be broken into localpart and 
domainpart but if the app requires it, the domainpart could be 
further rendered into toplevel (so you could find all the .gov or 
.edu), secondlevel (at least corresponds to a registrant) and 
subdomain(s) as necessary.

Cheers,
Steve



pgsql-hackers by date:

Previous
From: "Dave Page"
Date:
Subject: Re: Email data type
Next
From: Fabien COELHO
Date:
Subject: Re: add server include files to default installation?