Re: Email data type - Mailing list pgsql-hackers

From Steve Atkins
Subject Re: Email data type
Date
Msg-id 20040517154547.GA2400@gp.word-to-the-wise.com
Whole thread Raw
In response to Re: Email data type  (Gaetano Mendola <mendola@bigfoot.com>)
Responses Re: Email data type  (Steve Crawford <scrawford@pinpointresearch.com>)
List pgsql-hackers
On Mon, May 17, 2004 at 05:01:36PM +0200, Gaetano Mendola wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Bruno Wolff III wrote:
> 
> | On Sun, May 16, 2004 at 04:36:55 +0200,
> |   Gaetano Mendola <mendola@bigfoot.com> wrote:
> |
> |>The type is indexable and provide also conversion methods:
> |>
> |>text <--> email
> |>and the operator >>, is possible use it in select like:
> |
> |
> | When you are converting between text and email data, what format are the
> | text version of the address going to be? For example you might be using
> | rfc2821 encoding, rfc2822 encoding or concatenating the local part,
> | an @ sign and the domain name. Don't forget about domain literals.
> 
> Actually I use <local_part>@<domain_name>
> 
> Also the validator will validate emails in this form, if you are thinking to
> validate emails as:
> 
> "Gaetano M. Public"(junior)<gmendola@(new account)bigfoot.com>
> 
> that are perfectly valid I think that it's a valid option to consider.
> 
> About the domain literals, I think to validate it in the near future,
> rejecting private subnet according to this list:
> 
> 10.0.0.0 - 10.255.255.255
> 172.16.0.0 - 172.31.255.255
> 192.168.0.0 - 192.168.255.255
> 169.254.0.0 -169.254.255.255

Bad idea. If I'm testing I'm likely to be testing with addresses on my
local network. My local network is in 10/8. The middleware or the
application should be making those decisions. But if you are going to
filter on IP space, also consider class D & E space.

Also, a@10.11.12.13 is a syntactically valid email address, in the .13
TLD. It does not deliver to 10.11.12.13, or anywhere else, as of
today, unless the MTA or local recursive resolver is broken (a common
case). a@[10.11.12.13] is a whole other thing. As is a@[::10.11.12.13]
and various other IPv6 variants.

a@foo.bar is syntactically valid. a@foo.invalid is syntactically valid, but
should be immediately rejected. a@example.com is valid, even if known to
be non-deliverable.

> I think I have to discard also the addresses with last octet equal to 256.

Or equal to 4872014, come to that. Any reason you're looking at 256
in particular? If you mean .255 (or .0) then don't fall into that trap -
there are perfectly valid, routable addresses ending in both .0 and
.255, despite what some folks in Redmond would have you believe.

> Any comments ?

Parsing email addresses is a significant part of my day job, and email
address validation is a lot harder than it looks at first sight.

Don't forget quoting, whitespace, escaping and nesting parenthetical
comments. Also, remember that A@b.com and a@b.com are different email
addresses, while a@b.com and a@B.com are the same email
address. POSTMASTER@b.com and postmaster@b.com are the same email
address. ABUse@b.com and abuse@b.com may be the same address or
different email addresses, depending on which religious faction you
belong to.

In some contexts the empty string is a valid email address. In some
contexts "Postmaster" is a valid email address.

Also, one persons definition of a valid email address will be very
different from another persons definition of such. Many of those
definitions require some DNS resolution to make the decision.

I'm not entirely convinced that an email address is a simple and
well-defined enough datatype to handle comprehensively within the
DB. The validation decisions are complex and vary from application to
application.

(I use two text columns - localpart and domainpart, with an indexon reverse(lower(domainpart)) and leave validation to
theapplication,myself).
 

Cheers, Steve



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Call for 7.5 feature completion
Next
From: Fabien COELHO
Date:
Subject: Re: Bogus permissions display in 7.4