Thread: Looking for types: phone number, email addresses
Lazy programmer is looking for implementation of the following datatypes: * phone (or fax) numbers (with format checking) like +33 1 39 30 83 46 * email addresses (with format checking and ideally a test that the MX record does exist) I've looked in the following places: * contrib/ (isbn_issn is a very useful starting point, if I need to do it myself) * techdocs.postgresql.org (the datatypes section of the cookbook is empty) * FAQ There is "5.2) How can I contribute some nifty new types and functions to PostgreSQL?" but not the reverse :-)
On Fri, 2002-07-19 at 12:52, Stephane Bortzmeyer wrote: > Lazy programmer is looking for implementation of the following > datatypes: > > * phone (or fax) numbers (with format checking) like +33 1 39 30 83 46 CHECK (phone ~ '^\\+33 [0-9]( [0-9]{2}){4}$') That's only good for French numbers, since only France uses that grouping of digits for phone numbers. (Which is why +33 is explicit.) -- Oliver Elphick Oliver.Elphick@lfix.co.uk Isle of Wight, UK http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C ======================================== "Blessed is the man that trusteth in the LORD, and whose hope the LORD is." Jeremiah 17:7
On Fri, Jul 19, 2002 at 02:47:43PM +0100, Oliver Elphick <olly@lfix.co.uk> wrote a message of 25 lines which said: > CHECK (phone ~ '^\\+33 [0-9]( [0-9]{2}){4}$') > > That's only good for French numbers, since only France uses that > grouping of digits for phone numbers. OK, let's check: CHECK (phone ~ '^\\+[0-9]+[ 0-9]+$') which is, I believe, international (although the above regexp does not prevent strange things like two consecutive spaces).
Stephane Bortzmeyer writes: > Oliver Elphick <olly@lfix.co.uk> wrote > > CHECK (phone ~ '^\\+33 [0-9]( [0-9]{2}){4}$') > > That's only good for French numbers > OK, let's check: > CHECK (phone ~ '^\\+[0-9]+[ 0-9]+$') > which is, I believe, international (although the above regexp does not > prevent strange things like two consecutive spaces). Still a loosing battle and IMO a needless constraint... For example I'd normally write my work number as '+44 (0)131 557 5595' which the above regex wouldn't grok... And it's also common in several counties for people to use '.' and '-' as separators! Is the field going to be used by humans (in which case they can sort out minor formatting issues on the fly) or for automated dialing (in which case you'd probably want to strip out everything bar numerics)? Lee.
On Fri, 2002-07-19 at 15:07, Stephane Bortzmeyer wrote: > OK, let's check: > > CHECK (phone ~ '^\\+[0-9]+[ 0-9]+$') > > which is, I believe, international (although the above regexp does not > prevent strange things like two consecutive spaces). That would match "+0 ", which is probably not a valid number. You can combine several patterns as alternatives in the CHECK constraint or even put alternatives in the pattern. -- Oliver Elphick Oliver.Elphick@lfix.co.uk Isle of Wight, UK http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C ======================================== "Blessed is the man that trusteth in the LORD, and whose hope the LORD is." Jeremiah 17:7
On Fri, Jul 19, 2002 at 03:27:19PM +0100, Lee Kindness <lkindness@csl.co.uk> wrote a message of 20 lines which said: > Is the field going to be used by humans (in which case they can sort > out minor formatting issues on the fly) or for automated dialing > (in which case you'd probably want to strip out everything bar > numerics)? Mostly by humans (I don't foresee automatic use) but the idea was to catch mispellings (such as typing O "o" instead of 0 "zero"). Email addresses, on the other hand, *will* be used automatically (opt-in only, do not worry). At the present time, I use: -- Not perfectly RFC 2822-compliant :-( email TEXT NOT NULL CHECK (email ~ '^[a-zA-Z0-9\+\.\_\-]+@[a-zA-Z0-9\.\-]+$'), phone TEXT CHECK (phone ~ '^\\+[0-9]+[ 0-9]+$'), fax TEXT CHECK (fax ~ '^\\+[0-9]+[ 0-9]+$'),
Remember to allow quotes and spaces as well. This is a valid address (no, not that I actually defined it as an alias): "Josh Jore"@greentechnologist.org Joshua b. Jore ; http://www.greentechnologist.org On Fri, 19 Jul 2002, Stephane Bortzmeyer wrote: > On Fri, Jul 19, 2002 at 03:27:19PM +0100, > Lee Kindness <lkindness@csl.co.uk> wrote > a message of 20 lines which said: > > Email addresses, on the other hand, *will* be used automatically > (opt-in only, do not worry). > > At the present time, I use: > > -- Not perfectly RFC 2822-compliant :-( > email TEXT NOT NULL CHECK (email ~ '^[a-zA-Z0-9\+\.\_\-]+@[a-zA-Z0-9\.\-]+$'), > phone TEXT CHECK (phone ~ '^\\+[0-9]+[ 0-9]+$'), > fax TEXT CHECK (fax ~ '^\\+[0-9]+[ 0-9]+$'), > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
On Fri, Jul 19, 2002 at 03:40:36PM +0100, Oliver Elphick <olly@lfix.co.uk> wrote a message of 23 lines which said: > That would match "+0 ", which is probably not a valid number. With integrity constraints, you have to know when you stop :-) Not everything can be declared in SQL (I'm still searching for a way to have a trigger fire at the end of a *transaction*, in order to check cross-table integrity), so no check is perfect. I just wanted to catch the majority of the mistakes that people can do when they type phone numbers, not all of them. Many thanks to all for the advices (and to Oliver for the Debian package since this is the one I use).
Stephane Bortzmeyer wrote: > > On Fri, Jul 19, 2002 at 02:47:43PM +0100, > Oliver Elphick <olly@lfix.co.uk> wrote > a message of 25 lines which said: > > > CHECK (phone ~ '^\\+33 [0-9]( [0-9]{2}){4}$') > > > > That's only good for French numbers, since only France uses that > > grouping of digits for phone numbers. > > OK, let's check: > > CHECK (phone ~ '^\\+[0-9]+[ 0-9]+$') Many phone systems here in the US use a central phone number and then touch tone dialing for the extension you want to reach. Written it looks like +1 234 567-8901 x234 where the extension can have any number of digits (depends on the phone system, usually 3 or 4). Also since nearly all phones here have letters, it is very common to give you easy to remember phone numbers like 1-800-UCALLME which in reality is 1-800-822-5563. I don't think that restricting it one way or the other is a good idea at all. It doesn't prevent from entering the wrong number anyway, so what good is it? Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Fri, Jul 19, 2002 at 10:11:43AM -0500, Josh Jore <josh@greentechnologist.org> wrote a message of 28 lines which said: > Remember to allow quotes and spaces as well. This is a valid address (no, > not that I actually defined it as an alias): > > "Josh Jore"@greentechnologist.org Yes, but to parse such addresses (not just to allow blindly spaces and quotes in any order), regexps are not sufficient (hence my initial request for a type, with a real parser). Checking that an email address is actually RFC 2822 is far from a trivial task...
On Fri, Jul 19, 2002 at 10:11:43 -0500, Josh Jore <josh@greentechnologist.org> wrote: > Remember to allow quotes and spaces as well. This is a valid address (no, > not that I actually defined it as an alias): > > "Josh Jore"@greentechnologist.org You may not want to store rfc 821 encodings of addresses. Josh Jore@greentechnologist.org is a valid email address. It just needs to be encoded when used in an smtp transaction. Some MUAs don't have a way to inject messages with unencoded email addresses. Others (such as qmail) do. Constraints on unencoded addresses are going to be a lot simpler than those on rfc 821 encoded addresses.