Re: Case Insensitive Data Type - Mailing list pgsql-general
From | Darren Ferguson |
---|---|
Subject | Re: Case Insensitive Data Type |
Date | |
Msg-id | Pine.LNX.4.10.10205241500460.26467-100000@thread.crystalballinc.com Whole thread Raw |
In response to | Re: Case Insensitive Data Type (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Case Insensitive Data Type
|
List | pgsql-general |
Tom Just out of interest and i am not sure if this is possible or practical but: Could / Would it be very difficult to add ON SELECT triggers to Postgres? This could help a little. It would be as a last resort. SELECT email FROM foo WHERE email = 'ABCDE'; TRIGGER could have a part where you could then lower the ABCDE before the query was run. Just a thought Darren Ferguson On Fri, 24 May 2002, Tom Lane wrote: > "Russell Black" <russell.black@iarchives.com> writes: > > I'm using email addresses as a primary key in one of my tables. Currently,= > > I have to ensure that the email addresses are converted to lower case befo= > > re they get put into the table, and that all lookups on that field are conv= > > erted to lower case before the select statement, in order to ensure that Jo= > > e@Somewhere.com is the same as joe@somewhere.com. > > A partial answer to this is > > CREATE UNIQUE INDEX foo_key on foo (lower(email)); > > which will prevent duplicate entries with the same-up-to-case-folding > address. However you'd still have to query with queries like > > SELECT * FROM foo WHERE lower(email) = lower('probe value'); > > if the probe values you are handed can't be relied on to be downcased > already. (I tried to think of a way to insert the lower() automatically > using a rule, but couldn't come up with anything --- ideas anyone?) > > > > Does anyone know of a case-insensitive data type? > > I do not think you should go away feeling that a case-insensitive data > type would magically solve your problems. Today you might think that > case-folding is what you want, but by no later than next week you would > figure out that there are other sorts of normalizations you'll also need > to do on provided addresses --- one obvious example is stripping leading > and trailing blanks and reducing consecutive blanks to a single blank. > In fact if you really want to look like you know what you're doing, > you'll have to strip out the comment portions of an address entirely. > For example, these are all equivalent forms per RFC specs: > joe@blow.com > Joe Blow <joe@blow.com> > "Joe Blow" <joe@blow.com> > joe@blow.com (Joe Blow) > and should be recognized as such by anything that pretends to know > what email addresses are. (For that matter, you do know that the > specs say only the hostname part should be assumed case-insensitive, > don't you?) > > So the real bottom line here is that you'd better figure a way to pass > the input strings through an address-normalization function. "lower()" > might do as a first-order approximation but you won't stay with it > forever. > > > Can I create a custom data type to do this? > > You probably could, but it'd likely be more work than you want. A > minimal implementation would require an input function (which would > apply the normalization rules you want), an output function (for which > you could probably just commandeer textout), and a no-op text() coercion > function (assuming you'd like to do anything besides display the value). > Then you'd have to create a set of index operators and access method > entries, if you intended to make this indexable --- these could all > piggyback on text operators, but you'd still need to do the work of > making the catalog entries for them. > > Assuming you did have all that, I think a query like > > select * from foo where email = 'Joe Blow <joe@blow.com>' > > would indeed work the way you want --- the initially untyped string > literal would eventually get coerced to your datatype, and then your > input conversion routine could do the right things to it. But it'll > likely be easier to make it happen on the client side ;-) > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
pgsql-general by date: