Re: Case Insensitive Data Type - Mailing list pgsql-general

From Darren Ferguson
Subject Re: Case Insensitive Data Type
Date
Msg-id Pine.LNX.4.10.10205241500460.26467-100000@thread.crystalballinc.com
Whole thread Raw
In response to Re: Case Insensitive Data Type  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Case Insensitive Data Type
List pgsql-general
Tom

Just out of interest and i am not sure if this is possible or practical
but:

Could / Would it be very difficult to add ON SELECT triggers to Postgres?

This could help a little.

It would be as a last resort.

SELECT email FROM foo WHERE email = 'ABCDE';

TRIGGER could have a part where you could then lower the ABCDE before the
query was run.

Just a thought

Darren Ferguson

On Fri, 24 May 2002, Tom Lane wrote:

> "Russell Black" <russell.black@iarchives.com> writes:
> > I'm using email addresses as a primary key in one of my tables.  Currently,=
> >  I have to ensure that the email addresses are converted to lower case befo=
> > re they get put into the table, and that all lookups on that field are conv=
> > erted to lower case before the select statement, in order to ensure that Jo=
> > e@Somewhere.com is the same as joe@somewhere.com.
>
> A partial answer to this is
>
> CREATE UNIQUE INDEX foo_key on foo (lower(email));
>
> which will prevent duplicate entries with the same-up-to-case-folding
> address.  However you'd still have to query with queries like
>
> SELECT * FROM foo WHERE lower(email) = lower('probe value');
>
> if the probe values you are handed can't be relied on to be downcased
> already.  (I tried to think of a way to insert the lower() automatically
> using a rule, but couldn't come up with anything --- ideas anyone?)
>
>
> > Does anyone know of a case-insensitive data type?
>
> I do not think you should go away feeling that a case-insensitive data
> type would magically solve your problems.  Today you might think that
> case-folding is what you want, but by no later than next week you would
> figure out that there are other sorts of normalizations you'll also need
> to do on provided addresses --- one obvious example is stripping leading
> and trailing blanks and reducing consecutive blanks to a single blank.
> In fact if you really want to look like you know what you're doing,
> you'll have to strip out the comment portions of an address entirely.
> For example, these are all equivalent forms per RFC specs:
>     joe@blow.com
>     Joe Blow <joe@blow.com>
>     "Joe Blow" <joe@blow.com>
>     joe@blow.com (Joe Blow)
> and should be recognized as such by anything that pretends to know
> what email addresses are.  (For that matter, you do know that the
> specs say only the hostname part should be assumed case-insensitive,
> don't you?)
>
> So the real bottom line here is that you'd better figure a way to pass
> the input strings through an address-normalization function.  "lower()"
> might do as a first-order approximation but you won't stay with it
> forever.
>
> > Can I create a custom data type to do this?
>
> You probably could, but it'd likely be more work than you want.  A
> minimal implementation would require an input function (which would
> apply the normalization rules you want), an output function (for which
> you could probably just commandeer textout), and a no-op text() coercion
> function (assuming you'd like to do anything besides display the value).
> Then you'd have to create a set of index operators and access method
> entries, if you intended to make this indexable --- these could all
> piggyback on text operators, but you'd still need to do the work of
> making the catalog entries for them.
>
> Assuming you did have all that, I think a query like
>
>     select * from foo where email = 'Joe Blow <joe@blow.com>'
>
> would indeed work the way you want --- the initially untyped string
> literal would eventually get coerced to your datatype, and then your
> input conversion routine could do the right things to it.  But it'll
> likely be easier to make it happen on the client side ;-)
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>


pgsql-general by date:

Previous
From: Gregory Seidman
Date:
Subject: Re: Question on crypt password
Next
From: Doug Fields
Date:
Subject: Re: Altering existing table to be WITHOUT OIDs