Thread: Use of !~* to keep a varchar column UNIQUE case-insensitive

Use of !~* to keep a varchar column UNIQUE case-insensitive

From
"Don Morrison"
Date:
Hello All,

Is there a sane way to do this?

I have a table with a column: "name VARCHAR(40) NOT NULL UNIQUE"

Two problems with this:

1)UNIQUE is case-sensitive
2)UNIQUE is whitespace sensitive

Has anyone come up with a way, or ways to trim leading and trailing
whitespace, then do a case-insensitive uniqueness check efficiently
and semi-elegantly?

If not, I can live with it...just hoping. :)

Thanks,
Don

Re: Use of !~* to keep a varchar column UNIQUE

From
Sean Davis
Date:


On 8/18/06 12:38 PM, "Don Morrison" <donmorrison@gmail.com> wrote:

> Hello All,
>
> Is there a sane way to do this?
>
> I have a table with a column: "name VARCHAR(40) NOT NULL UNIQUE"
>
> Two problems with this:
>
> 1)UNIQUE is case-sensitive
> 2)UNIQUE is whitespace sensitive
>
> Has anyone come up with a way, or ways to trim leading and trailing
> whitespace, then do a case-insensitive uniqueness check efficiently
> and semi-elegantly?

You can define your index to be unique on a function of the column.  As an
example:

create table testtable (
    name varchar(40) not NULL
);
create unique index my_case_insensitive_index on testtable(lower(name));

insert into testtable(name) values ('John');
INSERT 0 1
insert into testtable(name) values ('john');
ERROR:  duplicate key violates unique constraint "my_case_insensitive_index"

You get the idea, hopefully.

Sean


Re: Use of !~* to keep a varchar column UNIQUE case-insensitive

From
Tom Lane
Date:
"Don Morrison" <donmorrison@gmail.com> writes:
> Has anyone come up with a way, or ways to trim leading and trailing
> whitespace, then do a case-insensitive uniqueness check efficiently
> and semi-elegantly?

Use a unique functional index, eg this handles the case sensitivity
issue:

    create unique index fooi on foo (lower(name));

Suppressing whitespace is left as an exercise for the reader...

            regards, tom lane

Re: Use of !~* to keep a varchar column UNIQUE case-insensitive

From
"Don Morrison"
Date:
> You can define your index to be unique on a function of the column.  As an
> example:
>
> create table testtable (
>    name varchar(40) not NULL
> );
> create unique index my_case_insensitive_index on testtable(lower(name));

That'll work, thanks Sean!

Re: Use of !~* to keep a varchar column UNIQUE case-insensitive

From
"Don Morrison"
Date:
>        create unique index fooi on foo (lower(name));
>
> Suppressing whitespace is left as an exercise for the reader...

create unique index fooi on foo (lower(trim both
'\040\f\t\r\n\a\b\e\v' from name));

The above works well enough, and I can live with it, but it is not
comprehensive...it would be nice to use a regular expression....and do
something like:

create unique index fooi on foo (lower(substring(name FROM '^\s*(\S*)\s*$')));

but (at least in postgres 7.4.13) the use of substring doesn't
work...I tried it and no index violation is generated even when
inserting duplicates with no spaces to begin with.  (side-note: not
sure that regex would work anyways, as I'm not sure \S would exclude
all non-printables, just some maybe)...

Re: Use of !~* to keep a varchar column UNIQUE case-insensitive

From
"Don Morrison"
Date:
> create unique index fooi on foo (lower(trim both
> '\040\f\t\r\n\a\b\e\v' from name));

Whoops, I had to shorten that to '\040\f\t\r\n\b'...I was borrowing
from regular expressions again. >:)

Re: Use of !~* to keep a varchar column UNIQUE

From
"Daniel T. Staal"
Date:
On Fri, August 18, 2006 1:52 pm, Don Morrison said:

> The above works well enough, and I can live with it, but it is not
> comprehensive...it would be nice to use a regular expression....and do
> something like:

The other way of doing this, depending on your data integrety needs, would
be to use a before trigger on insert, which could trim the space and
convert to some normalized case.  PL/Perl would be a good language for
this.  ;)

Of course that has the problem of changing the data before it is inserted.
 It is up to you to decide if this change reperesnts a problem in your
particular case.  (I would think it probably would cause the database to
run a little faster: The processing only would happen once, at insert
time, instead of whenever the index is re-created.  Probably not
significant in most cases though.)

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------


Re: Use of !~* to keep a varchar column UNIQUE case-insensitive

From
Tom Lane
Date:
"Don Morrison" <donmorrison@gmail.com> writes:
> create unique index fooi on foo (lower(substring(name FROM '^\s*(\S*)\s*$')));

> but (at least in postgres 7.4.13) the use of substring doesn't
> work...

You need to double those backslashes.

            regards, tom lane

Re: Use of !~* to keep a varchar column UNIQUE

From
"Don Morrison"
Date:
On 8/18/06, Daniel T. Staal <DStaal@usa.net> wrote:
> On Fri, August 18, 2006 1:52 pm, Don Morrison said:
>
> > The above works well enough, and I can live with it, but it is not
> > comprehensive...it would be nice to use a regular expression....and do
> > something like:
>
> The other way of doing this, depending on your data integrety needs, would
> be to use a before trigger on insert, which could trim the space and
> convert to some normalized case.  PL/Perl would be a good language for
> this.  ;)

Thanks for the idea Dan; I can't normalize the case in this instance
unfortunately, because the term being entered may or may not contain
an acronym.  Thanks for the attempt though, I really appreciate the
positive response I've received from the pgsql-novice list. :)

Re: Use of !~* to keep a varchar column UNIQUE case-insensitive

From
"Don Morrison"
Date:
> > create unique index fooi on foo (lower(substring(name FROM '^\s*(\S*)\s*$')));
>
> > but (at least in postgres 7.4.13) the use of substring doesn't
> > work...
>
> You need to double those backslashes.

Thanks very much Tom, I realized something was wrong after I posted
again, but decided to refrain from posting again to reduce my "noise".
:)