Thread: Use of !~* to keep a varchar column UNIQUE case-insensitive
Hello All, Is there a sane way to do this? I have a table with a column: "name VARCHAR(40) NOT NULL UNIQUE" Two problems with this: 1)UNIQUE is case-sensitive 2)UNIQUE is whitespace sensitive Has anyone come up with a way, or ways to trim leading and trailing whitespace, then do a case-insensitive uniqueness check efficiently and semi-elegantly? If not, I can live with it...just hoping. :) Thanks, Don
On 8/18/06 12:38 PM, "Don Morrison" <donmorrison@gmail.com> wrote: > Hello All, > > Is there a sane way to do this? > > I have a table with a column: "name VARCHAR(40) NOT NULL UNIQUE" > > Two problems with this: > > 1)UNIQUE is case-sensitive > 2)UNIQUE is whitespace sensitive > > Has anyone come up with a way, or ways to trim leading and trailing > whitespace, then do a case-insensitive uniqueness check efficiently > and semi-elegantly? You can define your index to be unique on a function of the column. As an example: create table testtable ( name varchar(40) not NULL ); create unique index my_case_insensitive_index on testtable(lower(name)); insert into testtable(name) values ('John'); INSERT 0 1 insert into testtable(name) values ('john'); ERROR: duplicate key violates unique constraint "my_case_insensitive_index" You get the idea, hopefully. Sean
"Don Morrison" <donmorrison@gmail.com> writes: > Has anyone come up with a way, or ways to trim leading and trailing > whitespace, then do a case-insensitive uniqueness check efficiently > and semi-elegantly? Use a unique functional index, eg this handles the case sensitivity issue: create unique index fooi on foo (lower(name)); Suppressing whitespace is left as an exercise for the reader... regards, tom lane
> You can define your index to be unique on a function of the column. As an > example: > > create table testtable ( > name varchar(40) not NULL > ); > create unique index my_case_insensitive_index on testtable(lower(name)); That'll work, thanks Sean!
> create unique index fooi on foo (lower(name)); > > Suppressing whitespace is left as an exercise for the reader... create unique index fooi on foo (lower(trim both '\040\f\t\r\n\a\b\e\v' from name)); The above works well enough, and I can live with it, but it is not comprehensive...it would be nice to use a regular expression....and do something like: create unique index fooi on foo (lower(substring(name FROM '^\s*(\S*)\s*$'))); but (at least in postgres 7.4.13) the use of substring doesn't work...I tried it and no index violation is generated even when inserting duplicates with no spaces to begin with. (side-note: not sure that regex would work anyways, as I'm not sure \S would exclude all non-printables, just some maybe)...
> create unique index fooi on foo (lower(trim both > '\040\f\t\r\n\a\b\e\v' from name)); Whoops, I had to shorten that to '\040\f\t\r\n\b'...I was borrowing from regular expressions again. >:)
On Fri, August 18, 2006 1:52 pm, Don Morrison said: > The above works well enough, and I can live with it, but it is not > comprehensive...it would be nice to use a regular expression....and do > something like: The other way of doing this, depending on your data integrety needs, would be to use a before trigger on insert, which could trim the space and convert to some normalized case. PL/Perl would be a good language for this. ;) Of course that has the problem of changing the data before it is inserted. It is up to you to decide if this change reperesnts a problem in your particular case. (I would think it probably would cause the database to run a little faster: The processing only would happen once, at insert time, instead of whenever the index is re-created. Probably not significant in most cases though.) Daniel T. Staal --------------------------------------------------------------- This email copyright the author. Unless otherwise noted, you are expressly allowed to retransmit, quote, or otherwise use the contents for non-commercial purposes. This copyright will expire 5 years after the author's death, or in 30 years, whichever is longer, unless such a period is in excess of local copyright law. ---------------------------------------------------------------
"Don Morrison" <donmorrison@gmail.com> writes: > create unique index fooi on foo (lower(substring(name FROM '^\s*(\S*)\s*$'))); > but (at least in postgres 7.4.13) the use of substring doesn't > work... You need to double those backslashes. regards, tom lane
On 8/18/06, Daniel T. Staal <DStaal@usa.net> wrote: > On Fri, August 18, 2006 1:52 pm, Don Morrison said: > > > The above works well enough, and I can live with it, but it is not > > comprehensive...it would be nice to use a regular expression....and do > > something like: > > The other way of doing this, depending on your data integrety needs, would > be to use a before trigger on insert, which could trim the space and > convert to some normalized case. PL/Perl would be a good language for > this. ;) Thanks for the idea Dan; I can't normalize the case in this instance unfortunately, because the term being entered may or may not contain an acronym. Thanks for the attempt though, I really appreciate the positive response I've received from the pgsql-novice list. :)
> > create unique index fooi on foo (lower(substring(name FROM '^\s*(\S*)\s*$'))); > > > but (at least in postgres 7.4.13) the use of substring doesn't > > work... > > You need to double those backslashes. Thanks very much Tom, I realized something was wrong after I posted again, but decided to refrain from posting again to reduce my "noise". :)