Re: Case-Insensitve Text Comparison - Mailing list pgsql-hackers

From David E. Wheeler
Subject Re: Case-Insensitve Text Comparison
Date
Msg-id 172A4089-7D72-4AA3-8F3A-59BED6F92119@kineticode.com
Whole thread Raw
In response to Re: Case-Insensitve Text Comparison  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Jun 2, 2008, at 09:33, Tom Lane wrote:

>> Would the use of str_tolower() in formatting.c fix that?
>
> Yeah, you need something equivalent to that.  I think that whole area
> is due for refactoring, though --- we've got kind of a weird
> collection
> of upper/lower/initcap APIs spread through a couple of different
> files.

And I just ran into this on 8.3 when trying to install citext:
  psql:citext.sql:350: ERROR:  there is no built-in function named
"oid_text"

I'm assuming that this is because a lot of automatic casts were
removed in 8.3 or 8.2; There are a bunch of these:

CREATE FUNCTION citext(oid) RETURNS citext AS 'oid_text'  LANGUAGE
'internal' IMMUTABLE STRICT;
GRANT EXECUTE ON FUNCTION citext(oid) TO PUBLIC;
COMMENT ON FUNCTION citext(oid) IS 'convert oid to citext';
CREATE FUNCTION oid(citext) RETURNS oid AS 'text_oid'  LANGUAGE
'internal' IMMUTABLE STRICT;
GRANT EXECUTE ON FUNCTION oid(citext) TO PUBLIC;
COMMENT ON FUNCTION oid(citext) IS 'convert citext to oid';
CREATE CAST (citext AS oid) WITH FUNCTION oid(citext);
CREATE CAST (oid AS citext) WITH FUNCTION citext(oid);

CREATE FUNCTION citext(int2) RETURNS citext AS 'int2_text'  LANGUAGE
'internal' IMMUTABLE STRICT;
GRANT EXECUTE ON FUNCTION citext(int2) TO PUBLIC;
COMMENT ON FUNCTION citext(int2) IS 'convert int2 to citext';
CREATE FUNCTION int2(citext) RETURNS int2 AS 'text_int2'  LANGUAGE
'internal' IMMUTABLE STRICT;
GRANT EXECUTE ON FUNCTION int2(citext) TO PUBLIC;
COMMENT ON FUNCTION int2(citext) IS 'convert citext to int2';
CREATE CAST (citext AS int2) WITH FUNCTION int2(citext);
CREATE CAST (int2 AS citext) WITH FUNCTION citext(int2);

And on and on. Clearly this module needs updating for newer
PostgreSQLs. I tried removing them all in order to get the data type
and tried it out with this script:

my $dbh = DBI->connect('dbi:Pg:dbname=try', 'postgres', '',
{ pg_enable_utf8 => 1 });
for my $char qw( À Á Â Ã Ä Å Ç Ć Č Ĉ Ċ Ď Đ A B C D ) {    print "$char: ", $dbh->selectrow_array('SELECT
LOWER(?::citext)',  
undef, $char ), $/;
}

Naturally it didn't work:

À: Ã
Á: á
Â: â
Ã: ã
Ä: ä
Å: Ã¥
Ç: ç
Ć: ć
Č: č
Ĉ: ĉ
Ċ: ċ
Ď: ď
Đ: đ
A: a
B: b
C: c
D: d

BTW, I rebuilt my cluster with --locale en_US.UTF-8 and the script
works on a text type, so having a locale is key.

Thanks,

David

pgsql-hackers by date:

Previous
From: Chris Browne
Date:
Subject: Re: Core team statement on replication in PostgreSQL
Next
From: Tom Lane
Date:
Subject: Re: Proposal: new function array_init