Re: PATCH: CITEXT 2.0 - Mailing list pgsql-hackers

From	David E. Wheeler
Subject	Re: PATCH: CITEXT 2.0
Date	July 6, 2008 00:47:08
Msg-id	DD2B2B80-66FD-46B7-9D5B-0AF94C264E55@kineticode.com
In response to	Re: PATCH: CITEXT 2.0 (Gregory Stark <stark@enterprisedb.com>)
List	pgsql-hackers

On Jul 5, 2008, at 02:58, Gregory Stark wrote:

>>    txt = cilower( PG_GETARG_TEXT_PP(0) );
>>    str = VARDATA_ANY(txt);
>>
>>    result = hash_any((unsigned char *) str, VARSIZE_ANY_EXHDR(txt));
>
> I thought your data type implemented a locale dependent collation, not just
> a case insensitive collation. That is, does this hash agree with your
> citext_eq on strings like "foo bar" <=> "foobar" and "fooß" <=> "fooss" ?

CITEXT is basically intended to replace all those queries that do
`WHERE LOWER(col) = LOWER(?)` by doing it internally. That's it. It's
locale-aware to the same extent that `LOWER()` is (and that citext 1.0
is not, since it only compares ASCII characters case-insensitively).
And I expect that it does, in fact, agree with your examples, in that
all the current tests for = and <> pass:

try=# select 'foo bar' = 'foobar';
 ?column?
----------
 f

try=# SELECT 'fooß' = 'fooss';
 ?column?
----------
 f
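
To make the point about the hash agreeing with equality concrete, here is a minimal standalone sketch. It is not the patch's code: fold_lower() is a hypothetical stand-in for cilower(), fnv1a() is a simple FNV-1a hash swapped in for hash_any(), and ci_eq()/ci_hash() are illustrative names. It also assumes plain byte-wise tolower(), so unlike the real lowercasing routines it does nothing useful with multibyte characters. The idea it shows is only that when equality and hashing both go through the same lowercasing step, two values that compare equal must hash the same.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for cilower(): returns a malloc'd lowercased copy.
 * Assumption: byte-wise tolower(), so multibyte characters pass through
 * unchanged. The caller frees the result. */
char *fold_lower(const char *s)
{
    size_t len, i;
    char *out;

    assert(s != NULL);
    len = strlen(s);
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    for (i = 0; i < len; i++)
        out[i] = (char) tolower((unsigned char) s[i]);
    out[len] = '\0';
    return out;
}

/* 32-bit FNV-1a, used here only as a simple stand-in for hash_any(). */
uint32_t fnv1a(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    size_t i;

    for (i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Equality: compare the lowercased forms byte for byte. */
int ci_eq(const char *a, const char *b)
{
    char *la = fold_lower(a);
    char *lb = fold_lower(b);
    int eq = (la != NULL && lb != NULL && strcmp(la, lb) == 0);

    free(la);
    free(lb);
    return eq;
}

/* Hash: hash the same lowercased form that ci_eq() compares. */
uint32_t ci_hash(const char *s)
{
    char *ls = fold_lower(s);
    uint32_t h = (ls != NULL) ? fnv1a((const unsigned char *) ls, strlen(ls)) : 0;

    free(ls);
    return h;
}
```

With these definitions, ci_eq("Foo Bar", "FOO BAR") is true and both strings produce the same ci_hash() value, while "foo bar" and "foobar" stay unequal because lowercasing does not remove the space. The reverse is not guaranteed: unequal strings may still collide in the hash.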

> You may have to use strxfrm

In the patch against CVS HEAD, it uses str_tolower() in formatting.c.

Best,

David
