Re: PATCH: CITEXT 2.0 - Mailing list pgsql-hackers

From Zdenek Kotala
Subject Re: PATCH: CITEXT 2.0
Msg-id 4871C9C2.8040307@sun.com
In response to Re: PATCH: CITEXT 2.0  ("David E. Wheeler" <david@kineticode.com>)
Responses Re: PATCH: CITEXT 2.0  ("David E. Wheeler" <david@kineticode.com>)
List pgsql-hackers
David E. Wheeler wrote:
> Replying to myself, but I've made some local changes (see other 
> messages) and just wanted to follow up on some of my own comments.
> 
> On Jul 2, 2008, at 21:38, David E. Wheeler wrote:
> 
>>> 4) Operator =  citext_eq is not correct. See comment 
>>> http://doxygen.postgresql.org/varlena_8c.html#8621d064d14f259c594e4df3c1a64cac 
>>>
>>
>> So should citextcmp() call strncmp() instead of varstr_cmp()? The 
>> latter is what I saw in varlena.c.
> 
> I'm guessing that the answer is "no," since varstr_cmp() uses strncmp() 
> internally, as appropriate to the locale. Correct?

You have to use varstr_cmp() in citextcmp(). Your code is correct there, because
the <, <=, >= and > operators need a collation-sensitive function.

You only need to change the citext_eq function (the = operator) so that it uses
strncmp() or calls the texteq function.
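
The distinction can be sketched in plain C, outside PostgreSQL. This is only an
illustration, not the citext or varlena.c code: bytes_equal() and
collation_cmp() are hypothetical names, and the bytewise tie-break in
collation_cmp() is a choice made for this sketch so that ordering never
contradicts equality.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Equality: two strings are equal only if they consist of the same bytes.
 * The locale is not consulted at all. (Hypothetical helper.) */
int bytes_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    return alen == blen && memcmp(a, b, alen) == 0;
}

/* Ordering: ask the current LC_COLLATE locale via strcoll(). If the locale
 * reports the strings as tied, fall back to a bytewise comparison, so the
 * result is 0 only for identical strings. (Hypothetical helper.) */
int collation_cmp(const char *a, const char *b)
{
    int result;

    assert(a != NULL && b != NULL);
    result = strcoll(a, b);
    if (result == 0)
        result = strcmp(a, b);
    return result;
}
```

The = operator would be built on something like bytes_equal(), while <, <=, >=
and > would be built on something like collation_cmp().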

>>> There must be a difference between equality and collation. For example, 
>>> in Czech, 'láska' and 'laská' are different words, which means 
>>> that 'láska' != 'laská'. But there is no difference in their collation 
>>> order. See the Unicode Collation Algorithm for details.
>>
>> I'll leave the collation stuff to the functions I call (*far* from my 
>> specialty), but I'll add a test for this and make sure it works as 
>> expected. Um, although, with what collation should it be tested? The 
>> tests I wrote assume en_US.UTF-8.
> 
> I added this test and it passes:
> 
> SELECT isnt( 'láska'::citext, 'laská'::citext, 'Different accented 
> characters should not be equivalent' );

I think this test will always pass under en_US.UTF-8. I guess the test makes
sense only when the Czech collation (cs_CZ.UTF-8) is selected, but
unfortunately you cannot change the collation during your test :(.

I think the best solution for now is to keep the test and add a comment about
the recommended collation for it.
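
To see what a given collation actually says about the two words, a small
standalone C check can help. This is a rough sketch under assumptions: it runs
outside PostgreSQL, compare_under_locale() is a hypothetical helper, and
whether cs_CZ.UTF-8 is installed (and what strcoll() returns under it) depends
on the platform.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Compare a and b with strcoll() under the named LC_COLLATE locale, then
 * restore the previous locale. Returns 1 and stores the comparison in
 * *result on success, or 0 if the locale cannot be selected.
 * (Hypothetical helper for illustration.) */
int compare_under_locale(const char *locale, const char *a, const char *b,
                         int *result)
{
    char saved[256];
    const char *current;

    assert(locale != NULL && a != NULL && b != NULL && result != NULL);

    /* Copy the current locale name: the returned pointer may be
     * invalidated by the next setlocale() call. */
    current = setlocale(LC_COLLATE, NULL);
    if (current == NULL || strlen(current) >= sizeof(saved))
        return 0;
    strcpy(saved, current);

    if (setlocale(LC_COLLATE, locale) == NULL)
        return 0;               /* locale not available on this system */

    *result = strcoll(a, b);
    setlocale(LC_COLLATE, saved);
    return 1;
}
```

Calling compare_under_locale("cs_CZ.UTF-8", "láska", "laská", &r) and skipping
the check when it returns 0 is the same idea as commenting the test with its
recommended collation.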

    Zdenek

