Re: PATCH: CITEXT 2.0 - Mailing list pgsql-hackers

From Gregory Stark
Subject Re: PATCH: CITEXT 2.0
Msg-id 87mykt9zrv.fsf@oxford.xeocode.com
In response to Re: PATCH: CITEXT 2.0  ("David E. Wheeler" <david@kineticode.com>)
Responses Re: PATCH: CITEXT 2.0  ("David E. Wheeler" <david@kineticode.com>)
List pgsql-hackers
"David E. Wheeler" <david@kineticode.com> writes:

> On Jul 7, 2008, at 12:21, David E. Wheeler wrote:
>
>> My question is: why? Shouldn't they all use the same function for
>> comparison? I'm happy to dupe this implementation for citext, but I don't
>> understand it. Should not all comparisons be executed consistently?
>
> Let me try to answer my own question by citing this comment:
>
>     /*
>      * Since we only care about equality or not-equality, we can avoid all
>      * the expense of strcoll() here, and just do bitwise comparison.
>      */
>
> So, the upshot is that the = and <> operators are not locale-aware, yes? They
> just do byte comparisons. Is that really the way it should be? I mean, could
> there not be strings that are equivalent but have different bytes?

There could be strings for which strcoll returns 0 even though they're not
identical. However, that caused problems in Postgres, so we decided that only
byte-identical strings should actually compare equal. So if strcoll returns 0,
we then do a bytewise comparison to impose an arbitrary ordering.
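
As a rough illustration of the tie-break described above (a minimal sketch,
not the actual Postgres source; the function names text_cmp_sketch and
text_eq_sketch are hypothetical, and NUL-terminated C strings are assumed
rather than Postgres text values):

```c
#include <string.h>

/*
 * Hypothetical helper, not the real PostgreSQL code: order two
 * NUL-terminated strings by the current locale, breaking ties bytewise.
 */
static int
text_cmp_sketch(const char *a, const char *b)
{
    int result = strcoll(a, b);

    /*
     * strcoll() may return 0 for strings that are not byte-identical.
     * Fall back to strcmp() so that only identical strings compare equal;
     * the resulting order among such strings is arbitrary but stable.
     */
    if (result == 0)
        result = strcmp(a, b);

    return result;
}

/*
 * Hypothetical helper: since "equal" now means "byte-identical", an
 * equality test can skip strcoll() entirely and compare bytes.
 */
static int
text_eq_sketch(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

With this arrangement the equality shortcut agrees with the ordering
function: text_cmp_sketch returns 0 exactly when text_eq_sketch returns true.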

Of course, the obvious case of two equivalent strings with different bytes
would be two strings that differ only in case, in a collation that doesn't
distinguish based on case. So you obviously can't take this route for citext.
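
To make the point concrete, here is a minimal sketch of why a bytewise
shortcut fails for a case-insensitive type. It is not citext's actual
implementation: the names fold_lower_sketch and citext_eq_sketch are
hypothetical, and it assumes a single-byte encoding where tolower() on each
byte is a meaningful case fold.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated lower-cased copy of s,
 * or NULL if allocation fails. Assumes a single-byte encoding.
 */
static char *
fold_lower_sketch(const char *s)
{
    size_t len = strlen(s);
    char  *out = malloc(len + 1);

    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++)
        out[i] = (char) tolower((unsigned char) s[i]);
    out[len] = '\0';

    return out;
}

/*
 * Hypothetical helper: case-insensitive equality. The inputs must be
 * folded before the bytes are compared; comparing the raw bytes would
 * treat "Hello" and "hELLO" as different. Returns 0 on allocation failure.
 */
static int
citext_eq_sketch(const char *a, const char *b)
{
    char *la = fold_lower_sketch(a);
    char *lb = fold_lower_sketch(b);
    int   equal = (la != NULL && lb != NULL && strcmp(la, lb) == 0);

    free(la);
    free(lb);
    return equal;
}
```

Here strcmp("Hello", "hELLO") is nonzero, while citext_eq_sketch reports the
two strings as equal, which is the behaviour a citext user is asking for.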

I don't think you have to worry about the problem that caused Postgres to make
this change. IIRC it was someone comparing strings like paths and usernames
and getting false positives because they were in a Turkish locale which found
certain sequences of characters to be insignificant for ordering. Someone
who's using a citext data type has obviously decided that's precisely the kind
of behaviour they want.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!

