Re: PATCH: CITEXT 2.0 - Mailing list pgsql-hackers

From David E. Wheeler
Subject Re: PATCH: CITEXT 2.0
Date
Msg-id BAC3302A-D88F-49C9-9E7D-596C347D9649@kineticode.com
In response to Re: PATCH: CITEXT 2.0  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: PATCH: CITEXT 2.0  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Jul 7, 2008, at 13:59, Gregory Stark wrote:

> Of course the obvious case of two equivalent strings with different
> bytes would be two strings which differ only in case in a collation
> which doesn't distinguish based on case. So you obviously can't take
> this route for citext.

Well, to be fair, citext isn't imposing a collation. It's just calling  
str_tolower() on strings before passing them on to varstr_cmp() or  
strncmp() to compare.
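
As a rough illustration of the two paths described above (a sketch in plain C, not the citext patch itself): both helpers lowercase their inputs first, then one compares with strcoll(), standing in here for the locale-aware varstr_cmp(), and the other compares raw bytes with strcmp(), standing in for strncmp(). The names lower_copy, ci_cmp_collate and ci_cmp_bytes are hypothetical, and tolower() is used in place of str_tolower(), so this is only meaningful for single-byte encodings.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd, lowercased copy of s.
 * The caller frees the result. tolower() works byte by byte, so this
 * is an assumption-laden stand-in for str_tolower(), not a replacement. */
static char *lower_copy(const char *s)
{
    assert(s != NULL);

    size_t len = strlen(s);
    char *out = malloc(len + 1);
    if (out == NULL)
        abort();                /* keep the sketch simple: no recovery path */

    for (size_t i = 0; i < len; i++)
        out[i] = (char) tolower((unsigned char) s[i]);
    out[len] = '\0';
    return out;
}

/* Lowercase both inputs, then compare using the current locale's
 * collation rules (strcoll() plays the role of varstr_cmp() here). */
static int ci_cmp_collate(const char *a, const char *b)
{
    char *la = lower_copy(a);
    char *lb = lower_copy(b);
    int result = strcoll(la, lb);

    free(la);
    free(lb);
    return result;
}

/* Lowercase both inputs, then compare byte by byte, ignoring the locale. */
static int ci_cmp_bytes(const char *a, const char *b)
{
    char *la = lower_copy(a);
    char *lb = lower_copy(b);
    int result = strcmp(la, lb);

    free(la);
    free(lb);
    return result;
}
```

In the default "C" locale the two helpers agree in sign on every input, since strcoll() then behaves like strcmp(). Under other locales their ordering, and possibly their notion of equality, can differ, which is the case under discussion in this thread.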

> I don't think you have to worry about the problem that caused
> Postgres to make this change. IIRC it was someone comparing strings
> like paths and usernames and getting false positives because they
> were in a Turkish locale which found certain sequences of characters
> to be insignificant for ordering. Someone who's using a citext data
> type has obviously decided that's precisely the kind of behaviour
> they want.

Hrm. So in your opinion, strncmp() could be used for all comparisons  
by citext, rather than varstr_cmp()?

Thanks,

David


