Re: Camel case identifiers and folding - Mailing list pgsql-general

From Andrew Gierth
Subject Re: Camel case identifiers and folding
Date
Msg-id 87bm2a6233.fsf@news-spur.riddles.org.uk
In response to Re: Camel case identifiers and folding  (Morris de Oryx <morrisdeoryx@gmail.com>)
List pgsql-general
>>>>> "Morris" == Morris de Oryx <morrisdeoryx@gmail.com> writes:

 Morris> UUIDs as a type are an interesting case in Postgres. They're
 Morris> stored as a large numeric for efficiency (good!), but are
 Morris> presented by default in the 36-byte format with the dashes.
 Morris> However, you can also search using the dashless 32-character
 Morris> format....and it all works. Case-insensitively.

That works because UUIDs have a convenient canonical form (the raw
bytes) which all input is converted to before comparison.
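
To make the canonical-form point concrete, here is a rough sketch using
Python's standard uuid module as a stand-in for the server's uuid input
routine (an analogy only, not PostgreSQL's actual code; the sample value
and variable names are made up for illustration). Every accepted spelling
parses down to the same 16 raw bytes, and comparison happens on those:

```python
import uuid

# Three spellings of the same (arbitrary, illustrative) UUID:
# canonical lowercase with dashes, dashless uppercase, and braced.
spellings = [
    "a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11",
    "A0EEBC999C0B4EF8BB6D6BB9BD380A11",
    "{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}",
]

# Parsing discards case, dashes and braces; what remains is 16 raw bytes.
parsed = [uuid.UUID(s) for s in spellings]
canonical_bytes = {u.bytes for u in parsed}

# One distinct byte string means all spellings compare equal.
all_equal = len(canonical_bytes) == 1
print(all_equal, len(parsed[0].bytes))
```

The text form is only ever an input/output convention; no comparison is
done on it.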

Text is ... not like this.

Even citext is really only a hack - it assumes that comparisons can be
done by conversion to lowercase, which may work well enough for English
but I'm pretty sure it does not correctly handle the edge cases in, for
example, German (consider 'SS', 'ss', 'ß') or Greek (final sigma). Doing
it better would require proper application of case-folding rules, and
even that would require handling of edge cases (the Unicode case folding
algorithm is designed to be language-independent, which means that it
breaks for Turkish without special-case exceptions).
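
A small sketch of those edge cases, using Python's str.lower() as a
stand-in for "compare after lowercasing" and str.casefold() for Unicode
full case folding. This is only an illustration of the Unicode behaviour;
citext's results depend on the database locale and may differ, and the
helper names lower_eq and fold_eq are invented here:

```python
def lower_eq(a: str, b: str) -> bool:
    """Compare after simple lowercasing (the 'convert to lowercase' approach)."""
    return a.lower() == b.lower()


def fold_eq(a: str, b: str) -> bool:
    """Compare after Unicode full case folding (language-independent)."""
    return a.casefold() == b.casefold()


# German: lowercasing treats 'SS' and 'ss' as equal but not 'ß';
# case folding maps 'ß' to 'ss', so all three match.
german_lower = (lower_eq("SS", "ss"), lower_eq("ß", "ss"))
german_fold = fold_eq("ß", "SS")

# Greek: medial sigma and final sigma are both already lowercase, so
# lowercasing keeps them distinct; case folding maps both to medial sigma.
greek_lower = lower_eq("σ", "ς")
greek_fold = fold_eq("σ", "ς")

# Turkish: dotless 'ı' uppercases to 'I', but language-independent folding
# sends 'I' to dotted 'i', so the pair does not match without the
# Turkish-specific exception.
turkish_upper = "ı".upper() == "I"
turkish_fold = fold_eq("I", "ı")

print(german_lower, german_fold, greek_lower, greek_fold,
      turkish_upper, turkish_fold)
```

So folding fixes the German and Greek cases that plain lowercasing
misses, and still gets Turkish wrong unless the caller knows the language.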

--
Andrew (irc:RhodiumToad)

