Re: [pgsql-packagers] Palle Girgensohn's ICU patch - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [pgsql-packagers] Palle Girgensohn's ICU patch
Date
Msg-id CAM3SWZSwzAPmjKncxpTnaxUcL0Q9KcEq2A7WD8oRVGLMWpjxHw@mail.gmail.com
In response to Re: [pgsql-packagers] Palle Girgensohn's ICU patch  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Nov 27, 2014 at 7:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> +1 ... this seems like a nice end-run around the backwards compatibility
> problem.
>
> Another issue is that (AFAIK) ICU doesn't support any non-Unicode
> encodings, which means that a build supporting *only* ICU collations is a
> nonstarter IMO.  So we really need a way to deal with both system and ICU
> collations, and treating the latter as a separate subset of pg_collation
> seems like a decent way to do that.  (ISTR some discussion about forcibly
> converting strings in other encodings to Unicode to compare them, but
> I sure don't want to do that.  I think it'd be saner just to mark the
> ICU collations as only compatible with UTF8 database encoding.)

I would like to see ICU become the de facto standard set of collations,
with support for *versioning*, in the same way that UTF-8 might be
considered the de facto standard encoding.

It seems likely that we'll want to store sort keys (strxfrm() blobs)
in indexes at some point in the future. I now believe that storing
them is more problematic than just calling strcoll() in B-Tree support
function 1, although that isn't the most compelling reason to pursue
ICU support.
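
To make the distinction concrete, here is a minimal sketch of the
property a stored sort key relies on: comparing two strxfrm() blobs
with strcmp() gives the same ordering as calling strcoll() on the
original strings. The helper names (make_sort_key, sign,
keys_agree_with_strcoll) are hypothetical and exist only for this
illustration; none of this is PostgreSQL code, and it assumes the
caller has already selected a locale with setlocale().

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build a strxfrm() sort key for s under the
 * current LC_COLLATE locale.  The caller frees the result.
 * Returns NULL if memory allocation fails.
 */
static char *make_sort_key(const char *s)
{
    assert(s != NULL);

    /* With a zero-length buffer, strxfrm() only reports the size needed. */
    size_t need = strxfrm(NULL, s, 0) + 1;   /* +1 for the terminating NUL */
    char *key = malloc(need);

    if (key != NULL)
        strxfrm(key, s, need);
    return key;
}

/* Reduce a comparison result to -1, 0, or 1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/*
 * Hypothetical check: returns 1 if strcmp() on the two sort keys orders
 * the strings the same way strcoll() does, 0 if it does not, and -1 if
 * a key could not be allocated.
 */
static int keys_agree_with_strcoll(const char *a, const char *b)
{
    char *ka = make_sort_key(a);
    char *kb = make_sort_key(b);
    int result = -1;

    if (ka != NULL && kb != NULL)
        result = (sign(strcmp(ka, kb)) == sign(strcoll(a, b)));

    free(ka);
    free(kb);
    return result;
}
```

After something like setlocale(LC_COLLATE, ""), calling
keys_agree_with_strcoll("apple", "banana") should return 1. The catch
is that a key written to disk is frozen at the time it was computed:
if a later library version produces different blobs for the same
collation, old and new keys are no longer comparable with each other,
which is where collation versioning comes in.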
-- 
Peter Geoghegan


