On Thu, Nov 27, 2014 at 7:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> +1 ... this seems like a nice end-run around the backwards compatibility
> problem.
>
> Another issue is that (AFAIK) ICU doesn't support any non-Unicode
> encodings, which means that a build supporting *only* ICU collations is a
> nonstarter IMO. So we really need a way to deal with both system and ICU
> collations, and treating the latter as a separate subset of pg_collation
> seems like a decent way to do that. (ISTR some discussion about forcibly
> converting strings in other encodings to Unicode to compare them, but
> I sure don't want to do that. I think it'd be saner just to mark the
> ICU collations as only compatible with UTF8 database encoding.)
I would like to see ICU become the defacto standard set of collations,
with support for *versioning*, in the same way that UTF-8 might be
considered the defacto standard encoding.
It seems likely that we'll want to store sort keys (strxfrm() blobs)
in indexes at some point in the future. I now believe that that's more
problematic than just using strcoll() in B-Tree support function 1.
Although that isn't the most compelling reason to pursue ICU support.
--
Peter Geoghegan