On Wed, 2 Sep 2020 14:06:18 +0200
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 2020-06-10 00:29, Jehan-Guillaume de Rorthais wrote:
> > In the meantime, I've been working on various workarounds. The only one I
> > found is to use "fr-u-kr-latn-digit-kn" instead of "fr-u-kr-latn-digit".
> > Unfortunately, the two collations are not equivalent, but I believe it
> > might be useful in many cases.
>
> What precisely is broken in the ICU library?
Using ucol_strcoll/ucol_strcollUTF8 with a custom collation that sorts digits
after latn.
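For illustration, here is a toy Python model of the order that a "latn before
digit" reordering is expected to produce. It is not ICU and does not reproduce
the bug: it only compares one primary-like weight per character, assumes ASCII
input, and the names group, expected_key and expected_cmp are made up for this
sketch.

```python
def group(ch):
    """Rank of the reorder group: Latin letters first, digits second."""
    if ch.isalpha():
        return 0
    if ch.isdigit():
        return 1
    return 2  # anything else goes last in this toy model (assumption)


def expected_key(s):
    """Sort key: one (group, case-folded char) pair per character."""
    return [(group(ch), ch.lower()) for ch in s]


def expected_cmp(a, b):
    """Three-way comparison (-1, 0, 1) derived from the sort key."""
    ka, kb = expected_key(a), expected_key(b)
    return (ka > kb) - (ka < kb)


words = ["1a", "a1", "b", "11", "aa"]
ordered = sorted(words, key=expected_key)
print(ordered)  # ['aa', 'a1', 'b', '1a', '11']
```

Comparing the output of a real ICU collator against such an expected order is
one way to spot a misbehaving comparison in a test.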
> All the examples so far refer to kr-latn-digit. Are all reorderings broken,
> or something specifically related to latn and/or digit?
I don't know. So far, I have only found a couple of reports (mine included), all
using kr-latn-digit in different languages. And as I wrote, kr-latn-digit-kn
doesn't seem to be affected, so not all reorderings are necessarily broken.
But I have no hard facts about this, only tests.
> Are any collation customizations other than reorderings affected?
I didn't poke around with other customizations. The answer lies
somewhere in the ICU codebase. I suppose we'll be able to answer this question
as soon as the bug is explained.
However, the bugs reported here are all about sorting: wrong result order and/or
wrong results because of a badly sorted index.
Maybe Daniel has more experience with other customizations, as he
seems to work extensively with ICU and PostgreSQL?
Regards,