Re: BUG #15285: Query used index over field with ICU collation in some cases wrongly return 0 rows - Mailing list pgsql-bugs

From Jehan-Guillaume de Rorthais
Subject Re: BUG #15285: Query used index over field with ICU collation in some cases wrongly return 0 rows
Date
Msg-id 20200902145550.6a5014fb@firost
In response to Re: BUG #15285: Query used index over field with ICU collation in some cases wrongly return 0 rows  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: BUG #15285: Query used index over field with ICU collation in some cases wrongly return 0 rows
List pgsql-bugs
On Wed, 2 Sep 2020 14:06:18 +0200
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

> On 2020-06-10 00:29, Jehan-Guillaume de Rorthais wrote:
> > In the meantime, I've been working on various workarounds. The only one I
> > found is to use "fr-u-kr-latn-digit-kn" instead of "fr-u-kr-latn-digit".
> > Unfortunately, the two collations are not equivalent, but I believe it
> > might be useful in many cases.
> 
> What precisely is broken in the ICU library? 

Comparisons made with ucol_strcoll/ucol_strcollUTF8 under a custom collation that
sorts digits after latn.

> All the examples so far refer to kr-latn-digit.  Are all reorderings broken,
> or something specifically related to latn and/or digit? 

I don't know. So far, I have only found a couple of reports (mine included), all
using kr-latn-digit with different languages. And as I wrote, kr-latn-digit-kn
doesn't seem to be affected, so not all reorderings are necessarily broken.

But I have no hard facts about this, only tests.
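
For readers unfamiliar with the two locale tags, here is a small pure-Python sketch of why "kr-latn-digit" and "kr-latn-digit-kn" are not equivalent. It is only a toy model and does not call ICU: the key functions and the `_group` helper are hypothetical names, and the model assumes ASCII input, ignores case and accents, and simply places Latin letters before digits. The only point is that `kn` compares runs of digits by numeric value.

```python
from itertools import groupby

DIGITS = "0123456789"

def _group(ch):
    # Toy assumption: 0 = ASCII letter, 1 = ASCII digit, 2 = anything else.
    if ch in DIGITS:
        return 1
    if ch.isascii() and ch.isalpha():
        return 0
    return 2

def key_latn_digit(s):
    """Toy stand-in for 'kr-latn-digit': letters first, digits compared one by one."""
    return [(1, int(c)) if _group(c) == 1 else (_group(c), c.lower()) for c in s]

def key_latn_digit_kn(s):
    """Toy stand-in for 'kr-latn-digit-kn': same, but digit runs compare as numbers."""
    key = []
    for grp, run in groupby(s, key=_group):
        run = "".join(run)
        if grp == 1:
            key.append((1, int(run)))        # whole run treated as one number
        else:
            key.extend((grp, c.lower()) for c in run)
    return key

words = ["a10", "a2", "1a", "b"]
by_reorder = sorted(words, key=key_latn_digit)
by_reorder_kn = sorted(words, key=key_latn_digit_kn)
print(by_reorder)     # "a10" before "a2": digits compared one at a time
print(by_reorder_kn)  # "a2" before "a10": digit runs compared numerically
```

So any application that relies on digit-by-digit ordering would see different results after switching to the `-kn` variant.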

> Are any collation customizations other than reorderings affected?

I didn't poke around to try other random customizations. The answer lies
somewhere in the ICU codebase. I suppose we'll be able to answer this question
as soon as the bug is explained.

However, the bugs reported here are all about sorting: wrong result order and/or
wrong results because of a badly sorted index.
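
As a general illustration of how a badly sorted index can make a query return 0 rows, here is a minimal pure-Python sketch. It does not model the actual ICU bug or PostgreSQL's B-tree code: `latn_first` and `index_lookup` are hypothetical names, and the "index" is just a list searched by bisection. It only shows that a search which trusts the stored order can miss a value that is present when that order disagrees with the comparison used at lookup time.

```python
from bisect import bisect_left

def latn_first(s):
    """Toy comparison key (assumption): letters sort before ASCII digits."""
    return [(1, int(c)) if c in "0123456789" else (0, c.lower()) for c in s]

def index_lookup(index, value, key):
    """Binary search that trusts `index` to be sorted by `key`."""
    keys = [key(v) for v in index]
    pos = bisect_left(keys, key(value))
    return pos < len(index) and index[pos] == value

rows = ["a1", "1a", "b"]
good_index = sorted(rows, key=latn_first)  # order agrees with the lookup comparison
bad_index = sorted(rows)                   # built in a different order (digits first)

found_good = index_lookup(good_index, "1a", latn_first)
found_bad = index_lookup(bad_index, "1a", latn_first)
print(found_good, found_bad)
```

In the second lookup the value is stored in the list, yet the search walks past it because the entries are not in the order the comparison expects.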

Maybe Daniel has more experience to share about other customizations, as he
seems to work extensively with ICU and PostgreSQL?

Regards,


