Re: BUG #16570: Collation not working - Mailing list pgsql-bugs

From Arnaud Perrier
Subject Re: BUG #16570: Collation not working
Date
Msg-id CAPWCOokPW=kv=BEHVLFcopv7YmZ_NsPuxAtUc-UjKa7awgxLfg@mail.gmail.com
Whole thread Raw
In response to Re: BUG #16570: Collation not working  ("Daniel Verite" <daniel@manitou-mail.org>)
Responses Re: BUG #16570: Collation not working  ("Daniel Verite" <daniel@manitou-mail.org>)
List pgsql-bugs
Daniel,

Thanks for your precise explanation!

I double checked and Windows installer comes with ICU 53 while Linux Centos 8 comes with ICU 60.

On Linux Centos 8 with ICU 60, the collation works but only for the first argument of the ORDER BY clause (see below test requests 5 and 6).

Do I need to create a separate issue for that ?

1/ SELECT * FROM TBL;
1/ OK = TEXT1 + TEXT2 = digits before letters

2/ SELECT * FROM TBL ORDER BY TEXT1, TEXT2;
2/ OK = TEXT1 + TEXT2 = digits before letters

3/ SELECT * FROM TBL ORDER BY TEXT1 COLLATE digitslast;
3/ OK = TEXT1 = digits after letters + TEXT2 = digits before letters

4/ SELECT * FROM TBL ORDER BY TEXT2 COLLATE digitslast;
4/ OK = TEXT1 = digits before letters + TEXT2 = digits after letters

5/ SELECT * FROM TBL ORDER BY TEXT1 COLLATE digitslast, TEXT2 COLLATE
digitslast;
5/ KO = TEXT1 = digits after letters + TEXT2 = digits before letters

6/ SELECT * FROM TBL ORDER BY TEXT2 COLLATE digitslast, TEXT1 COLLATE
digitslast; 
6/ KO = TEXT1 = digits before letters + TEXT2 = digits after letters  

Best regards,
Arnaud


Le mar. 4 août 2020 à 16:32, Daniel Verite <daniel@manitou-mail.org> a écrit :
        Arnaud Perrier wrote:

> It does not work with locale = 'en@colReorder=latn-digit' on Windows /
> Linux also.
> I've got the same results as mentioned previously.

For ICU pre-54 versions, the parsing of collation attributes are
done by Postgres. Checking with the current source code,
it appears not to support colReorder.

See src/backend/utils/adt/pg_locale.c:

/*
 * Parse collation attributes and apply them to the open collator.  This
takes
 * a string like "und@colStrength=primary;colCaseLevel=yes" and parses and
 * applies the key-value arguments.
 *
 * Starting with ICU version 54, the attributes are processed automatically
by
 * ucol_open(), so this is only necessary for emulating this behavior on
older
 * versions.
 */
pg_attribute_unused()
static void
icu_set_collation_attributes(UCollator *collator, const char *loc)
{
[...]

      if (strcmp(name, "colstrength") == 0)
              uattr = UCOL_STRENGTH;
      else if (strcmp(name, "colbackwards") == 0)
              uattr = UCOL_FRENCH_COLLATION;
      else if (strcmp(name, "colcaselevel") == 0)
              uattr = UCOL_CASE_LEVEL;
      else if (strcmp(name, "colcasefirst") == 0)
              uattr = UCOL_CASE_FIRST;
      else if (strcmp(name, "colalternate") == 0)
              uattr = UCOL_ALTERNATE_HANDLING;
      else if (strcmp(name, "colnormalization") == 0)
              uattr = UCOL_NORMALIZATION_MODE;
      else if (strcmp(name, "colnumeric") == 0)
              uattr = UCOL_NUMERIC_COLLATION;
      else
              /* ignore if unknown */
              continue;
[...]

So with ICU-53, it seems to me that there's no way to use colReorder.


> 1/ Is-it possible to request a newer ICU version for Postgresql 13 builds
> (for all platforms) and how to do it ?

I've asked in [1] for Windows. Other platforms tend to provide ICU as
a separate package, and Postgres gets built with whatever is current
for that platform/OS version/distribution at the time of its release.
BTW Centos 8 is reasonably recent, it seems to go with ICU-60,
so if it there fails too, that should be investigated separately.

> 2/ Is-it possible to request for a "debug" mode for ICU feature to see when
> it is applied/ignored ?

Some kind of "EXPLAIN COLLATION" would be useful indeed.


[1]
https://www.postgresql.org/message-id/43f9b6b6-6fae-47bb-8f4d-8e84af354523@manitou-mail.org


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #16570: Collation not working
Next
From: Guillaume Lelarge
Date:
Subject: Re: BUG #16572: pgadmin change column order