Re: BUG #16570: Collation not working - Mailing list pgsql-bugs
From | Arnaud Perrier |
---|---|
Subject | Re: BUG #16570: Collation not working |
Date | |
Msg-id | CAPWCOokPW=kv=BEHVLFcopv7YmZ_NsPuxAtUc-UjKa7awgxLfg@mail.gmail.com Whole thread Raw |
In response to | Re: BUG #16570: Collation not working ("Daniel Verite" <daniel@manitou-mail.org>) |
Responses |
Re: BUG #16570: Collation not working
|
List | pgsql-bugs |
Daniel,
Thanks for your precise explanation!
I double checked and Windows installer comes with ICU 53 while Linux Centos 8 comes with ICU 60.
On Linux Centos 8 with ICU 60, the collation works but only for the first argument of the ORDER BY clause (see below test requests 5 and 6).
Do I need to create a separate issue for that ?
1/ SELECT * FROM TBL;
1/ OK = TEXT1 + TEXT2 = digits before letters
1/ OK = TEXT1 + TEXT2 = digits before letters
2/ SELECT * FROM TBL ORDER BY TEXT1, TEXT2;
2/ OK = TEXT1 + TEXT2 = digits before letters
2/ OK = TEXT1 + TEXT2 = digits before letters
3/ SELECT * FROM TBL ORDER BY TEXT1 COLLATE digitslast;
3/ OK = TEXT1 = digits after letters + TEXT2 = digits before letters
3/ OK = TEXT1 = digits after letters + TEXT2 = digits before letters
4/ SELECT * FROM TBL ORDER BY TEXT2 COLLATE digitslast;
4/ OK = TEXT1 = digits before letters + TEXT2 = digits after letters
4/ OK = TEXT1 = digits before letters + TEXT2 = digits after letters
5/ SELECT * FROM TBL ORDER BY TEXT1 COLLATE digitslast, TEXT2 COLLATE
digitslast;
5/ KO = TEXT1 = digits after letters + TEXT2 = digits before letters
digitslast;
5/ KO = TEXT1 = digits after letters + TEXT2 = digits before letters
6/ SELECT * FROM TBL ORDER BY TEXT2 COLLATE digitslast, TEXT1 COLLATE
digitslast;
digitslast;
6/ KO = TEXT1 = digits before letters + TEXT2 = digits after letters
Best regards,
Arnaud
Le mar. 4 août 2020 à 16:32, Daniel Verite <daniel@manitou-mail.org> a écrit :
Arnaud Perrier wrote:
> It does not work with locale = 'en@colReorder=latn-digit' on Windows /
> Linux also.
> I've got the same results as mentioned previously.
For ICU pre-54 versions, the parsing of collation attributes are
done by Postgres. Checking with the current source code,
it appears not to support colReorder.
See src/backend/utils/adt/pg_locale.c:
/*
* Parse collation attributes and apply them to the open collator. This
takes
* a string like "und@colStrength=primary;colCaseLevel=yes" and parses and
* applies the key-value arguments.
*
* Starting with ICU version 54, the attributes are processed automatically
by
* ucol_open(), so this is only necessary for emulating this behavior on
older
* versions.
*/
pg_attribute_unused()
static void
icu_set_collation_attributes(UCollator *collator, const char *loc)
{
[...]
if (strcmp(name, "colstrength") == 0)
uattr = UCOL_STRENGTH;
else if (strcmp(name, "colbackwards") == 0)
uattr = UCOL_FRENCH_COLLATION;
else if (strcmp(name, "colcaselevel") == 0)
uattr = UCOL_CASE_LEVEL;
else if (strcmp(name, "colcasefirst") == 0)
uattr = UCOL_CASE_FIRST;
else if (strcmp(name, "colalternate") == 0)
uattr = UCOL_ALTERNATE_HANDLING;
else if (strcmp(name, "colnormalization") == 0)
uattr = UCOL_NORMALIZATION_MODE;
else if (strcmp(name, "colnumeric") == 0)
uattr = UCOL_NUMERIC_COLLATION;
else
/* ignore if unknown */
continue;
[...]
So with ICU-53, it seems to me that there's no way to use colReorder.
> 1/ Is-it possible to request a newer ICU version for Postgresql 13 builds
> (for all platforms) and how to do it ?
I've asked in [1] for Windows. Other platforms tend to provide ICU as
a separate package, and Postgres gets built with whatever is current
for that platform/OS version/distribution at the time of its release.
BTW Centos 8 is reasonably recent, it seems to go with ICU-60,
so if it there fails too, that should be investigated separately.
> 2/ Is-it possible to request for a "debug" mode for ICU feature to see when
> it is applied/ignored ?
Some kind of "EXPLAIN COLLATION" would be useful indeed.
[1]
https://www.postgresql.org/message-id/43f9b6b6-6fae-47bb-8f4d-8e84af354523@manitou-mail.org
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite
pgsql-bugs by date: