Re: ICU_LOCALE set database default icu collation but not working as intended. - Mailing list pgsql-hackers

From jian he
Subject Re: ICU_LOCALE set database default icu collation but not working as intended.
Date
Msg-id CACJufxExp4mYBrwzCO9P=keyKgAVyTnqxMbuw=v_=FX1sugzkg@mail.gmail.com
Whole thread Raw
In response to Re: ICU_LOCALE set database default icu collation but not working as intended.  ("Daniel Verite" <daniel@manitou-mail.org>)
Responses Re: ICU_LOCALE set database default icu collation but not working as intended.
List pgsql-hackers
Hi, here are some other trigger cases.

CREATE DATABASE dbicu3 LOCALE_PROVIDER icu LOCALE 'en_US.UTF-8'
    ICU_LOCALE 'en-u-kr-latn-digit-kf-upper-kn-true' TEMPLATE 'template0';
CREATE DATABASE dbicu4 LOCALE_PROVIDER icu LOCALE 'en_US.UTF-8'
    ICU_LOCALE 'en-u-kr-latn-digit-kn-true' TEMPLATE 'template0';
--mistake
CREATE DATABASE dbicu5 LOCALE_PROVIDER icu LOCALE 'en_US.UTF-8'
    ICU_LOCALE 'en-u-kr-latn-digit-kr-upper' TEMPLATE 'template0';
CREATE DATABASE dbicu6 LOCALE_PROVIDER icu LOCALE 'en_US.UTF-8'
    ICU_LOCALE 'en-u-kr-latn-digit-kf-upper' TEMPLATE 'template0';

--same script applies to dbicu3, dbicu4, dbicu5, dbicu6.
begin;
CREATE COLLATION upperfirst (provider = icu, locale = 'en-u-kf-upper');
CREATE TABLE icu(def text, en text COLLATE "en_US", upfirst text COLLATE upperfirst, test_kr text);
INSERT INTO icu VALUES ('a', 'a', 'a', '1 a'), ('b','b','b', 'A 11'), ('A','A','A','A 19'), ('B','B','B', '8 p');
INSERT INTO icu VALUES ('a', 'a', 'a', 'a 7'),('a', 'a', 'a', 'a 117');
INSERT INTO icu VALUES ('a', 'a', 'a', 'a 70'), ('a', 'a', 'a', 'A 70');
INSERT INTO icu VALUES ('a', 'a', 'a', 'Œ 1');
commit ;
-----------------------
localhost:5433 admin@dbicu3=# SELECT test_kr FROM icu  ORDER BY test_kr ;

 test_kr
---------
 a 7
 A 11
 A 19
 A 70
 a 70
 a 117
 Œ 1
 1 a
 8 p
(9 rows)
--------------------------------------
localhost:5433 admin@dbicu4=# SELECT test_kr FROM icu  ORDER BY test_kr ;
 test_kr
---------
 a 7
 A 11
 A 19
 a 70
 A 70
 a 117
 Œ 1
 1 a
 8 p
(9 rows)
------------------------------------------------------------------------
localhost:5433 admin@dbicu6=#  SELECT test_kr FROM icu  ORDER BY test_kr ;
 test_kr
---------
 A 11
 a 117
 A 19
 a 7
 A 70
 a 70
 Œ 1
 1 a
 8 p
(9 rows)
-----------------------------------------------------------------------------
  • dbicu3, ICU_LOCALE 'en-u-kr-latn-digit-kf-upper-kn-true'  seems 'kf-upper' not grouped strings beginning with character 'A' together? 
  • dbicu4, ICU_LOCALE 'en-u-kr-latn-digit-kn-true'  since upper/lower not explicitly mentioned, and since the collation is deterministic, so character  'A' should be grouped together first then do the numeric value comparison.
  •  dbicu6, ICU_LOCALE 'en-u-kr-latn-digit-kf-upper' , from the result, kr-latn-digit is working as intended. But kf-upper seems not working. 

Can I specify as many key-value settings options (https://www.unicode.org/reports/tr35/tr35-collation.html#table-collation-settings) as I want in ICU_LOCALE while I create a new database?


On Fri, May 27, 2022 at 1:44 AM Daniel Verite <daniel@manitou-mail.org> wrote:
        jian he wrote:

> CREATE
> DATABASE dbicu1 LOCALE_PROVIDER icu LOCALE 'en_US.UTF-8' ICU_LOCALE
> 'en-u-kf-upper' TEMPLATE 'template0';
> CREATE DATABASE dbicu2 LOCALE_PROVIDER icu LOCALE 'en_US.UTF-8' ICU_LOCALE
> 'en-u-kr-latn-digit' TEMPLATE 'template0';
> [...]
> I am not sure this is my personal misunderstanding.
> In the above examples, the first character of column *test_kr*
> is so different that the comparison is based on the first letter.
> If the first letter is the same then compute the second letter..
> So for whatever collation, I should expect 'A 19' to be adjacent with 'A
> 11'?

The query "SELECT test_kr FROM icu  ORDER BY def;"
does not order by test_kr, so the contents of test_kr have no bearing
on the order of the results.

If you order by test_kr, the results look like what you're expecting:

dbicu1=# SELECT test_kr,def FROM icu  ORDER BY test_kr;
 test_kr | def
---------+-----
 1 a     | a
 8 p     | B
 A 11    | b
 A 19    | A
 a 7     | a
 Œ 1     | a

dbicu2=# SELECT test_kr,def FROM icu  ORDER BY test_kr ;
 test_kr | def
---------+-----
 A 11    | b
 A 19    | A
 a 7     | a
 Œ 1     | a
 1 a     | a
 8 p     | B



Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite


--
 I recommend David Deutsch's <<The Beginning of Infinity>>

  Jian


pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Handle infinite recursion in logical replication setup
Next
From: Amit Kapila
Date:
Subject: Re: bogus: logical replication rows/cols combinations