ICU_LOCALE set database default icu collation but not working as intended. - Mailing list pgsql-hackers

From jian he
Subject ICU_LOCALE set database default icu collation but not working as intended.
Date
Msg-id CACJufxHH5HQbnLHcQzNwGZYyKT9Ld1CWf3C6kWEAEuBsY8wq1A@mail.gmail.com
Whole thread Raw
Responses Re: ICU_LOCALE set database default icu collation but not working as intended.
List pgsql-hackers


CREATE DATABASE dbicu1 LOCALE_PROVIDER icu LOCALE 'en_US.UTF-8' ICU_LOCALE 'en-u-kf-upper' TEMPLATE 'template0';
CREATE DATABASE dbicu2 LOCALE_PROVIDER icu LOCALE 'en_US.UTF-8' ICU_LOCALE 'en-u-kr-latn-digit' TEMPLATE 'template0';
--same script apply to dbicu1 dbicu2
BEGIN;
CREATE COLLATION upperfirst (
    provider = icu,
    locale = 'en-u-kf-upper'
);
CREATE TABLE icu (
    def text,
    en text COLLATE "en_US",
    upfirst text COLLATE upperfirst,
    test_kr text
);
INSERT INTO icu
    VALUES ('a', 'a', 'a', '1 a'), ('b', 'b', 'b', 'A 11'), ('A', 'A', 'A', 'A 19'), ('B', 'B', 'B', '8 p');
INSERT INTO icu
    VALUES ('a', 'a', 'a', 'a 7');
INSERT INTO icu
    VALUES ('a', 'a', 'a', 'Œ 1');
COMMIT;
-----------------------
--dbicu1
SELECT def AS def FROM icu ORDER BY def; --since only character. all works fine.
SELECT test_kr FROM icu  ORDER BY def;
/*
  test_kr
---------
 A 19
 1 a
 a 7
 Œ 1
 8 p
 A 11
 */


--dbicu2
SELECT def AS def FROM icu ORDER BY def; --since only character. all works fine.
SELECT test_kr FROM icu  ORDER BY def;
/*
 test_kr
---------
 1 a
 a 7
 Œ 1
 A 19
 A 11
 8 p
(6 rows)
*/

Since dbicu1 and dbicu2 set the default collation then
In dbicu1, I should expect the ordering:
 number >> Upper case alphabet letter  >> lower case alphabet letter>> character Œ (U+0153)

In dbicu2, I should expect the ordering:
 lower case  alphabet letter >> Upper case alphabet letter >> number >> character Œ (U+0153)

As you can see one letter works fine for dbicu1, dbicu2. However, it does not work on more characters.
Or The result is correct, but something I misunderstood?

I am not sure this is my personal misunderstanding.
In the above examples, the first character of column test_kr
is so different that the comparison is based on the first letter.
If the first letter is the same then compute the second letter..
So for whatever collation, I should expect 'A 19' to be adjacent with 'A 11'?



--
 I recommend David Deutsch's <<The Beginning of Infinity>>

  Jian


pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: pg_upgrade test writes to source directory
Next
From: Justin Pryzby
Date:
Subject: Re: Remove support for Visual Studio 2013