Hello!
I seem to not be able to get PostgreSQL to sort rows by a string column respecting the diacritics.
I read [1] that it's possible to define a custom collation having collation strength "ks" set to "level2", which would mean that it's accent-sensitive.
However, when I try to actually sort using that collation, the order seem to be accent-insensitive.
For example:
CREATE TABLE test (string text);
INSERT INTO test VALUES ('bar'), ('bat'), ('bär');
CREATE COLLATION "und1" (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
CREATE COLLATION "und2" (provider = icu, deterministic = false, locale = 'und-u-ks-level2');
CREATE COLLATION "und3" (provider = icu, deterministic = false, locale = 'und-u-ks-level3');
SELECT * FROM test ORDER BY string collate "und1";
SELECT * FROM test ORDER BY string collate "und2";
SELECT * FROM test ORDER BY string collate "und3";
All three collations give me the same order: bar < bär < bat, although an accent-sensitive order would be bar < bat < bär
This does lose "bär", meaning that those strength levels do have some kind of an effect on "DISTINCT":
SELECT DISTINCT string COLLATE "und1" FROM test;
But it's not working on "ORDER BY".
Do I misunderstand the collation capabilities? Is there a way to actually get an accent-sensitive order?
Also, is there a way to see what options are there for the default built-in collations? I don't see, for example, the used "ks" level in the "pg_collation" table data.
Best regards,
Janis
[1]
https://www.postgresql.org/docs/current/collation.html#ICU-COLLATION-COMPARISON-LEVELS