case insensitive collation of Greek's sigma - Mailing list pgsql-general

From Jakub Jedelsky
Subject case insensitive collation of Greek's sigma
Msg-id CAC1JxDQhi_M4WO4e19qPmcOGpN6NKJz60zsdXMuCHwVuE3_Ldw@mail.gmail.com
Responses Re: case insensitive collation of Greek's sigma  (Achilleas Mantzios <achill@matrix.gatewaynet.com>)
Re: case insensitive collation of Greek's sigma  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: case insensitive collation of Greek's sigma  (Frank Limpert <frank_limpert@yahoo.com>)
Re: case insensitive collation of Greek's sigma  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-general
Hello,

During our tests of Postgres with ICU we found an issue with ILIKE on the uppercase and lowercase forms of the Greek letter sigma (Σ). The letter has two lowercase variants: σ and ς (the latter is used at the end of a word). I'm working with the en_US and en-US-x-icu collations, and the results are unexpected: the two collations give opposite answers.

postgres=# SELECT
postgres-# 'ΣΣ' ILIKE 'σσ' COLLATE "en_US",
postgres-# 'ΣΣ' ILIKE 'σς' COLLATE "en_US"
postgres-# ;
 ?column? | ?column?
----------+----------
 t        | f
(1 row)

postgres=# SELECT
postgres-# 'ΣΣ' ILIKE 'σσ' COLLATE "en-US-x-icu",
postgres-# 'ΣΣ' ILIKE 'σς' COLLATE "en-US-x-icu";
 ?column? | ?column?
----------+----------
 f        | t
(1 row)

I ran these commands on the latest official Docker image (14.1).

Is it possible to unify the behaviour? And which result is correct from the community's point of view?

To start, I think both results are wrong: both comparisons should return true. If I understand it correctly, a lower() function is applied in the background to both strings before they are compared, which is not enough in cases like this (unless the left-hand side is treated as a standalone word).
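
To illustrate the suspicion about lower(), here is a minimal sketch in Python. It is not PostgreSQL code and makes no claim about what either collation provider does internally. It only shows how three common normalizations treat sigma. The helper names (simple_lower, full_lower, fold, ieq) are made up for this example.

```python
def simple_lower(s: str) -> str:
    # Lowercase each character in isolation, so no word context is available
    # and Σ always becomes σ.
    return "".join(ch.lower() for ch in s)


def full_lower(s: str) -> str:
    # Lowercase the whole string at once; Python applies the Unicode
    # final-sigma rule here, so a word-final Σ becomes ς.
    return s.lower()


def fold(s: str) -> str:
    # Unicode case folding, which maps Σ, σ and ς to the same character.
    return s.casefold()


def ieq(a: str, b: str, normalize) -> bool:
    # Case-insensitive equality under the chosen normalization.
    return normalize(a) == normalize(b)


for name, norm in (("simple_lower", simple_lower),
                   ("full_lower", full_lower),
                   ("fold", fold)):
    print(name, ieq("ΣΣ", "σσ", norm), ieq("ΣΣ", "σς", norm))
```

In this sketch, per-character lowercasing matches 'σσ' but not 'σς', context-aware lowercasing does the opposite, and only case folding matches both. That is the same pattern as the two results above, which is why I suspect a lowercase-based comparison.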

Thanks,

- jj
