Re: Explicit deterministic COLLATE fails with pattern matchingoperations on column with non-deterministic collation - Mailing list pgsql-bugs

From James Lucas
Subject Re: Explicit deterministic COLLATE fails with pattern matchingoperations on column with non-deterministic collation
Date
Msg-id CAAFmbbOez_LbhD767pPH0LGdEF3=nP9CdtmU+7gYxF_f-_Ce0w@mail.gmail.com
Whole thread Raw
In response to Re: Explicit deterministic COLLATE fails with pattern matchingoperations on column with non-deterministic collation  ("Daniel Verite" <daniel@manitou-mail.org>)
Responses Re: Explicit deterministic COLLATE fails with pattern matching operations on column with non-deterministic collation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Apologies if anyone gets this twice.  I got a rejected mail notice
back the first time I sent.

You are correct.  I was playing around with collation naming sometime
back and when I started looking at this, I just used one I had left in
the database assuming it was correct.  That's my bad.

I dropped the tables and redefined the collation as
create collation mycollation (provider = icu, locale =
'en-US-u-ks-level2', deterministic = false);

Now the results are more what I expected.

select schemaname, tablename, attname, n_distinct, most_common_vals
from pg_stats where attname='t' and tablename like 'stest%' order by
tablename;
 schemaname | tablename | attname | n_distinct |     most_common_vals
------------+-----------+---------+------------+---------------------------
 public     | stest     | t       |          6 | {aaa,cCc,bBb,bbb,ccc,aAa}
 public     | stestnd   | t       |          3 | {ccc,bbb,aaa}

So that is something to be aware of - the collation defined on the
column can impact stats values, which could in turn impact plans
chosen for queries that use alternative collations.

Sorry for the distraction.  That still leaves us with the original
issue regarding LIKE and COLLATE.

Thanks,
James

On Thu, May 28, 2020 at 1:48 PM Daniel Verite <daniel@manitou-mail.org> wrote:
>
>         Tom Lane wrote:
>
> > I suspect that the 'en-US-ks-level2' ICU locale doesn't act as you
> > think it does.
>
> Indeed, because the syntax is tricky. The OP wants 'en-US-u-ks-level2'.
> With 'en-US-ks-level2', the ks-level2 component is ignored and you
> get a tertiary colstrength.
>
> Or use 'en-US@colStrength=secondary' which is possibly more
> readable and works with older versions of ICU.
>
>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite



pgsql-bugs by date:

Previous
From: "Daniel Verite"
Date:
Subject: Re: Explicit deterministic COLLATE fails with pattern matchingoperations on column with non-deterministic collation
Next
From: Tom Lane
Date:
Subject: Re: Explicit deterministic COLLATE fails with pattern matching operations on column with non-deterministic collation