Re: Explicit deterministic COLLATE fails with pattern matchingoperations on column with non-deterministic collation - Mailing list pgsql-bugs

From James Lucas
Subject Re: Explicit deterministic COLLATE fails with pattern matchingoperations on column with non-deterministic collation
Date
Msg-id CAAFmbbN6iY-g4Of4k5S7t52Cvcvk_bO=SxZOySa6F4pR01eUTw@mail.gmail.com
Whole thread Raw
In response to Re: Explicit deterministic COLLATE fails with pattern matchingoperations on column with non-deterministic collation  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: Explicit deterministic COLLATE fails with pattern matching operations on column with non-deterministic collation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Hi David,

Thanks for the response.  One possibly relevant thing I forgot to
mention.  The collation for the database is "en_US.UTF-8", which is
thus also the collation for the t column of ctest.

Per the documentation, it seems putting an implicit collation on the
operation should work.  Although the documentation is admittedly a
little vague in this respect.  I also found a mail thread in the list
where Peter Eisentraut recommended syntax exactly like this (collate
"C") to work around the inability to use pattern matching on
non-deterministic collation columns.  Unfortunately that thread
trailed out without a response if it actually worked.

Noticed something else a bit interesting.  Perhaps removing indexes
from the equation would also help:

drop index ctestnd_idx01, ctestnd_idx02, ctest_idx01, ctest_idx02;

explain select * from ctest where t like 'a%' collate "C";
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on ctest  (cost=0.00..1791.00 rows=1 width=10)
   Filter: (t ~~ 'a%'::text COLLATE "C")
COMMENT: Okay

explain select * from ctest where t like 'a%' collate mycollation;
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on ctest  (cost=0.00..1791.00 rows=1 width=10)
   Filter: (t ~~ 'a%'::text COLLATE mycollation)
COMMENT: Wait, that doesn't seem right.

select * from ctest where t like 'a%' collate mycollation;
ERROR:  nondeterministic collations are not supported for LIKE
COMMENT: So in this case, specifying an explicit non-deterministic
collation with EXPLAIN, we get a plan.  But when we actually go to
execute, it fails.

explain select * from ctestnd where t like 'a%' collate "C";
ERROR:  nondeterministic collations are not supported for LIKE
COMMENT: But in the inverse case, running explain on a column with a
non-deterministic collation, but an explicit deterministic collation,
we don't even get a plan with EXPLAIN.  That seems inconsistent.  Only
conclusion I can reach is that it's failing a check at an earlier
point in the process than in the other case.


Thanks,
James

On Wed, May 27, 2020 at 10:53 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Wed, May 27, 2020 at 8:23 AM James Lucas <jlucasdba@gmail.com> wrote:
>>
>>
>> create table ctestnd (id numeric, t text collate mycollation);
>>
>> create index ctestnd_idx02 on ctestnd (t collate "C");
>
>
>>
>> Test on ctestnd:
>> explain select * from ctestnd where t = 'aAa' collate "C";
>>                                   QUERY PLAN
>> ------------------------------------------------------------------------------
>>  Index Scan using ctestnd_idx02 on ctestnd  (cost=0.42..4.44 rows=1 width=10)
>>    Index Cond: (t = 'aAa'::text COLLATE "C")
>> COMMENT: Works as expected.
>
>
> Uses an index scan which is where the deterministic collation exists
>
>>
>>
>> explain select * from ctestnd where t like 'a%';
>> ERROR:  nondeterministic collations are not supported for LIKE
>> COMMENT: Fails as expected.
>>
>> explain select * from ctestnd where t like 'a%' collate "C";
>> ERROR:  nondeterministic collations are not supported for LIKE
>>
>
> Your schema is inherently unstable in this respect because the planner has to be allowed to choose a sequential scan
andas soon as it does it attempts to perform like comparisons with table data that is stored using a non-deterministic
collation.
>
> I don't know what kinds of promises we make about implicit collation manipulation here but absent such a
transformationthe sequential scan plan with LIKE generates an invalid plan choice.  That it doesn't go find the index
thathappens to have a workable collation for the query is unsurprising - whether that is even a possibility is beyond
me.
>
> David J.
>



pgsql-bugs by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Explicit deterministic COLLATE fails with pattern matchingoperations on column with non-deterministic collation
Next
From: PG Bug reporting form
Date:
Subject: BUG #16466: Valgrind detects an invalid read in dblink_open() with a cursor inside a transaction