On 2019-09-17 01:13, Tom Lane wrote:
> Whilst poking at the leakproofness-of-texteq issue, I realized
> that there's an independent problem caused by the nondeterminism
> patch. To wit, that the text_pattern_ops btree opclass uses
> texteq as its equality operator, even though that operator is
> no longer guaranteed to be bitwise equality. That means that
> depending on which collation happens to get attached to the
> operator, equality might be inconsistent with the other members
> of the opclass, leading to who-knows-what bad results.
You can't create a text_pattern_ops index on a column with
nondeterministic collation:
create collation c1 (provider = icu, locale = 'und', deterministic = false);
create table t1 (a int, b text collate c1);
create index on t1 (b text_pattern_ops);
ERROR: nondeterministic collations are not supported for operator class
"text_pattern_ops"
There is some discussion in internal_text_pattern_compare().
Are there other cases we need to consider?
I notice that there is a hash opclass text_pattern_ops, which I'd
actually never heard of until now, and I don't see documented. What
would we need to do about that?
> The obvious fix for this is to invent separate new equality operators,
> but that's actually rather disastrous for performance, because
> text_pattern_ops indexes would no longer be able to use WHERE clauses
> using plain equality. That also feeds into whether equality clauses
> deduced from equivalence classes will work for them (nope, not any
> more). People using such indexes are just about certain to be
> bitterly unhappy.
Would it help if one created COLLATE "C" indexes instead of
text_pattern_ops? What are the tradeoffs between the two approaches?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services