Re: Nondeterministic collations vs. text_pattern_ops - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: Nondeterministic collations vs. text_pattern_ops
Date
Msg-id 57e4d38e-eb2c-9a77-5a7c-b94396187cf9@2ndquadrant.com
Whole thread Raw
In response to Nondeterministic collations vs. text_pattern_ops  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Nondeterministic collations vs. text_pattern_ops
List pgsql-hackers
On 2019-09-17 01:13, Tom Lane wrote:
> Whilst poking at the leakproofness-of-texteq issue, I realized
> that there's an independent problem caused by the nondeterminism
> patch.  To wit, that the text_pattern_ops btree opclass uses
> texteq as its equality operator, even though that operator is
> no longer guaranteed to be bitwise equality.  That means that
> depending on which collation happens to get attached to the
> operator, equality might be inconsistent with the other members
> of the opclass, leading to who-knows-what bad results.

You can't create a text_pattern_ops index on a column with
nondeterministic collation:

create collation c1 (provider = icu, locale = 'und', deterministic = false);
create table t1 (a int, b text collate c1);
create index on t1 (b text_pattern_ops);
ERROR:  nondeterministic collations are not supported for operator class
"text_pattern_ops"

There is some discussion in internal_text_pattern_compare().

Are there other cases we need to consider?

I notice that there is a hash opclass text_pattern_ops, which I'd
actually never heard of until now, and I don't see documented.  What
would we need to do about that?

> The obvious fix for this is to invent separate new equality operators,
> but that's actually rather disastrous for performance, because
> text_pattern_ops indexes would no longer be able to use WHERE clauses
> using plain equality.  That also feeds into whether equality clauses
> deduced from equivalence classes will work for them (nope, not any
> more).  People using such indexes are just about certain to be
> bitterly unhappy.

Would it help if one created COLLATE "C" indexes instead of
text_pattern_ops?  What are the tradeoffs between the two approaches?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Feature request: binary NOTIFY
Next
From: Richard Guo
Date:
Subject: Re: Pulling up direct-correlated ANY_SUBLINK