Nondeterministic collations vs. text_pattern_ops - Mailing list pgsql-hackers

From Tom Lane
Subject Nondeterministic collations vs. text_pattern_ops
Date
Msg-id 22566.1568675619@sss.pgh.pa.us
List pgsql-hackers
Whilst poking at the leakproofness-of-texteq issue, I realized
that there's an independent problem caused by the nondeterminism
patch.  To wit, that the text_pattern_ops btree opclass uses
texteq as its equality operator, even though that operator is
no longer guaranteed to be bitwise equality.  That means that
depending on which collation happens to get attached to the
operator, equality might be inconsistent with the other members
of the opclass, leading to who-knows-what bad results.

bpchar_pattern_ops has the same issue with respect to bpchareq.
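
To make the inconsistency concrete, here is a minimal toy model in Python. It is not PostgreSQL code: pattern_lt and nondeterministic_eq are hypothetical stand-ins, the first for the opclass's bytewise ~<~ operator and the second for texteq under a case-insensitive nondeterministic collation (approximated here with str.casefold, which is only an assumption about how such a collation might behave). A btree opclass needs exactly one of a < b, a = b, a > b to hold for any pair of values, and the sketch checks that property.

```python
# Toy model only; none of these names exist in PostgreSQL.

def pattern_lt(a: str, b: str) -> bool:
    """Bytewise less-than, standing in for the opclass's ~<~ operator."""
    return a.encode("utf-8") < b.encode("utf-8")


def nondeterministic_eq(a: str, b: str) -> bool:
    """Case-insensitive equality, standing in for texteq when a
    nondeterministic collation is attached to the operator."""
    return a.casefold() == b.casefold()


def trichotomy_holds(a: str, b: str) -> bool:
    """True when exactly one of a < b, a = b, a > b holds."""
    outcomes = [pattern_lt(a, b), nondeterministic_eq(a, b), pattern_lt(b, a)]
    return sum(outcomes) == 1


if __name__ == "__main__":
    # Bytewise-distinct, case-distinct strings: the operators agree.
    print(trichotomy_holds("abc", "abd"))
    # Strings differing only in case: "less than" and "equal" both hold.
    print(trichotomy_holds("ABC", "abc"))
```

In this model the pair ("ABC", "abc") is both "less than" and "equal", which is the sort of disagreement between opclass members described above.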

The obvious fix for this is to invent separate new equality operators,
but that's actually rather disastrous for performance, because
text_pattern_ops indexes would no longer be able to use WHERE clauses
using plain equality.  That also feeds into whether equality clauses
deduced from equivalence classes will work for them (nope, not any
more).  People using such indexes are just about certain to be
bitterly unhappy.

We may not have any choice but to do that, though --- I sure don't
see any other easy fix.  If we could be certain that the collation
attached to the operator is deterministic, then it would still work
with a pattern_ops index, but that's not a concept that the index
infrastructure has got right now.

Whatever we do about this is likely to require a catversion bump,
meaning we've got to fix it *now*.

            regards, tom lane


