Re: Wrong results from inner-unique joins caused by collation mismatch - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Wrong results from inner-unique joins caused by collation mismatch
Msg-id 1613472.1777042397@sss.pgh.pa.us
In response to Wrong results from inner-unique joins caused by collation mismatch  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Wrong results from inner-unique joins caused by collation mismatch
List pgsql-hackers
Richard Guo <guofenglinux@gmail.com> writes:
> My first thought was to fix this by:

> +  if (!IndexCollMatchesExprColl(ind->indexcollations[c],
> +                                exprInputCollation((Node *) rinfo->clause)))
> +      continue;

> However, this caused an unexpected plan diff in join.out where a
> left-join removal over (name, text) stopped working, because name and
> text use different collations.  So this check is too strict: a
> mismatch between two deterministic collations should be OK for
> uniqueness proof, as a deterministic collation treats two strings as
> equal iff they are byte-wise equal (see CREATE COLLATION).

Yes, we'd be taking a serious performance hit if we insisted on
exact collation matches for this purpose.  I agree that disallowing
non-matching non-deterministic collations is the right fix.
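
To make the distinction concrete, here is a toy standalone C sketch.  It is
not PostgreSQL code: eq_bytewise, eq_case_insensitive, count_matches and
toy_rows are all hypothetical names invented for illustration, and the
case-insensitive comparison is an ASCII-only stand-in for a
non-deterministic collation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for ANY deterministic collation: two strings are equal
 * iff they are byte-wise equal.
 */
bool
eq_bytewise(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/*
 * Stand-in for a non-deterministic collation (ASCII case-insensitive
 * only; real collations are far more involved).
 */
bool
eq_case_insensitive(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0')
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return false;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}

/* Count the rows that the given notion of equality matches to "key". */
int
count_matches(const char *const *rows, size_t nrows, const char *key,
              bool (*eq) (const char *, const char *))
{
    int         n = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        if (eq(rows[i], key))
            n++;
    }
    return n;
}

/* Rows that are unique under byte-wise (deterministic) equality. */
const char *const toy_rows[] = {"abc", "ABC", "xyz"};
const size_t toy_nrows = sizeof(toy_rows) / sizeof(toy_rows[0]);
```

Probing toy_rows for "abc" with eq_bytewise matches one row, so uniqueness
established under one deterministic collation carries over to any other.
The same probe with eq_case_insensitive matches two rows, which is the
situation where an inner-unique assumption would give wrong results.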

> Hence, I got attached patch.  Thoughts?

I don't love doing it like this, for two reasons:

1. I think there are other places in the planner that will need
substantially this same logic.  I recommend breaking out a
subroutine defined more or less as "do these collations have
equivalent notions of equality".

2. I find the test next to unreadable as written --- for example,
it's more difficult than it should be to figure out what happens
if one collation is deterministic and the other not.  Using a
subroutine would help here by letting you break down the test
into multiple steps.
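
A rough sketch of what such a subroutine might look like, broken into
steps, follows.  Everything here is an assumption made for illustration:
the function name collations_agree_on_equality, the toy collation table
and its OIDs are hypothetical, and a real version would presumably consult
the catalogs (for instance via something like
get_collation_isdeterministic()) rather than a static array.  Treating
InvalidOid as deterministic is also just a choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;       /* stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)

/* Hypothetical collation OIDs; these are NOT real catalog values. */
enum
{
    TOY_COLL_DET_A = 1001,      /* deterministic */
    TOY_COLL_DET_B = 1002,      /* deterministic */
    TOY_COLL_NONDET_CI = 2001,  /* non-deterministic, case-insensitive */
    TOY_COLL_NONDET_AI = 2002   /* non-deterministic, accent-insensitive */
};

/* Toy replacement for a catalog lookup of the "deterministic" flag. */
bool
toy_collation_is_deterministic(Oid collid)
{
    switch (collid)
    {
        case InvalidOid:        /* assumption: no collation behaves byte-wise */
        case TOY_COLL_DET_A:
        case TOY_COLL_DET_B:
            return true;
        default:
            /* Unknown or non-deterministic: answer conservatively. */
            return false;
    }
}

/*
 * Do these two collations have equivalent notions of equality?
 *
 * A "false" result only means equivalence could not be shown.
 */
bool
collations_agree_on_equality(Oid coll1, Oid coll2)
{
    /* Step 1: the same collation trivially agrees with itself. */
    if (coll1 == coll2)
        return true;

    /*
     * Step 2: if either side is non-deterministic, it may equate strings
     * that the other side distinguishes, so give up.  This covers both
     * the mixed case and two different non-deterministic collations.
     */
    if (!toy_collation_is_deterministic(coll1) ||
        !toy_collation_is_deterministic(coll2))
        return false;

    /* Step 3: both are deterministic, so both mean byte-wise equality. */
    return true;
}
```

Written this way, the mixed case (one deterministic collation, one not)
visibly falls out at step 2, and callers elsewhere in the planner could
share the same test.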

            regards, tom lane


