Home > mailing lists

Re: A problem about partitionwise join - Mailing list pgsql-hackers

From	Richard Guo
Subject	Re: A problem about partitionwise join
Date	May 3, 2024 11:47:45
Msg-id	CAMbWs49fL1ZscsR-6=8bFAfLjt9=TMmpMv1ZHTUrHASAgxU7qQ@mail.gmail.com Whole thread
In response to	Re: A problem about partitionwise join (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: A problem about partitionwise join
List	pgsql-hackers

Tree view

On Wed, May 1, 2024 at 1:31 AM Robert Haas <robertmhaas@gmail.com> wrote:

I think it's slightly questionable whether this patch is worthwhile.
The case memorialized in the regression tests, t1.a = t2.a AND t1.a =
t2.b, is a very weird thing to do. The case mentioned in the original
email, foo.k1 = bar.k1 and foo.k2 = bar.k2 and foo.k2 = 16, seems like
something that could realistically happen, especially when there are
views in use (e.g. the view joins foo and bar, and then someone
queries the view for one of the join columns). In such a case, it's
possible that foo.k2 = 16 is selective enough that we really don't
care about partition-wise join any more, but it's also possible that
it's not too selective and we do care about partition-wise join. So I
don't think that the case that the patch fixes is something that can
ever happen, but I do think it's probably fairly rare that brings any
benefit, which is why I thought that EC-based matching was an OK
approach to this problem initially. Perhaps that was the wrong idea,
though.

Thank you for taking the time to review this patch! I think Ashutosh
also mentioned that the new added test case looks artificial. I must
admit that I'm not too sure how common we encounter queries with
partially-redundant join clauses in real-life scenarios. It is possible
that such cases are quite rare, and this patch will then not be of much
use.

I initially brought up this issue because I noticed an inconsistency
regarding the generation of a partition-wise join: with 't1.k = t2.k and
t1.k = t2.val' we are able to generate a partition-wise join, while its
equivalent form 't1.k = t2.val and t1.k = t2.k' does not result in a
partition-wise join. I think this inconsistency could be confusing.

The reason behind this is that with 't1.k = t2.val and t1.k = t2.k' it
happens to constrain other members (t1.k and t2.val) of the EC than the
ones we are looking for (t1.k and t2.k). Our current code looks through
the join's restriction clauses for matched keys. In addition to that,
this patch checks to see if any unmatched keys are known equal by ECs,
leveraging function exprs_known_equal().

Does the additional logic added by this patch have a noticeable
performance cost?

I think one concern regarding performance cost is that the function
exprs_known_equal() would be called O(N^2) times, where N is the number
of partition key expressions. But I think this might not be a problem.
The number of a joinrel's partition key expressions would only be equal
to the join degree, since each base relation within the join contributes
only one partition key expression, according to
set_joinrel_partition_key_exprs(). This number would not scale with the
number of partitions. But I have not measured the performance in
practice by running benchmarks. Maybe I'm just wrong.

Thanks
Richard

pgsql-hackers by date:

From: Ashutosh Bapat
Date: 03 May 2024, 09:32:52
Subject: Re: tablecmds.c/MergeAttributes() cleanup

From: Peter Eisentraut
Date: 03 May 2024, 11:48:20
Subject: Re: wrong comment in libpq.h

Re: A problem about partitionwise join - Mailing list pgsql-hackers

Previous

Next