Re: apply_scanjoin_target_to_paths and partitionwise join - Mailing list pgsql-hackers

From Richard Guo
Subject Re: apply_scanjoin_target_to_paths and partitionwise join
Date
Msg-id CAMbWs4_o2B+Yqf0EU-PJrWbiXf_d-XrB1Gu7wMa89eTitCCExw@mail.gmail.com
Whole thread Raw
In response to Re: apply_scanjoin_target_to_paths and partitionwise join  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Oct 29, 2025 at 5:17 AM Robert Haas <robertmhaas@gmail.com> wrote:
> What we see here is that, in the case of Q1, the fix reduces the cost
> by a large amount, which is the kind of thing you'd hope would happen
> when you fix a costing a bug, although I haven't quite figured out why
> we get such a large benefit. Likewise, in the second case, the cost
> goes down with the fix, although not by a lot. That case is
> interesting because the plan selected with the patch is a merge join
> of appends of index scans, which of every possible plan shape is
> probably the one that benefits least from being performed
> partitionwise. If the merge join involved a sort, you'd expect the
> partitionwise approach to win, since several smaller sorts figure to
> cost less in total than one big sort; but here it doesn't, so there's
> little room for the partitionwise nature of the operation to provide a
> benefit, and apparently the planner thinks that, in fact, it doesn't.
> But the Q3 change is really the most disturbing part of this -- the
> cost actually goes up with the fix. I haven't figured out whether
> that's due to some kind of round-off error or whether it's evidence
> that the patch doesn't properly fix the bug. I wonder whether
> Richard's rewrite of unique-ification requires some adjustment to the
> patch.

I don't think the rewrite of unique-ification requires any adjustment
to this patch.  I ran Q1 on v18, which does not include the
unique-ification changes, and here is what I observed: without
Ashutosh's patch, it performs a full partitionwise join; with the
patch, it performs one join partitionwise and the other
non-partitionwise.  The costs of the unpatched versus patched versions
on v18 are 2286.11 and 1420.40, respectively, indicating that
Ashutosh's patch reduces the cost by a large amount.  This matches
your observation exactly.  I think this suggests that we can rule out
the interference from the unique-ification changes.

The comment explaining why apply_scanjoin_target_to_paths() throws
away all existing paths claims that:

 * If the rel is partitioned, we want to drop its existing paths and
 * generate new ones.  This function would still be correct if we kept the
 * existing paths: we'd modify them to generate the correct target above
 * the partitioning Append, and then they'd compete on cost with paths
 * generating the target below the Append.  However, in our current cost
 * model the latter way is always the same or cheaper cost, so modifying
 * the existing paths would just be useless work.

However, that reasoning is valid only when all of the existing paths
are Appends of Scans or Joins.  It does not hold for a partitioned
join relation, which can have paths that are Joins of Appends.
Therefore, I think there's something wrong with the current logic, and
we may need to do something about it.

IIUC, Ashutosh's patch avoids discarding existing paths for
partitioned join relations, so that we can retain non-partitionwise
paths and ensure we don't miss the cheapest path if it happens to be
among them.

One of my concerns with this approach is that, for a partitionwise
join path in the existing paths, we would end up with two paths after
apply_scanjoin_target_to_paths(): one with the scan/join target
applied above the Append, and one below it.  As the aforementioned
comment explains, these two paths tend to have the same cost,
resulting in redundant work and potentially causing cross-platform
plan variations.

Maybe we could address this by discarding all existing partitionwise
paths and relying on apply_scanjoin_target_to_paths() to rebuild these
Append paths after applying the target to all partitions?

Another concern I have, though I'm not entirely sure, is about
comparing the costs between a partitionwise join path and a
non-partitionwise join path.  It seems to me that their costs are
computed in very different ways, so I'm not sure whether the costs are
truly comparable.  So I suspect that, with the patch, there may be
cases where a lower estimated cost does not necessarily translate to
shorter execution time.  However, I'm not sure what to do about this.

- Richard



pgsql-hackers by date:

Previous
From: Chao Li
Date:
Subject: Re: Should we say "wal_level = logical" instead of "wal_level >= logical"
Next
From: Peter Eisentraut
Date:
Subject: Re: libpq OpenSSL and multithreading