Re: apply_scanjoin_target_to_paths and partitionwise join - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: apply_scanjoin_target_to_paths and partitionwise join |
Date | |
Msg-id | CAExHW5uw6KJ0RZ4DqdCi=iJPFU90wSLr3Qa-AtH-5MshfM5C4A@mail.gmail.com Whole thread Raw |
In response to | Re: apply_scanjoin_target_to_paths and partitionwise join (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Fri, Jan 3, 2025 at 3:02 AM Robert Haas <robertmhaas@gmail.com> wrote: > > I think it's unhelpful that you keep calling this a "bug" when the > behavior is clearly deliberate. Whether it's the *right* behavior is > debatable, but it's not *accidental* behavior. > Ok, let's call it "not right" behaviour :). Let me further expand on the explanation in my first email [1]. After the planner has added all possible paths, apply_scanjoin_target_to_paths(), which should be just adjusting their targets, zaps them all. That looks weird. But the comment explains why its doing so. -- quote comment * This function would still be correct if we kept the * existing paths: we'd modify them to generate the correct target above * the partitioning Append, and then they'd compete on cost with paths * generating the target below the Append. However, in our current cost * model the latter way is always the same or cheaper cost, so modifying * the existing paths would just be useless work. Moreover, when the cost * is the same, varying roundoff errors might sometimes allow an existing * path to be picked, resulting in undesirable cross-platform plan * variations. -- The comment mentions only "partitioning Append" and not "Join" paths. A simple partitioned relation's pathlist contains only append paths but a partitioned join relation's pathlist contains join paths as well as append (of join) paths. What the comment says is true for both Append of scans as well as Append of joins, but not for join paths joining append paths - non-partition wise paths. In such paths we should be adjusting only the target of the join path and not that of the paths under the Append. The comment does not mention Join over append at all. The code discards both join as well as append paths but rebuilds only append paths. There is no comment in the function explaining why we omit join of append paths. So the code does not seem intentional to me as far as partitionwise joins are concerned. Losing a costwise optimal path doesn't seem to be something that should happen while adjusting path targets (unless while adjusting path targets we find a lower cost path, but we aren't keeping the old paths around for comparison.) > On Thu, Jan 2, 2025 at 3:58 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I am wondering if the problem is not that the plan is slower, it's > > that for some reason the planner took a lot longer to create it. > > It's very plausible that partitionwise planning takes longer, and > > maybe we have some corner cases where the time is O(N^2) or worse. > > That doesn't seem like a totally unreasonable speculation, but it > seems a little surprising that retaining the non-partitionwise paths > would fix it. True, that might let us discard a bunch of partitionwise > paths more quickly than would otherwise be possible, but I wouldn't > expect that to have an impact as dramatic as what Jakub alleged. The > thing I thought about was whether there might be some weird effects > with lots of empty partitions; or maybe with some other property of > the path like say sort keys or parallelism. For example if we couldn't > generate a partitionwise path with sort keys as good as the > non-partitionwise path had, or if we couldn't generate a parallel > partitionwise path but we could generate a parallel non-partitionwise > path. As far as I knew neither of those things are real problems, but > if they were then I believe they could pretty easily explain a large > regression. > > > However, this is pure speculation without a test case, and any > > proposed fix would be even more speculative. I concur with your > > bottom line: we should insist on a public test case before deciding > > what to do about it. > That's a valid ask. AFAIR, it's a quite tricky scenario. Both Jakub and myself have tried but could not reproduce the issue. Let me mark this CF entry as returned with feedback and resurrect it when we have a reproduction. [1] https://www.postgresql.org/message-id/CAExHW5toze58+jL-454J3ty11sqJyU13Sz5rJPQZDmASwZgWiA@mail.gmail.com -- Best Wishes, Ashutosh Bapat
pgsql-hackers by date: