Re: Parallel Append can break run-time partition pruning - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: Parallel Append can break run-time partition pruning |
Date | |
Msg-id | CAApHDvq5BLXmXLFxDbmYUagmtYfVNFc-YSkVZbhuXOcM+--+Kw@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel Append can break run-time partition pruning (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: Parallel Append can break run-time partition pruning
Re: Parallel Append can break run-time partition pruning |
List | pgsql-hackers |
On Fri, 17 Apr 2020 at 19:08, Amit Langote <amitlangote09@gmail.com> wrote: > > On Wed, Apr 15, 2020 at 4:18 PM David Rowley <dgrowleyml@gmail.com> wrote: > > I'm a bit divided on what the correct fix is. If I blame Parallel > > Append for not trying hard enough to pull up the lower Append in > > accumulate_append_subpath(), then clearly the parallel append code is > > to blame. > > I spent some time trying to understand how Append parallelism works > and I am tempted to agree with you that there might be problems with > how accumulate_append_subpath()'s interacts with parallelism. Maybe it > would be better to disregard a non-parallel-aware partial Append if it > requires us to fail on flattening a child Append. I have as attached > a PoC fix to show that. While a nested Append is not really a problem > in general, it appears to me that our run-time code is not in position > to work correctly with them, or at least not with how things stand > today... Thanks for taking a look at this. I've now looked at this in more detail and below is my understanding of what's going on: It seems, in this case, what's going on is, on the following line: accumulate_append_subpath(cheapest_partial_path, &partial_subpaths, NULL); we don't manage to pullup the sub-Append due to passing a NULL pointer for the final special_subpaths argument. This results in just taking the child's Append path verbatim. i.e. nested Append Later, when we do: else if (nppath == NULL || (cheapest_partial_path != NULL && cheapest_partial_path->total_cost < nppath->total_cost)) { /* Partial path is cheaper or the only option. */ Assert(cheapest_partial_path != NULL); accumulate_append_subpath(cheapest_partial_path, &pa_partial_subpaths, &pa_nonpartial_subpaths); we do pass a non-NULL special_subpaths argument to allow the sub-Append to be pulled up. So, now we have 2 paths, one with a nested Append and one with a flattened Append. Both paths have the same cost, but due to the fact that we call add_partial_path() for the nested Append version first, the logic in add_partial_path() accepts that path. However, the subsequent call of add_partial_path(), the one for the non-nested Append, that path is rejected due to the total cost being too similar to one of the existing partial path. We just end up keeping the nested Append as the cheapest partial path... That path, since in the example case only has a single subpath, is pulled up into the main append by the logic added in 8edd0e794. I think you've realised this and that's why your PoC patch just rejected the first path when it's unable to do the pullup. We'll get a better path later when we allow mixed partial and non-partial paths. (We'll never fail to do a pullup when calling accumulate_append_subpath() for "nppath", since that's a non-parallel path and accumulate_append_subpath() will always pull Append paths up when they're not parallel aware.) I wonder if the fix should be more something along the lines of trying to merge things do we only generate a single partial path. That way we wouldn't be at the mercy of the logic in add_partial_path() to accept or reject the path based on the order the paths are added. David
pgsql-hackers by date: