Re: Parallel Append can break run-time partition pruning - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Parallel Append can break run-time partition pruning
Date
Msg-id CA+HiwqGDqMR4DOwJt4HjpnnSGr2cOS1E-LS_5Q9-1X62Ur=EEA@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Append can break run-time partition pruning  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
Hi David,

On Tue, Apr 21, 2020 at 12:03 PM David Rowley <dgrowleyml@gmail.com> wrote:
> On Fri, 17 Apr 2020 at 19:08, Amit Langote <amitlangote09@gmail.com> wrote:
> > On Wed, Apr 15, 2020 at 4:18 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > > I'm a bit divided on what the correct fix is.  If I blame Parallel
> > > Append for not trying hard enough to pull up the lower Append in
> > > accumulate_append_subpath(), then clearly the parallel append code is
> > > to blame.
> >
> > I spent some time trying to understand how Append parallelism works
> > and I am tempted to agree with you that there might be problems with
> > how accumulate_append_subpath()'s interacts with parallelism. Maybe it
> > would be better to disregard a non-parallel-aware partial Append if it
> > requires us to fail on flattening a child Append.  I have as attached
> > a PoC fix to show that.  While a nested Append is not really a problem
> > in general, it appears to me that our run-time code is not in position
> > to work correctly with them, or at least not with how things stand
> > today...
>
> Thanks for taking a look at this. I've now looked at this in more
> detail and below is my understanding of what's going on:
>
> It seems, in this case, what's going on is, on the following line:
>
> accumulate_append_subpath(cheapest_partial_path,
>   &partial_subpaths, NULL);
>
> we don't manage to pullup the sub-Append due to passing a NULL pointer
> for the final special_subpaths argument.  This results in just taking
> the child's Append path verbatim. i.e. nested Append
>
> Later, when we do:
>
> else if (nppath == NULL ||
> (cheapest_partial_path != NULL &&
>   cheapest_partial_path->total_cost < nppath->total_cost))
> {
> /* Partial path is cheaper or the only option. */
> Assert(cheapest_partial_path != NULL);
> accumulate_append_subpath(cheapest_partial_path,
>   &pa_partial_subpaths,
>   &pa_nonpartial_subpaths);
>
> we do pass a non-NULL special_subpaths argument to allow the
> sub-Append to be pulled up.
>
> So, now we have 2 paths, one with a nested Append and one with a
> flattened Append.  Both paths have the same cost, but due to the fact
> that we call add_partial_path() for the nested Append version first,
> the logic in add_partial_path() accepts that path. However, the
> subsequent call of add_partial_path(), the one for the non-nested
> Append, that path is rejected due to the total cost being too similar
> to one of the existing partial path. We just end up keeping the nested
> Append as the cheapest partial path... That path, since in the example
> case only has a single subpath, is pulled up into the main append by
> the logic added in 8edd0e794.
>
> I think you've realised this and that's why your PoC patch just
> rejected the first path when it's unable to do the pullup.

Right.

> We'll get a
> better path later when we allow mixed partial and non-partial paths.

Yes, but only if parallel-aware Append is allowed (pa_subpaths_valid).
So it's possible that the Append may not participate in any
parallelism whatsoever if we reject partial Append on failing to fold
a child Append, which does somewhat suck.

> (We'll never fail to do a pullup when calling
> accumulate_append_subpath() for "nppath", since that's a non-parallel
> path and accumulate_append_subpath() will always pull Append paths up
> when they're not parallel aware.)
>
> I wonder if the fix should be more something along the lines of trying
> to merge things do we only generate a single partial path.  That way
> we wouldn't be at the mercy of the logic in add_partial_path() to
> accept or reject the path based on the order the paths are added.

So as things stand, parallel-aware partial Append (Parallel Append)
path competes with non-parallel partial Append path on cost grounds.
As far as I can see, it's only the latter that can contain among its
subpaths an (nested) Append which can be problematic.  Given that, out
choice between the two types of partial Append paths becomes based on
something that is not cost, but is that okay?

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: It is not documented that pg_promote can exit standby mode
Next
From: Thomas Munro
Date:
Subject: Re: WIP: WAL prefetch (another approach)