Thread: Pushdown target list below gather node (WAS Re: WIP: Upper planner pathification)

On Wed, Mar 9, 2016 at 11:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Mar 9, 2016 at 12:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Gather is a bit weird, because although it can project (and needs to,
> > per the example of needing to compute a non-parallel-safe function),
> > you would rather push down as much work as possible to the child node;
> > and doing so is semantically OK for parallel-safe functions.  (Pushing
> > functions down past a Sort node, for a counterexample, is not so OK
> > if you are concerned about function evaluation order, or even number
> > of executions.)
> >
> > In the current code structure it would perhaps be reasonable to teach
> > apply_projection_to_path about that --- although this would require
> > logic to separate parallel-safe and non-parallel-safe subexpressions,
> > which doesn't quite seem like something apply_projection_to_path
> > should be doing.
>
> I think for v1 it would be fine to make this all-or-nothing; that's
> what I had in mind to do.  That is, if the entire tlist is
> parallel-safe, push it all down.  If not, let the workers just return
> the necessary Vars and have Gather compute the final tlist.
>

I find it quite convenient to teach apply_projection_to_path() to push down target-list beneath Gather node, when targetlist contains parallel-safe expression.  Attached patch implements pushing targetlist beneath gather node.

Below is output of a simple test which shows the effect of implementation.

Without Patch -
------------------------
postgres=# explain verbose select c1+2 from t1 where c1<10;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Gather  (cost=0.00..44420.43 rows=30 width=4)
   Output: (c1 + 2)
   Number of Workers: 2
   ->  Parallel Seq Scan on public.t1  (cost=0.00..44420.35 rows=13 width=4)
         Output: c1
         Filter: (t1.c1 < 10)
(6 rows)

With Patch -
-----------------------
postgres=# explain verbose select c1+2 from t1 where c1<10;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Gather  (cost=0.00..45063.75 rows=30 width=4)
   Output: ((c1 + 2))
   Number of Workers: 1
   ->  Parallel Seq Scan on public.t1  (cost=0.00..45063.68 rows=18 width=4)
         Output: (c1 + 2)
         Filter: (t1.c1 < 10)
(6 rows)


In the above plans, you can notice that target list expression (c1 + 2) is pushed beneath Gather node after patch.

Thoughts?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment
On Wed, Mar 16, 2016 at 3:09 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 9, 2016 at 11:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Mar 9, 2016 at 12:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Gather is a bit weird, because although it can project (and needs to,
> > per the example of needing to compute a non-parallel-safe function),
> > you would rather push down as much work as possible to the child node;
> > and doing so is semantically OK for parallel-safe functions.  (Pushing
> > functions down past a Sort node, for a counterexample, is not so OK
> > if you are concerned about function evaluation order, or even number
> > of executions.)
> >
> > In the current code structure it would perhaps be reasonable to teach
> > apply_projection_to_path about that --- although this would require
> > logic to separate parallel-safe and non-parallel-safe subexpressions,
> > which doesn't quite seem like something apply_projection_to_path
> > should be doing.
>
> I think for v1 it would be fine to make this all-or-nothing; that's
> what I had in mind to do.  That is, if the entire tlist is
> parallel-safe, push it all down.  If not, let the workers just return
> the necessary Vars and have Gather compute the final tlist.
>

I find it quite convenient to teach apply_projection_to_path() to push down target-list beneath Gather node, when targetlist contains parallel-safe expression.  Attached patch implements pushing targetlist beneath gather node.

That doesn't update the cost of the subpath, which it probably needs to do.  I wonder if this shouldn't be implemented by recursing.

if (IsA(path, GatherPath) && !has_parallel_hazard((Node *) target->exprs, false))
    apply_projection_to_path(root, something, ((GatherPath *) path)->subpath, target);

Tom, any comments?  I think it would be smart to push this into 9.6.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> That doesn't update the cost of the subpath, which it probably needs to
> do.  I wonder if this shouldn't be implemented by recursing.

> if (IsA(path, GatherPath) && !has_parallel_hazard((Node *) target->exprs,
> false))
>     apply_projection_to_path(root, something, ((GatherPath *)
> path)->subpath, target);

> Tom, any comments?  I think it would be smart to push this into 9.6.

I agree with the idea that this problem should be solved, but there's
nothing much to like about the particular details here.  If we're going
to push down stuff into the workers, I think adding a Result if needed
to do that is something we'd definitely want it to do.  So more
along the lines of
if (IsA(path, GatherPath) &&    !has_parallel_hazard((Node *) target->exprs, false)){    GatherPath *gpath =
(GatherPath*) path;
 
    gpath->subpath =        apply_projection_to_path(root,                     gpath->subpath->parent,
  gpath->subpath,                     target);}
 

However, I continue to doubt that apply_projection_to_path really ought to
know about parallel safety.

And there is a larger problem with this: I'm not sure that it's
appropriate for apply_projection_to_path to assume that the subpath is not
shared with any other purposes.  If it is shared, and we update the
subpath's target in-place, we just broke the other path chains.

Now the existing uses of apply_projection_to_path don't have to worry
about that because they're always messing with paths whose parent rel
isn't going to be used in any other way.  Maybe this one doesn't either
because the subpath would always be a partial path that won't have any
other potential application besides being a child of *this specific*
Gather path.  But I'd like to see that addressed in a comment, if so.

If it's not so, maybe the answer is to always interpose a Result node,
in which case just use create_projection_path() in the above fragment.
        regards, tom lane



On Wed, Mar 16, 2016 at 12:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > That doesn't update the cost of the subpath, which it probably needs to
> > do.  I wonder if this shouldn't be implemented by recursing.
>
> > if (IsA(path, GatherPath) && !has_parallel_hazard((Node *) target->exprs,
> > false))
> >     apply_projection_to_path(root, something, ((GatherPath *)
> > path)->subpath, target);
>
> > Tom, any comments?  I think it would be smart to push this into 9.6.
>
> I agree with the idea that this problem should be solved, but there's
> nothing much to like about the particular details here.  If we're going
> to push down stuff into the workers, I think adding a Result if needed
> to do that is something we'd definitely want it to do.  So more
> along the lines of
>
>         if (IsA(path, GatherPath) &&
>             !has_parallel_hazard((Node *) target->exprs, false))
>         {
>                 GatherPath *gpath = (GatherPath *) path;
>
>                 gpath->subpath =
>                         apply_projection_to_path(root,
>                                                  gpath->subpath->parent,
>                                                  gpath->subpath,
>                                                  target);
>         }

I agree.  That's a cleaned-up version of the formula I posted.

> However, I continue to doubt that apply_projection_to_path really ought to
> know about parallel safety.
>
> And there is a larger problem with this: I'm not sure that it's
> appropriate for apply_projection_to_path to assume that the subpath is not
> shared with any other purposes.  If it is shared, and we update the
> subpath's target in-place, we just broke the other path chains.

That's true.  I don't see an obvious hazard here, because the Gather's
child came from the rel's partial_pathlist, and the only way it gets
used from there is to stick the Gather on top of it.  So it really
can't show up anywhere else.  I think.  But I agree it's a little
scary.  (To some lesser extent, apply_projection_to_path is always
scary like that.)

One idea is to skip generate_gather_paths() for the final rel and then
make up the difference later: apply the final rel's target list once
we've computed it, and then generate a gather path based on that.  But
I don't see an obvious way of doing that.

> Now the existing uses of apply_projection_to_path don't have to worry
> about that because they're always messing with paths whose parent rel
> isn't going to be used in any other way.  Maybe this one doesn't either
> because the subpath would always be a partial path that won't have any
> other potential application besides being a child of *this specific*
> Gather path.  But I'd like to see that addressed in a comment, if so.
>
> If it's not so, maybe the answer is to always interpose a Result node,
> in which case just use create_projection_path() in the above fragment.

Mmmph.  That seems like a 2-bit solution, but I guess it would work.
What if we taught create_projection_plan() to elide the Result node in
that case?  That is, instead of this:
   if (tlist_same_exprs(tlist, subplan->targetlist))

We could do this:
   if (is_projection_capable_path(subplan) || tlist_same_exprs(tlist,
subplan->targetlist))

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Mar 16, 2016 at 12:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> And there is a larger problem with this: I'm not sure that it's
>> appropriate for apply_projection_to_path to assume that the subpath is not
>> shared with any other purposes.  If it is shared, and we update the
>> subpath's target in-place, we just broke the other path chains.

> That's true.  I don't see an obvious hazard here, because the Gather's
> child came from the rel's partial_pathlist, and the only way it gets
> used from there is to stick the Gather on top of it.  So it really
> can't show up anywhere else.  I think.

The key question I think is could there ever be more than one Gather
sharing the same subpath?

> (To some lesser extent, apply_projection_to_path is always
> scary like that.)

Right, that's why there's also create_projection_path for when you
aren't sure.

> Mmmph.  That seems like a 2-bit solution, but I guess it would work.
> What if we taught create_projection_plan() to elide the Result node in
> that case?

Yeah, I was thinking about the same thing.  The comment block above
where you're looking would need some adjustment.
        regards, tom lane



On Wed, Mar 16, 2016 at 12:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, I was thinking about the same thing.  The comment block above
> where you're looking would need some adjustment.

OK, how about this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Mar 16, 2016 at 12:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, I was thinking about the same thing.  The comment block above
>> where you're looking would need some adjustment.

> OK, how about this?

Looks pretty close.  One point is that if we do end up using a Result
node, then the parent GatherPath does not get charged for the Result
node's cpu_per_tuple overhead.  I'm not sure that that's worth changing
though.  It's probably better to bet that the subpath is projectable and
so no cost will ensue, than to bet the other way.
        regards, tom lane



On Wed, Mar 16, 2016 at 2:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Mar 16, 2016 at 12:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Yeah, I was thinking about the same thing.  The comment block above
>>> where you're looking would need some adjustment.
>
>> OK, how about this?
>
> Looks pretty close.  One point is that if we do end up using a Result
> node, then the parent GatherPath does not get charged for the Result
> node's cpu_per_tuple overhead.  I'm not sure that that's worth changing
> though.  It's probably better to bet that the subpath is projectable and
> so no cost will ensue, than to bet the other way.

I'm almost sure this way is the better bet.  I actually think at
present the GatherPath is always on top of a scan or join, and those
all project.  There might be other cases in the future that don't, but
I think it'd be fine to leave off worrying about this until we (a)
find a case where it happens and (b) failing to charge for the Result
causes a problem.  The current situation of never projecting in the
workers is far worse.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Mar 16, 2016 at 2:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Looks pretty close.  One point is that if we do end up using a Result
>> node, then the parent GatherPath does not get charged for the Result
>> node's cpu_per_tuple overhead.  I'm not sure that that's worth changing
>> though.  It's probably better to bet that the subpath is projectable and
>> so no cost will ensue, than to bet the other way.

> I'm almost sure this way is the better bet.

Actually, we do know what will happen ... so maybe
       /*        * We always use create_projection_path here, even if the subpath is        * projection-capable, so as
toavoid modifying the subpath in place.        * It seems unlikely at present that there could be any other        *
referencesto the subpath anyway, but better safe than sorry.        */
 
+       if (!is_projection_capable_path(gpath->subpath))
+           gpath->path.total_cost += cpu_tuple_cost * gpath->subpath->rows;       gpath->subpath = (Path *)
create_projection_path(root,                                 gpath->subpath->parent,
gpath->subpath,                                 target);
 

The comment could use adjustment if you adopt that, to reference the fact
that we know create_projection_plan will get rid of the Result if not
needed.
           regards, tom lane



On Wed, Mar 16, 2016 at 10:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Mar 16, 2016 at 12:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Yeah, I was thinking about the same thing.  The comment block above
> > where you're looking would need some adjustment.
>
> OK, how about this?
>

+ * children.  Alternatively, apply_projection_to_path might have created

+ * a projection path as the subpath of a Gather node even though the

+ * subpath was projection-capable.  So, if the subpath is capable of

+ * projection or the desired tlist is the same expression-wise as the

+ * subplan's, just jam it in there.  We'll have charged for a Result that

+ * doesn't actually appear in the plan, but that's better than having a

+ * Result we don't need.

  */

- if (tlist_same_exprs(tlist, subplan->targetlist))

+ if (is_projection_capable_path(best_path->subpath) ||

+ tlist_same_exprs(tlist, subplan->targetlist))




While reading above code changes, it looks like it is assuming that subpath and subplan will always be same (as it is verifying projection capability of subpath and attaching the tlist to subplan), but I think it is possible that subpath and subplan correspond to different nodes when gating Result node is added on to top of scan plan by create_scan_plan().  I think it might be better to explain in comments, why it is safe to rely on projection capability of subpath to attach tlist to subplan.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila16@gmail.com> writes:
> While reading above code changes, it looks like it is assuming that subpath
> and subplan will always be same (as it is verifying projection capability
> of subpath and attaching the tlist to subplan), but I think it is possible
> that subpath and subplan correspond to different nodes when gating Result
> node is added on to top of scan plan by create_scan_plan().

A little more thought will show you that that's not actually relevant,
because the tlist computation would have happened (or not) below the
gating Result.  If gating Results had an impact on
apply_projection_to_path's decisions we'd have had to do something about
that before this.
        regards, tom lane



On Thu, Mar 17, 2016 at 7:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > While reading above code changes, it looks like it is assuming that subpath
> > and subplan will always be same (as it is verifying projection capability
> > of subpath and attaching the tlist to subplan), but I think it is possible
> > that subpath and subplan correspond to different nodes when gating Result
> > node is added on to top of scan plan by create_scan_plan().
>
> A little more thought will show you that that's not actually relevant,
> because the tlist computation would have happened (or not) below the
> gating Result.  If gating Results had an impact on
> apply_projection_to_path's decisions we'd have had to do something about
> that before this.
>

I understand that gating Results won't impact it, but it was just not apparent from looking at the code I had referred.  If you think it is quite obvious thing, then we can leave the comment as it is.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Wed, Mar 16, 2016 at 2:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Mar 16, 2016 at 2:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Looks pretty close.  One point is that if we do end up using a Result
>>> node, then the parent GatherPath does not get charged for the Result
>>> node's cpu_per_tuple overhead.  I'm not sure that that's worth changing
>>> though.  It's probably better to bet that the subpath is projectable and
>>> so no cost will ensue, than to bet the other way.
>
>> I'm almost sure this way is the better bet.
>
> Actually, we do know what will happen ... so maybe
>
>         /*
>          * We always use create_projection_path here, even if the subpath is
>          * projection-capable, so as to avoid modifying the subpath in place.
>          * It seems unlikely at present that there could be any other
>          * references to the subpath anyway, but better safe than sorry.
>          */
> +       if (!is_projection_capable_path(gpath->subpath))
> +           gpath->path.total_cost += cpu_tuple_cost * gpath->subpath->rows;
>         gpath->subpath = (Path *)
>             create_projection_path(root,
>                                    gpath->subpath->parent,
>                                    gpath->subpath,
>                                    target);
>
> The comment could use adjustment if you adopt that, to reference the fact
> that we know create_projection_plan will get rid of the Result if not
> needed.

OK, I've committed something along those lines.  Thanks for the
advice, and feel free to whack it around if you have an idea for
improving it further - though IMHO this is good enough for 9.6.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> OK, I've committed something along those lines.  Thanks for the
> advice, and feel free to whack it around if you have an idea for
> improving it further - though IMHO this is good enough for 9.6.

The committed patch looks fine to me.  WRT your commit-message comment
about pushing down only parallel-safe tlist expressions, I would strongly
advise leaving that for another day.  I have some irons in the fire
concerning ways to push down computation of index expressions (so that
we can consistently get f(x) out of an index on f(x) rather than
recomputing it), and managing parallel-safe expressions seems like it'd
benefit from that infrastucture too.
        regards, tom lane



On Fri, Mar 18, 2016 at 10:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> OK, I've committed something along those lines.  Thanks for the
>> advice, and feel free to whack it around if you have an idea for
>> improving it further - though IMHO this is good enough for 9.6.
>
> The committed patch looks fine to me.  WRT your commit-message comment
> about pushing down only parallel-safe tlist expressions, I would strongly
> advise leaving that for another day.  I have some irons in the fire
> concerning ways to push down computation of index expressions (so that
> we can consistently get f(x) out of an index on f(x) rather than
> recomputing it), and managing parallel-safe expressions seems like it'd
> benefit from that infrastucture too.

Roger.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company