Thread: EXPLAIN VERBOSE with parallel Aggregate

EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

14 April 2016, 03:03:24

There's 2 problems:

1)

I recently noticed that EXPLAIN VERBOSE is a bit bogus when it comes
to parallel aggregates with FILTER (WHERE ...) clauses.

We get;

Output: pg_catalog.sum((sum(num) FILTER (WHERE (num > 0)))) FILTER
(WHERE (num > 0))

Which is simply a lie, we only filter on the partial aggregate, not the combine.

The attached patch just nullifies the combine aggregate's aggfilter
clause during setrefs. We cannot nullify it any sooner, as the
aggfilter is required so that we find the correct partial Aggref in
search_indexed_tlist_for_partial_aggref().

With the attached we get;

 Output: pg_catalog.sum((sum(num) FILTER (WHERE (num > 0))))

The patch is very simple.

2)

I'm unsure if the schema prefix on the combine aggregate is a problem
or not. The code path which generates it is rather unfortunate and is
down to func_get_detail() returning FUNCDETAIL_NOTFOUND in
generate_function_name() due to not being able to find a function
named "sum" with the transtype as its only argument. I had thought
that maybe we should add a pg_proc entry for the aggregate with the
transtype, if not already covered by the entry for aggfnoid.
Aggregates where the transtype is the same as the input type work just
fine;

Output: max((max(num)))

The problem with that is adding the pg_proc entry still won't get rid
of the schema as the "p_funcid == funcid" test in
generate_function_name() will fail causing the schema qualification to
occur again. But at least func_get_detail() would be successful in
finding the function.

Any one have any thoughts on if this is a problem?


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

parallel_agg_filter_fix.patch

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

15 April 2016, 16:15:57

On Wed, Apr 13, 2016 at 11:03 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> There's 2 problems:
>
> 1)
>
> I recently noticed that EXPLAIN VERBOSE is a bit bogus when it comes
> to parallel aggregates with FILTER (WHERE ...) clauses.
>
> We get;
>
> Output: pg_catalog.sum((sum(num) FILTER (WHERE (num > 0)))) FILTER
> (WHERE (num > 0))
>
> Which is simply a lie, we only filter on the partial aggregate, not the combine.
>
> The attached patch just nullifies the combine aggregate's aggfilter
> clause during setrefs. We cannot nullify it any sooner, as the
> aggfilter is required so that we find the correct partial Aggref in
> search_indexed_tlist_for_partial_aggref().
>
> With the attached we get;
>
>  Output: pg_catalog.sum((sum(num) FILTER (WHERE (num > 0))))
>
> The patch is very simple.
>
> 2)
>
> I'm unsure if the schema prefix on the combine aggregate is a problem
> or not. The code path which generates it is rather unfortunate and is
> down to func_get_detail() returning FUNCDETAIL_NOTFOUND in
> generate_function_name() due to not being able to find a function
> named "sum" with the transtype as its only argument. I had thought
> that maybe we should add a pg_proc entry for the aggregate with the
> transtype, if not already covered by the entry for aggfnoid.
> Aggregates where the transtype is the same as the input type work just
> fine;
>
> Output: max((max(num)))
>
> The problem with that is adding the pg_proc entry still won't get rid
> of the schema as the "p_funcid == funcid" test in
> generate_function_name() will fail causing the schema qualification to
> occur again. But at least func_get_detail() would be successful in
> finding the function.
>
> Any one have any thoughts on if this is a problem?

I definitely agree that the current output is messed up, but I'm not
sure your proposed output is much better.  I wonder if it shouldn't
say something like:

Output: serialfn(transfn(args))

for the partial aggregate and

Output: finalfn(combinefn(deserialfn(args)))

for the finalize aggregate step.

Or maybe just insert the word PARTIAL before each partial aggregate
step, like PARTIAL sum(num) for the partial step and then just
sum(num) for the final step.  I think ending up with sum(sum(num)) is
right out.  It doesn't look so bad for that case but avg(avg(num))
would certainly imply something that's not the actual behavior.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Tom Lane

Date:

15 April 2016, 16:27:36

Robert Haas <robertmhaas@gmail.com> writes:
> I definitely agree that the current output is messed up, but I'm not
> sure your proposed output is much better.  I wonder if it shouldn't
> say something like:
> Output: serialfn(transfn(args))
> for the partial aggregate and
> Output: finalfn(combinefn(deserialfn(args)))
> for the finalize aggregate step.

> Or maybe just insert the word PARTIAL before each partial aggregate
> step, like PARTIAL sum(num) for the partial step and then just
> sum(num) for the final step.

+1 for the latter, if we can do it conveniently.  I think exposing
the names of the aggregate implementation functions would be very
user-unfriendly, as nobody but us hackers knows what those are.

> I think ending up with sum(sum(num)) is
> right out.  It doesn't look so bad for that case but avg(avg(num))
> would certainly imply something that's not the actual behavior.

Agreed.
        regards, tom lane

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

17 April 2016, 07:18:50

On 16 April 2016 at 04:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I definitely agree that the current output is messed up, but I'm not
>> sure your proposed output is much better.  I wonder if it shouldn't
>> say something like:
>> Output: serialfn(transfn(args))
>> for the partial aggregate and
>> Output: finalfn(combinefn(deserialfn(args)))
>> for the finalize aggregate step.
>
>> Or maybe just insert the word PARTIAL before each partial aggregate
>> step, like PARTIAL sum(num) for the partial step and then just
>> sum(num) for the final step.
>
> +1 for the latter, if we can do it conveniently.  I think exposing
> the names of the aggregate implementation functions would be very
> user-unfriendly, as nobody but us hackers knows what those are.

It does not really seem all that convenient to do this. It also seems
a bit strange to me to have a parent node report a column which does
not exist in any nodes descending from it. Remember that the combine
Aggref does not have the same ->args as its corresponding partial
Aggref. It's not all that clear to me if there is any nice way to do
have this work the way you'd like. If we were to just call
get_rule_expr() on the first arg of the combine aggregate node, it
would re-print the PARTIAL keyword again.

In the attached I have it displaying as:

postgres=# explain verbose select count(*),max(a),avg(a) filter (where
a = 0) from t;
                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=14739.56..14739.57 rows=1 width=44)
   Output: pg_catalog.count(*), max((max(a))), pg_catalog.avg((PARTIAL
avg(a) FILTER (WHERE (a = 0))))
   ->  Gather  (cost=14739.33..14739.54 rows=2 width=44)
         Output: (count(*)), (max(a)), (PARTIAL avg(a) FILTER (WHERE (a = 0)))
         Workers Planned: 2
         ->  Partial Aggregate  (cost=13739.33..13739.34 rows=1 width=44)
               Output: count(*), max(a), PARTIAL avg(a) FILTER (WHERE (a = 0))
               ->  Parallel Seq Scan on public.t  (cost=0.00..9572.67
rows=416667 width=4)
                     Output: a
(9 rows)

I like this much better, as there's no fudging of any function
arguments to print them as something they're not.

Note that max() does not get tagged with PARTIAL since it has no
finalfn. Obtaining this information does require a syscache lookup in
get_agg_expr(), but I've managed to make that only happen when
aggpartial is true.

>> I think ending up with sum(sum(num)) is
>> right out.  It doesn't look so bad for that case but avg(avg(num))
>> would certainly imply something that's not the actual behavior.
>
> Agreed.

Note that I've done nothing for the weird schema prefixing problem I
mentioned. I think I'd need input on the best way to solve this. If
it's actually a problem.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

parallel_agg_explain_verbose_v2.patch

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Tom Lane

Date:

17 April 2016, 14:22:32

David Rowley <david.rowley@2ndquadrant.com> writes:
> On 16 April 2016 at 04:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> +1 for the latter, if we can do it conveniently.  I think exposing
>> the names of the aggregate implementation functions would be very
>> user-unfriendly, as nobody but us hackers knows what those are.

> It does not really seem all that convenient to do this. It also seems
> a bit strange to me to have a parent node report a column which does
> not exist in any nodes descending from it. Remember that the combine
> Aggref does not have the same ->args as its corresponding partial
> Aggref. It's not all that clear to me if there is any nice way to do
> have this work the way you'd like. If we were to just call
> get_rule_expr() on the first arg of the combine aggregate node, it
> would re-print the PARTIAL keyword again.

This suggests to me that the parsetree representation for partial
aggregates was badly chosen.  If EXPLAIN can't recognize them, then
neither can anything else, and we may soon find needs for telling
the difference that are not just cosmetic.  Maybe we need more
fields in Aggref.

> Note that I've done nothing for the weird schema prefixing problem I
> mentioned. I think I'd need input on the best way to solve this. If
> it's actually a problem.

It sure looks like a problem to me.  Maybe it's only cosmetic, but
it's going to cause confusion and bug reports.  EXPLAIN output for
parallel queries is going to be confusing enough anyway; we need
to avoid having artifacts like this.
        regards, tom lane

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Noah Misch

Date:

20 April 2016, 01:23:06

On Sun, Apr 17, 2016 at 10:22:24AM -0400, Tom Lane wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > On 16 April 2016 at 04:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> +1 for the latter, if we can do it conveniently.  I think exposing
> >> the names of the aggregate implementation functions would be very
> >> user-unfriendly, as nobody but us hackers knows what those are.
> 
> > It does not really seem all that convenient to do this. It also seems
> > a bit strange to me to have a parent node report a column which does
> > not exist in any nodes descending from it. Remember that the combine
> > Aggref does not have the same ->args as its corresponding partial
> > Aggref. It's not all that clear to me if there is any nice way to do
> > have this work the way you'd like. If we were to just call
> > get_rule_expr() on the first arg of the combine aggregate node, it
> > would re-print the PARTIAL keyword again.
> 
> This suggests to me that the parsetree representation for partial
> aggregates was badly chosen.  If EXPLAIN can't recognize them, then
> neither can anything else, and we may soon find needs for telling
> the difference that are not just cosmetic.  Maybe we need more
> fields in Aggref.

[This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item.  Robert,
since you committed the patch believed to have created it, you own this open
item.  If that responsibility lies elsewhere, please let us know whose
responsibility it is to fix this.  Since new open items may be discovered at
any time and I want to plan to have them all fixed well in advance of the ship
date, I will appreciate your efforts toward speedy resolution.  Please
present, within 72 hours, a plan to fix the defect within seven days of this
message.  Thanks.

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

21 April 2016, 02:44:49

On Tue, Apr 19, 2016 at 9:22 PM, Noah Misch <noah@leadboat.com> wrote:
> On Sun, Apr 17, 2016 at 10:22:24AM -0400, Tom Lane wrote:
>> David Rowley <david.rowley@2ndquadrant.com> writes:
>> > On 16 April 2016 at 04:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> >> +1 for the latter, if we can do it conveniently.  I think exposing
>> >> the names of the aggregate implementation functions would be very
>> >> user-unfriendly, as nobody but us hackers knows what those are.
>>
>> > It does not really seem all that convenient to do this. It also seems
>> > a bit strange to me to have a parent node report a column which does
>> > not exist in any nodes descending from it. Remember that the combine
>> > Aggref does not have the same ->args as its corresponding partial
>> > Aggref. It's not all that clear to me if there is any nice way to do
>> > have this work the way you'd like. If we were to just call
>> > get_rule_expr() on the first arg of the combine aggregate node, it
>> > would re-print the PARTIAL keyword again.
>>
>> This suggests to me that the parsetree representation for partial
>> aggregates was badly chosen.  If EXPLAIN can't recognize them, then
>> neither can anything else, and we may soon find needs for telling
>> the difference that are not just cosmetic.  Maybe we need more
>> fields in Aggref.
>
> [This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 9.6 open item.  Robert,
> since you committed the patch believed to have created it, you own this open
> item.  If that responsibility lies elsewhere, please let us know whose
> responsibility it is to fix this.  Since new open items may be discovered at
> any time and I want to plan to have them all fixed well in advance of the ship
> date, I will appreciate your efforts toward speedy resolution.  Please
> present, within 72 hours, a plan to fix the defect within seven days of this
> message.  Thanks.

I'll do my best to work on this soon.  I'm not happy with the output
produced by David's patch, but I don't expect I'll be able to do
better without putting some time into it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

21 April 2016, 05:06:11

On 18 April 2016 at 02:22, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> Note that I've done nothing for the weird schema prefixing problem I
>> mentioned. I think I'd need input on the best way to solve this. If
>> it's actually a problem.
>
> It sure looks like a problem to me.  Maybe it's only cosmetic, but
> it's going to cause confusion and bug reports.  EXPLAIN output for
> parallel queries is going to be confusing enough anyway; we need
> to avoid having artifacts like this.

I'd like to admit that I'm a bit confused as to why
generate_function_name(), when it already knows the funcid, bothers to
call func_get_detail(), which performs a search for the function based
on the name and argument types, to find the function, most likely with
the same funcid as the one which it already knew.

Could you explain why this has to happen?

I also tried patching with the attached and running the regression
tests to see what breaks... nothing did. So it seems that, at least in
the tests that that code path is never hit.

With that in mind, perhaps the fix for the namespace problem is just
to special case the combine Aggrefs in get_agg_expr() and have it just
lookup the pg_proc entry for the aggred->aggfnoid, and use that
proname.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

generate_function_name_experiment.patch

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Tom Lane

Date:

21 April 2016, 05:14:21

David Rowley <david.rowley@2ndquadrant.com> writes:
> I'd like to admit that I'm a bit confused as to why
> generate_function_name(), when it already knows the funcid, bothers to
> call func_get_detail(), which performs a search for the function based
> on the name and argument types, to find the function, most likely with
> the same funcid as the one which it already knew.

> Could you explain why this has to happen?

The point is exactly to find out whether a search for the function
given only the name and argument types would find the same function,
or a similar function in a different schema --- which would happen
if that other schema is earlier in the search path than the desired
one, or maybe the desired one isn't in search_path at all.  In such
a case we must schema-qualify the function name in the printed
expression.

To some extent this is because ruleutils serves two masters.  We would
probably not care that much about schema exactness for EXPLAIN's purposes,
but we do care for dumping views and rules.
        regards, tom lane

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

22 April 2016, 00:57:23

On 21 April 2016 at 17:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> I'd like to admit that I'm a bit confused as to why
>> generate_function_name(), when it already knows the funcid, bothers to
>> call func_get_detail(), which performs a search for the function based
>> on the name and argument types, to find the function, most likely with
>> the same funcid as the one which it already knew.
>
>> Could you explain why this has to happen?
>
> The point is exactly to find out whether a search for the function
> given only the name and argument types would find the same function,
> or a similar function in a different schema --- which would happen
> if that other schema is earlier in the search path than the desired
> one, or maybe the desired one isn't in search_path at all.  In such
> a case we must schema-qualify the function name in the printed
> expression.

Thanks. That makes more sense now.

> To some extent this is because ruleutils serves two masters.  We would
> probably not care that much about schema exactness for EXPLAIN's purposes,
> but we do care for dumping views and rules.

OK, so here's my thoughts. Currently, as mentioned above, I've
included a PARTIAL prefix for partial aggregates. This is
syntactically incorrect for the dumping of views etc, but that should
not matter as partial Aggrefs never come from the parser, they're only
perhaps generated later in the planner. Same goes for combine
aggregates too.

In the attached I'm proposing that we simply just use the
pg_proc.proname which belongs to the aggref->aggfnoid for combine
aggregates. This gets around the function not being found by
generate_function_name() and the namespace problem, that code should
never be executed when getting a view def, since we can't have combine
aggs there.

The attached still does not get the output into the way Robert would
have liked, but I still stand by my dislike to pretending the combine
aggregate is a normal aggregate. It's not all that clear if FILTER
should be displayed in the combine agg. Combine Aggs don't do FILTER.

This makes the output:

postgres=# explain verbose select avg(num) FILTER (WHERE num >
0),sum(num),count(*) from i;
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=13758.56..13758.57 rows=1 width=48)
   Output: avg((PARTIAL avg(num) FILTER (WHERE (num > 0)))),
sum((sum(num))), count(*)
   ->  Gather  (cost=13758.33..13758.54 rows=2 width=48)
         Output: (PARTIAL avg(num) FILTER (WHERE (num > 0))),
(sum(num)), (count(*))
         Workers Planned: 2
         ->  Partial Aggregate  (cost=12758.33..12758.34 rows=1 width=48)
               Output: PARTIAL avg(num) FILTER (WHERE (num > 0)),
sum(num), count(*)
               ->  Parallel Seq Scan on public.i  (cost=0.00..8591.67
rows=416667 width=4)
                     Output: num

Comments welcome.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

parallel_agg_explain_verbose_v3.patch

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

22 April 2016, 18:35:12

On Thu, Apr 21, 2016 at 8:57 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> OK, so here's my thoughts. Currently, as mentioned above, I've
> included a PARTIAL prefix for partial aggregates. This is
> syntactically incorrect for the dumping of views etc, but that should
> not matter as partial Aggrefs never come from the parser, they're only
> perhaps generated later in the planner. Same goes for combine
> aggregates too.

Yeah.  As I've said a few times, I would like to have SQL syntax that
emits the unfinalized (but serialized, if type internal) values, so
that postgres_fdw can use that to partial aggregation to the remote
side.  Maybe we should consider what would be reasonable syntax for
this; but that's a second-order consideration for now.

> The attached still does not get the output into the way Robert would
> have liked, but I still stand by my dislike to pretending the combine
> aggregate is a normal aggregate. It's not all that clear if FILTER
> should be displayed in the combine agg. Combine Aggs don't do FILTER.

I am rather confused by this, for two reasons:

1. The "Output" line is supposed to display the columns that the plan
node is producing.  And a FinalizeAggregate had darned well better be
producing the same results as an Aggregate, so it's entirely
reasonable for it to produce the same output that an Aggregate would
have given us.

2. You're using the term "combine agg", but as far as the EXPLAIN
ANALYZE output is concerned, that's not a thing.  There is
PartialAggregate, Aggregate, and FinalizeAggregate.  I think you mean
FinalizeAggregate when you say "combine aggregate", since we identify
a node as a FinalizeAggregate by observing that combineStates = true.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

22 April 2016, 21:19:19

On Sun, Apr 17, 2016 at 10:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> On 16 April 2016 at 04:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> +1 for the latter, if we can do it conveniently.  I think exposing
>>> the names of the aggregate implementation functions would be very
>>> user-unfriendly, as nobody but us hackers knows what those are.
>
>> It does not really seem all that convenient to do this. It also seems
>> a bit strange to me to have a parent node report a column which does
>> not exist in any nodes descending from it. Remember that the combine
>> Aggref does not have the same ->args as its corresponding partial
>> Aggref. It's not all that clear to me if there is any nice way to do
>> have this work the way you'd like. If we were to just call
>> get_rule_expr() on the first arg of the combine aggregate node, it
>> would re-print the PARTIAL keyword again.
>
> This suggests to me that the parsetree representation for partial
> aggregates was badly chosen.  If EXPLAIN can't recognize them, then
> neither can anything else, and we may soon find needs for telling
> the difference that are not just cosmetic.  Maybe we need more
> fields in Aggref.

So the basic problem is this code in setrefs.c:
       newvar = search_indexed_tlist_for_partial_aggref(aggref,
context->subplan_itlist,                                                       context->newvarno);       if (newvar)
  {           Aggref     *newaggref;           TargetEntry *newtle;

           /*            * Now build a new TargetEntry for the Aggref's arguments which is            * a single Var
whichreferences the corresponding AggRef in the            * node below.            */           newtle =
makeTargetEntry((Expr*) newvar, 1, NULL, false);           newaggref = (Aggref *) copyObject(aggref);
newaggref->args= list_make1(newtle);

So what ends up happening is that, before setrefs processing, the
FinalizeAggregate's aggref has the same argument list as what the user
originally specified, but during that process, we replace the original
argument list with a 1-element argument list that points to the output
of the partial aggregate step.  After that, it's unsurprising that the
deparsing logic goes wrong; consider especially the case of an
aggregate that originally took any number of arguments other than one.
Offhand, I see two somewhat reasonable strategies for trying to fix
this:

1. Add a field "List *origargs" to Aggref, and in this case set
newaggref->origargs to a copy of the original argument list.  Use
origargs for deparsing, unless it's NULL, in which case use args.  Or
alternative, always populate origargs, but in other cases just make it
equal to args.

2. Add a field "bool aggcombine" to args, and set it to true in this
case.  When we see that in deparsing, expect the argument list to be
one element long, a TargetEntry containing a Var.  Use that to dig out
the partial Aggref to which it points, and deparse that instead.  I
guess maybe get_variable() could be used for this purpose.

There might be another approach, too.  Thoughts?

(Note that I'm assuming here that the final aggregate's target list
output should match what we would have gotten from a regular
aggregate, despite the combining stage in the middle.  I think that's
correct; we're outputting the same thing, even though we computed it
differently.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

22 April 2016, 21:19:30

On 23 April 2016 at 06:35, Robert Haas <robertmhaas@gmail.com> wrote:
> 2. You're using the term "combine agg", but as far as the EXPLAIN
> ANALYZE output is concerned, that's not a thing.  There is
> PartialAggregate, Aggregate, and FinalizeAggregate.  I think you mean
> FinalizeAggregate when you say "combine aggregate", since we identify
> a node as a FinalizeAggregate by observing that combineStates = true.

I really do mean combine when I say combine. I'm pretty sure that I've
coded everything properly to work even if there was 10 stages of
aggregation, in this case 9 of those would be combine nodes, and only
1 of them a finalize node. I can see why you'd say that at a glace of
explain.c, but it's more relevant that agg->finalizeAggs is true,
giving the combine test is in the else if condition. The test for
combineStates is only there so we don't output Finalize Aggregate for
normal 1 stage aggregates.

I simply don't think FILTER should be shown if combineStates is true.
This means that it will only be shown at the first aggregate node, the
one with combineStates == false. I think this is a good idea due to
the fact that this is the only place where FILTERing is done.

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

22 April 2016, 21:36:23

On 23 April 2016 at 09:19, Robert Haas <robertmhaas@gmail.com> wrote:
> 2. Add a field "bool aggcombine" to args, and set it to true in this
> case.  When we see that in deparsing, expect the argument list to be
> one element long, a TargetEntry containing a Var.  Use that to dig out
> the partial Aggref to which it points, and deparse that instead.  I
> guess maybe get_variable() could be used for this purpose.
>

I did do that at one stage. I think perhaps one of the above patches
did it that way. The problem was that because I was just detecting
combine Aggrefs and printing the first item in args, it meant the
PARTIAL word was printed again, and that was pretty bogus, since it
was not a partial agg. I didn't see a way to do this without adding
some extra bool parameter to that whole series of functions. But I
only looked at using get_rule_expr(). I didn't look at what
get_variable() is.

> There might be another approach, too.  Thoughts?
>
> (Note that I'm assuming here that the final aggregate's target list
> output should match what we would have gotten from a regular
> aggregate, despite the combining stage in the middle.  I think that's
> correct; we're outputting the same thing, even though we computed it
> differently.)

Please note that in this case the final and combine node are the same
node, so I'm confused by the "combining stage in the middle" part.
There's only 2 aggregate nodes. I'm not sure how one of those is in
the middle.

I really don't think that we should print FILTER details in a combine
aggregate node. We'd be claiming to be doing something that we're
actually not doing. Please see advance_aggregates() in nodeAgg.c, and
compare that to combine_aggregates(), which is used when combineStates
== true. Notice that only advance_aggregates() bothers with the
aggfilter clause.

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

23 April 2016, 01:58:06

On Fri, Apr 22, 2016 at 5:36 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> I really don't think that we should print FILTER details in a combine
> aggregate node. We'd be claiming to be doing something that we're
> actually not doing. Please see advance_aggregates() in nodeAgg.c, and
> compare that to combine_aggregates(), which is used when combineStates
> == true. Notice that only advance_aggregates() bothers with the
> aggfilter clause.

I think you're wrong.  The Output line says what that node outputs,
not how it's computed.  And a FinalizeAggregate on top of a
PartialAggregate produces the same output as an Aggregate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

23 April 2016, 02:11:09

On 23 April 2016 at 13:58, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Apr 22, 2016 at 5:36 PM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> I really don't think that we should print FILTER details in a combine
>> aggregate node. We'd be claiming to be doing something that we're
>> actually not doing. Please see advance_aggregates() in nodeAgg.c, and
>> compare that to combine_aggregates(), which is used when combineStates
>> == true. Notice that only advance_aggregates() bothers with the
>> aggfilter clause.
>
> I think you're wrong.  The Output line says what that node outputs,
> not how it's computed.  And a FinalizeAggregate on top of a
> PartialAggregate produces the same output as an Aggregate.

The most basic thing I can think of to rationalise my thinking for this is:

# create table v (v varchar);
# create view v_v as select lower(v) v from v;
# explain verbose select upper(v) from v_v;                        QUERY PLAN
-------------------------------------------------------------Seq Scan on public.v  (cost=0.00..30.40 rows=1360
width=32) Output: upper(lower((v.v)::text))
 
(2 rows)

It seems that you're proposing that the aggregate equivalence of this should be;
                        QUERY PLAN
-------------------------------------------------------------Seq Scan on public.v  (cost=0.00..30.40 rows=1360
width=32) Output: upper((v)::text)
 
(2 rows)

which to me seems wrong, as it's hiding the fact that lower() was called.

My arguments don't seem to be holding much weight, so I'll back down,
although it would still be interesting to hear what others have to say
about this.

In any case, thanks for stepping up to fix this. I'm sure what you're
proposing will be much better than what's there at the moment.

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

25 April 2016, 17:03:26

On Fri, Apr 22, 2016 at 10:11 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> The most basic thing I can think of to rationalise my thinking for this is:
>
> # create table v (v varchar);
> # create view v_v as select lower(v) v from v;
> # explain verbose select upper(v) from v_v;
>                          QUERY PLAN
> -------------------------------------------------------------
>  Seq Scan on public.v  (cost=0.00..30.40 rows=1360 width=32)
>    Output: upper(lower((v.v)::text))
> (2 rows)
>
> It seems that you're proposing that the aggregate equivalence of this should be;
>
>                          QUERY PLAN
> -------------------------------------------------------------
>  Seq Scan on public.v  (cost=0.00..30.40 rows=1360 width=32)
>    Output: upper((v)::text)
> (2 rows)
>
> which to me seems wrong, as it's hiding the fact that lower() was called.
>
> My arguments don't seem to be holding much weight, so I'll back down,
> although it would still be interesting to hear what others have to say
> about this.

Yeah, I'd be happy to have more people chime in.  I think your example
is interesting, but it doesn't persuade me, because the rule has
always been that EXPLAIN shows the output *columns*, not the output
*rows*.  The disappearance of some rows doesn't change the list of
output columns.  For scans and joins this rule is easy to apply; for
aggregates, where many rows become one, less so.  Some of the existing
choices there are certainly arguable, like the fact that FILTER is
shown anywhere at all, which seems like an artifact to me.  But I
think that now is not the time to rethink those decisions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

26 April 2016, 20:46:16

On Mon, Apr 25, 2016 at 1:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Yeah, I'd be happy to have more people chime in.  I think your example
> is interesting, but it doesn't persuade me, because the rule has
> always been that EXPLAIN shows the output *columns*, not the output
> *rows*.  The disappearance of some rows doesn't change the list of
> output columns.  For scans and joins this rule is easy to apply; for
> aggregates, where many rows become one, less so.  Some of the existing
> choices there are certainly arguable, like the fact that FILTER is
> shown anywhere at all, which seems like an artifact to me.  But I
> think that now is not the time to rethink those decisions.

My proposed fix for this issue is attached.  Review is welcome;
otherwise, I'll just commit this.  The output looks like what I
suggested upthread:

rhaas=# explain verbose select count(*), corr(aid, bid) from pgbench_accounts ;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=742804.22..742804.23 rows=1 width=16)
   Output: count(*), corr((aid)::double precision, (bid)::double precision)
   ->  Gather  (cost=742804.00..742804.21 rows=2 width=40)
         Output: (PARTIAL count(*)), (PARTIAL corr((aid)::double
precision, (bid)::double precision))
         Workers Planned: 2
         ->  Partial Aggregate  (cost=741804.00..741804.01 rows=1 width=40)
               Output: PARTIAL count(*), PARTIAL corr((aid)::double
precision, (bid)::double precision)
               ->  Parallel Seq Scan on public.pgbench_accounts
(cost=0.00..616804.00 rows=12500000 width=8)
                     Output: aid, bid, abalance, filler
(9 rows)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

parallel-aggregate-explain.patch

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Tom Lane

Date:

26 April 2016, 21:06:20

Robert Haas <robertmhaas@gmail.com> writes:
> My proposed fix for this issue is attached.  Review is welcome;
> otherwise, I'll just commit this.  The output looks like what I
> suggested upthread:

I haven't read the patch, but the sample output looks sane.
        regards, tom lane

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

26 April 2016, 22:44:23

On 27 April 2016 at 08:46, Robert Haas <robertmhaas@gmail.com> wrote:
> My proposed fix for this issue is attached.  Review is welcome;
> otherwise, I'll just commit this.  The output looks like what I
> suggested upthread:

+ if (!aggref->aggpartial)
+ elog(ERROR, "referenced Aggref is not partial");

I think this is overly restrictive; ruleutils seems a bit out of line
here to say that plans can't have > 1 combine aggregate node.

When coding the combine aggs stuff I had test code to inject
additional combine paths to make sure everything worked as expected.
It did, but it won't if you add these two lines. I'd say just remove
them.

If you apply the attached and execute;

create table t1 (num int not null);
insert into t1 select generate_series(1,2000000);

explain verbose select avg(num) from t1;

You'll get;

ERROR: referenced Aggref is not partial

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

extra_combine_node.patch

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

27 April 2016, 00:37:33

On Tue, Apr 26, 2016 at 6:44 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On 27 April 2016 at 08:46, Robert Haas <robertmhaas@gmail.com> wrote:
>> My proposed fix for this issue is attached.  Review is welcome;
>> otherwise, I'll just commit this.  The output looks like what I
>> suggested upthread:
>
> + if (!aggref->aggpartial)
> + elog(ERROR, "referenced Aggref is not partial");
>
> I think this is overly restrictive; ruleutils seems a bit out of line
> here to say that plans can't have > 1 combine aggregate node.
>
> When coding the combine aggs stuff I had test code to inject
> additional combine paths to make sure everything worked as expected.
> It did, but it won't if you add these two lines. I'd say just remove
> them.
>
> If you apply the attached and execute;
>
> create table t1 (num int not null);
> insert into t1 select generate_series(1,2000000);
>
> explain verbose select avg(num) from t1;
>
> You'll get;
>
> ERROR: referenced Aggref is not partial

In this test patch, should aggpath be using partial_grouping_target
rather than target?  I feel like we need to use
partial_grouping_target unless finalizeAggs is true.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

27 April 2016, 01:14:25

On 27 April 2016 at 12:37, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Apr 26, 2016 at 6:44 PM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> On 27 April 2016 at 08:46, Robert Haas <robertmhaas@gmail.com> wrote:
>>> My proposed fix for this issue is attached.  Review is welcome;
>>> otherwise, I'll just commit this.  The output looks like what I
>>> suggested upthread:
>>
>> + if (!aggref->aggpartial)
>> + elog(ERROR, "referenced Aggref is not partial");
>>
>> I think this is overly restrictive; ruleutils seems a bit out of line
>> here to say that plans can't have > 1 combine aggregate node.
>>
>> When coding the combine aggs stuff I had test code to inject
>> additional combine paths to make sure everything worked as expected.
>> It did, but it won't if you add these two lines. I'd say just remove
>> them.
>>
>> If you apply the attached and execute;
>>
>> create table t1 (num int not null);
>> insert into t1 select generate_series(1,2000000);
>>
>> explain verbose select avg(num) from t1;
>>
>> You'll get;
>>
>> ERROR: referenced Aggref is not partial
>
> In this test patch, should aggpath be using partial_grouping_target
> rather than target?  I feel like we need to use
> partial_grouping_target unless finalizeAggs is true.

Yes. I should also have removed the HAVING clause from the path.

After having changed that, I do still get the error.

I changed it to a NOTICE for now;

# explain verbose select avg(num) from t1 having sum(num) > 0;
NOTICE:  referenced Aggref is not partial
NOTICE:  referenced Aggref is not partial                                           QUERY PLAN
---------------------------------------------------------------------------------------------------Finalize Aggregate
(cost=22350.24..22350.25rows=1 width=32)  Output: avg(num)  Filter: (sum(t1.num) > 0)  ->  Partial Aggregate
(cost=22350.22..22350.23rows=1 width=40)        Output: avg(num), sum(num)        ->  Gather  (cost=22350.00..22350.21
rows=2width=40)              Output: (PARTIAL avg(num)), (PARTIAL sum(num))              Workers Planned: 2
->  Partial Aggregate  (cost=21350.00..21350.01 rows=1 width=40)                    Output: PARTIAL avg(num), PARTIAL
sum(num)                   ->  Parallel Seq Scan on public.t1
 
(cost=0.00..17183.33 rows=833333 width=4)                          Output: num
(12 rows)

You can see that the middle aggregate node properly includes the
sum(num) from the having clause, so is using the partial target list.

I'd also have expected the output of both partial nodes to be the
same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
or have I made some other mistake?

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

27 April 2016, 01:56:17

On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> I'd also have expected the output of both partial nodes to be the
> same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
> or have I made some other mistake?

No, that's a defect in the patch.  I didn't consider that we need to
support nodes with finalizeAggs = false and combineStates = true,
which is why that ERROR was there.  Working on a fix now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

27 April 2016, 02:30:59

On Tue, Apr 26, 2016 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> I'd also have expected the output of both partial nodes to be the
>> same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
>> or have I made some other mistake?
>
> No, that's a defect in the patch.  I didn't consider that we need to
> support nodes with finalizeAggs = false and combineStates = true,
> which is why that ERROR was there.  Working on a fix now.

I think this version should work, provided you use
partial_grouping_target where needed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

parallel-aggregate-explain-v2.patch

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

27 April 2016, 02:57:32

On 27 April 2016 at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Apr 26, 2016 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
>> <david.rowley@2ndquadrant.com> wrote:
>>> I'd also have expected the output of both partial nodes to be the
>>> same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
>>> or have I made some other mistake?
>>
>> No, that's a defect in the patch.  I didn't consider that we need to
>> support nodes with finalizeAggs = false and combineStates = true,
>> which is why that ERROR was there.  Working on a fix now.
>
> I think this version should work, provided you use
> partial_grouping_target where needed.

+static void get_special_variable(Node *node, deparse_context *context,
+ void *private);

"private" is reserved in C++? I understood we want our C code to
compile as C++ too, right? or did I get my wires crossed somewhere?


-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Andres Freund

Date:

27 April 2016, 03:00:55

On 2016-04-27 14:57:24 +1200, David Rowley wrote:
> "private" is reserved in C++? I understood we want our C code to
> compile as C++ too, right? or did I get my wires crossed somewhere?

Just headers. Our C code unfortunately is very far away from being C++
compliant.

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

27 April 2016, 03:12:24

On Tue, Apr 26, 2016 at 10:57 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On 27 April 2016 at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Apr 26, 2016 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
>>> <david.rowley@2ndquadrant.com> wrote:
>>>> I'd also have expected the output of both partial nodes to be the
>>>> same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
>>>> or have I made some other mistake?
>>>
>>> No, that's a defect in the patch.  I didn't consider that we need to
>>> support nodes with finalizeAggs = false and combineStates = true,
>>> which is why that ERROR was there.  Working on a fix now.
>>
>> I think this version should work, provided you use
>> partial_grouping_target where needed.
>
> +static void get_special_variable(Node *node, deparse_context *context,
> + void *private);
>
> "private" is reserved in C++? I understood we want our C code to
> compile as C++ too, right? or did I get my wires crossed somewhere?

I can call it something other than "private", if you have a
suggestion; normally I would have used "context", but that's already
taken in this case.  private_context would work, I guess.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

27 April 2016, 03:38:18

On 27 April 2016 at 15:12, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Apr 26, 2016 at 10:57 PM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> On 27 April 2016 at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Tue, Apr 26, 2016 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
>>>> <david.rowley@2ndquadrant.com> wrote:
>>>>> I'd also have expected the output of both partial nodes to be the
>>>>> same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
>>>>> or have I made some other mistake?
>>>>
>>>> No, that's a defect in the patch.  I didn't consider that we need to
>>>> support nodes with finalizeAggs = false and combineStates = true,
>>>> which is why that ERROR was there.  Working on a fix now.
>>>
>>> I think this version should work, provided you use
>>> partial_grouping_target where needed.
>>
>> +static void get_special_variable(Node *node, deparse_context *context,
>> + void *private);
>>
>> "private" is reserved in C++? I understood we want our C code to
>> compile as C++ too, right? or did I get my wires crossed somewhere?
>
> I can call it something other than "private", if you have a
> suggestion; normally I would have used "context", but that's already
> taken in this case.  private_context would work, I guess.

It's fine. After Andres' email I looked and saw many other usages of
C++ keywords in our C code. Perhaps it would be a good idea to name it
something else we wanted to work towards it, but it sounds like it's
not, so probably keep what you've got.

The patch looks good. The only other thing I thought about was perhaps
it would be a good idea to re-add the sanity checks in execQual.c.
Patch for that is attached.

I removed the aggoutputtype check, as I only bothered adding that in
the first place because I lost the aggpartial field in some previous
revision of the parallel aggregate developments. I'd say the
aggpartial check makes it surplus to requirements.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

execQual_sanity_checks.patch

Re: EXPLAIN VERBOSE with parallel Aggregate

From

Robert Haas

Date:

27 April 2016, 16:08:15

On Tue, Apr 26, 2016 at 11:38 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On 27 April 2016 at 15:12, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Apr 26, 2016 at 10:57 PM, David Rowley
>> <david.rowley@2ndquadrant.com> wrote:
>>> On 27 April 2016 at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> On Tue, Apr 26, 2016 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>>> On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
>>>>> <david.rowley@2ndquadrant.com> wrote:
>>>>>> I'd also have expected the output of both partial nodes to be the
>>>>>> same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
>>>>>> or have I made some other mistake?
>>>>>
>>>>> No, that's a defect in the patch.  I didn't consider that we need to
>>>>> support nodes with finalizeAggs = false and combineStates = true,
>>>>> which is why that ERROR was there.  Working on a fix now.
>>>>
>>>> I think this version should work, provided you use
>>>> partial_grouping_target where needed.
>>>
>>> +static void get_special_variable(Node *node, deparse_context *context,
>>> + void *private);
>>>
>>> "private" is reserved in C++? I understood we want our C code to
>>> compile as C++ too, right? or did I get my wires crossed somewhere?
>>
>> I can call it something other than "private", if you have a
>> suggestion; normally I would have used "context", but that's already
>> taken in this case.  private_context would work, I guess.
>
> It's fine. After Andres' email I looked and saw many other usages of
> C++ keywords in our C code. Perhaps it would be a good idea to name it
> something else we wanted to work towards it, but it sounds like it's
> not, so probably keep what you've got.
>
> The patch looks good. The only other thing I thought about was perhaps
> it would be a good idea to re-add the sanity checks in execQual.c.
> Patch for that is attached.
>
> I removed the aggoutputtype check, as I only bothered adding that in
> the first place because I lost the aggpartial field in some previous
> revision of the parallel aggregate developments. I'd say the
> aggpartial check makes it surplus to requirements.

OK, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: EXPLAIN VERBOSE with parallel Aggregate

From

David Rowley

Date:

28 April 2016, 01:13:29

On 28 April 2016 at 04:08, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Apr 26, 2016 at 11:38 PM, David Rowley
>> The patch looks good. The only other thing I thought about was perhaps
>> it would be a good idea to re-add the sanity checks in execQual.c.
>> Patch for that is attached.
>>
>> I removed the aggoutputtype check, as I only bothered adding that in
>> the first place because I lost the aggpartial field in some previous
>> revision of the parallel aggregate developments. I'd say the
>> aggpartial check makes it surplus to requirements.
>
> OK, committed.

Great. Thank you.

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services