Re: EXPLAIN VERBOSE with parallel Aggregate - Mailing list pgsql-hackers

From Robert Haas
Subject Re: EXPLAIN VERBOSE with parallel Aggregate
Date
Msg-id CA+TgmoahAiMte8e--pHaxDpasdHerBiZfdqT9DCDCppo7OfTxw@mail.gmail.com
Whole thread Raw
In response to EXPLAIN VERBOSE with parallel Aggregate  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: EXPLAIN VERBOSE with parallel Aggregate  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Apr 13, 2016 at 11:03 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> There's 2 problems:
>
> 1)
>
> I recently noticed that EXPLAIN VERBOSE is a bit bogus when it comes
> to parallel aggregates with FILTER (WHERE ...) clauses.
>
> We get;
>
> Output: pg_catalog.sum((sum(num) FILTER (WHERE (num > 0)))) FILTER
> (WHERE (num > 0))
>
> Which is simply a lie, we only filter on the partial aggregate, not the combine.
>
> The attached patch just nullifies the combine aggregate's aggfilter
> clause during setrefs. We cannot nullify it any sooner, as the
> aggfilter is required so that we find the correct partial Aggref in
> search_indexed_tlist_for_partial_aggref().
>
> With the attached we get;
>
>  Output: pg_catalog.sum((sum(num) FILTER (WHERE (num > 0))))
>
> The patch is very simple.
>
> 2)
>
> I'm unsure if the schema prefix on the combine aggregate is a problem
> or not. The code path which generates it is rather unfortunate and is
> down to func_get_detail() returning FUNCDETAIL_NOTFOUND in
> generate_function_name() due to not being able to find a function
> named "sum" with the transtype as its only argument. I had thought
> that maybe we should add a pg_proc entry for the aggregate with the
> transtype, if not already covered by the entry for aggfnoid.
> Aggregates where the transtype is the same as the input type work just
> fine;
>
> Output: max((max(num)))
>
> The problem with that is adding the pg_proc entry still won't get rid
> of the schema as the "p_funcid == funcid" test in
> generate_function_name() will fail causing the schema qualification to
> occur again. But at least func_get_detail() would be successful in
> finding the function.
>
> Any one have any thoughts on if this is a problem?

I definitely agree that the current output is messed up, but I'm not
sure your proposed output is much better.  I wonder if it shouldn't
say something like:

Output: serialfn(transfn(args))

for the partial aggregate and

Output: finalfn(combinefn(deserialfn(args)))

for the finalize aggregate step.

Or maybe just insert the word PARTIAL before each partial aggregate
step, like PARTIAL sum(num) for the partial step and then just
sum(num) for the final step.  I think ending up with sum(sum(num)) is
right out.  It doesn't look so bad for that case but avg(avg(num))
would certainly imply something that's not the actual behavior.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Odd system-column handling in postgres_fdw join pushdown patch
Next
From: Ashutosh Bapat
Date:
Subject: Re: Odd system-column handling in postgres_fdw join pushdown patch