Re: EXPLAIN VERBOSE with parallel Aggregate - Mailing list pgsql-hackers

From David Rowley
Subject Re: EXPLAIN VERBOSE with parallel Aggregate
Date
Msg-id CAKJS1f8OhCnR6ZHE3NdPvc4xWNJJz_FK15P29Tt=hmiT8NSxJA@mail.gmail.com
Whole thread Raw
In response to Re: EXPLAIN VERBOSE with parallel Aggregate  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: EXPLAIN VERBOSE with parallel Aggregate  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 27 April 2016 at 12:37, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Apr 26, 2016 at 6:44 PM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> On 27 April 2016 at 08:46, Robert Haas <robertmhaas@gmail.com> wrote:
>>> My proposed fix for this issue is attached.  Review is welcome;
>>> otherwise, I'll just commit this.  The output looks like what I
>>> suggested upthread:
>>
>> + if (!aggref->aggpartial)
>> + elog(ERROR, "referenced Aggref is not partial");
>>
>> I think this is overly restrictive; ruleutils seems a bit out of line
>> here to say that plans can't have > 1 combine aggregate node.
>>
>> When coding the combine aggs stuff I had test code to inject
>> additional combine paths to make sure everything worked as expected.
>> It did, but it won't if you add these two lines. I'd say just remove
>> them.
>>
>> If you apply the attached and execute;
>>
>> create table t1 (num int not null);
>> insert into t1 select generate_series(1,2000000);
>>
>> explain verbose select avg(num) from t1;
>>
>> You'll get;
>>
>> ERROR: referenced Aggref is not partial
>
> In this test patch, should aggpath be using partial_grouping_target
> rather than target?  I feel like we need to use
> partial_grouping_target unless finalizeAggs is true.

Yes. I should also have removed the HAVING clause from the path.

After having changed that, I do still get the error.

I changed it to a NOTICE for now;

# explain verbose select avg(num) from t1 having sum(num) > 0;
NOTICE:  referenced Aggref is not partial
NOTICE:  referenced Aggref is not partial                                           QUERY PLAN
---------------------------------------------------------------------------------------------------Finalize Aggregate
(cost=22350.24..22350.25rows=1 width=32)  Output: avg(num)  Filter: (sum(t1.num) > 0)  ->  Partial Aggregate
(cost=22350.22..22350.23rows=1 width=40)        Output: avg(num), sum(num)        ->  Gather  (cost=22350.00..22350.21
rows=2width=40)              Output: (PARTIAL avg(num)), (PARTIAL sum(num))              Workers Planned: 2
->  Partial Aggregate  (cost=21350.00..21350.01 rows=1 width=40)                    Output: PARTIAL avg(num), PARTIAL
sum(num)                   ->  Parallel Seq Scan on public.t1
 
(cost=0.00..17183.33 rows=833333 width=4)                          Output: num
(12 rows)

You can see that the middle aggregate node properly includes the
sum(num) from the having clause, so is using the partial target list.

I'd also have expected the output of both partial nodes to be the
same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
or have I made some other mistake?

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Re: [COMMITTERS] pgsql: pg_upgrade: Convert old visibility map format to new format.
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Support for N synchronous standby servers - take 2