Re: possible optimization: push down aggregates - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: possible optimization: push down aggregates
Date
Msg-id CAFj8pRDiRxnipzAukjhUprmDsY5tW38H9yeXjdX4AMx60Kf+iA@mail.gmail.com
Whole thread Raw
In response to Re: possible optimization: push down aggregates  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-hackers



2014-08-27 21:46 GMT+02:00 Claudio Freire <klaussfreire@gmail.com>:
On Wed, Aug 27, 2014 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Aug 27, 2014 at 2:07 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> Hi
>>
>> one user asked about using a partitioning for faster aggregates queries.
>>
>> I found so there is not any optimization.
>>
>> create table x1(a int, d date);
>> create table x_1 ( check(d = '2014-01-01'::date)) inherits(x1);
>> create table x_2 ( check(d = '2014-01-02'::date)) inherits(x1);
>> create table x_3 ( check(d = '2014-01-03'::date)) inherits(x1);
>>
>> When I have this schema, then optimizer try to do
>>
>> postgres=# explain verbose select max(a) from x1 group by d order by d;
>>                                    QUERY PLAN
>> --------------------------------------------------------------------------------
>>  GroupAggregate  (cost=684.79..750.99 rows=200 width=8)
>>    Output: max(x1.a), x1.d
>>    Group Key: x1.d
>>    ->  Sort  (cost=684.79..706.19 rows=8561 width=8)
>>          Output: x1.d, x1.a
>>          Sort Key: x1.d
>>          ->  Append  (cost=0.00..125.60 rows=8561 width=8)
>>                ->  Seq Scan on public.x1  (cost=0.00..0.00 rows=1 width=8)
>>                      Output: x1.d, x1.a
>>                ->  Seq Scan on public.x_1  (cost=0.00..31.40 rows=2140
>> width=8)
>>                      Output: x_1.d, x_1.a
>>                ->  Seq Scan on public.x_2  (cost=0.00..31.40 rows=2140
>> width=8)
>>                      Output: x_2.d, x_2.a
>>                ->  Seq Scan on public.x_3  (cost=0.00..31.40 rows=2140
>> width=8)
>>                      Output: x_3.d, x_3.a
>>                ->  Seq Scan on public.x_4  (cost=0.00..31.40 rows=2140
>> width=8)
>>                      Output: x_4.d, x_4.a
>>  Planning time: 0.333 ms
>>
>> It can be reduced to:
>>
>>   sort by d
>>   Append
>>        Aggegate (a), d
>>          seq scan from x_1
>>        Aggregate (a), d
>>          seq scan from x_2
>>
>> Are there some plans to use partitioning for aggregation?
>
> Besides min/max, what other aggregates (mean/stddev come to mind)
> would you optimize and how would you determine which ones could be?
> Where is that decision made?


You can't with mean and stddev, only with associative aggregates.

That's min, max, sum, bit_and, bit_or, bool_and, bool_or, count.

I don't think

I have a partitions by X .. and my query has group by clause GROUP BY X

so I can calculate any aggregate

Pavel

pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: possible optimization: push down aggregates
Next
From: Merlin Moncure
Date:
Subject: Re: possible optimization: push down aggregates