Re: Combining Aggregates - Mailing list pgsql-hackers

From David Rowley
Subject Re: Combining Aggregates
Date
Msg-id CAKJS1f8rPq14pZWH_HmLH1gWPcRwzsyUOykaCTTPHdo5xacf7Q@mail.gmail.com
Whole thread Raw
In response to Re: Combining Aggregates  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Combining Aggregates  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On 19 January 2016 at 06:03, Pavel Stehule <pavel.stehule@gmail.com> wrote:


>
> # explain analyze select a%1000000,length(string_agg(b,',')) from ab group
> by 1;
>                                                         QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------
>  GroupAggregate  (cost=119510.84..144510.84 rows=1000000 width=32) (actual
> time=538.938..1015.278 rows=1000000 loops=1)
>    Group Key: ((a % 1000000))
>    ->  Sort  (cost=119510.84..122010.84 rows=1000000 width=32) (actual
> time=538.917..594.194 rows=1000000 loops=1)
>          Sort Key: ((a % 1000000))
>          Sort Method: quicksort  Memory: 102702kB
>          ->  Seq Scan on ab  (cost=0.00..19853.00 rows=1000000 width=32)
> (actual time=0.016..138.964 rows=1000000 loops=1)
>  Planning time: 0.146 ms
>  Execution time: 1047.511 ms
>
>
> Patched
> # explain analyze select a%1000000,length(string_agg(b,',')) from ab group
> by 1;
>                                                        QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------
>  HashAggregate  (cost=24853.00..39853.00 rows=1000000 width=32) (actual
> time=8072.346..144424.872 rows=1000000 loops=1)
>    Group Key: (a % 1000000)
>    ->  Seq Scan on ab  (cost=0.00..19853.00 rows=1000000 width=32) (actual
> time=0.025..481.332 rows=1000000 loops=1)
>  Planning time: 0.164 ms
>  Execution time: 263288.332 ms

Well, that's pretty odd.  I guess the plan change must be a result of
switching the transition type from internal to text, although I'm not
immediately certain why that would make a difference.

It is strange, why hashaggregate is too slow?

Good question. I looked at this and found my VM was swapping like crazy. Upon investigation it appears that's because, since the patch creates a memory context per aggregated group, and in this case I've got 1 million of them, it means we create 1 million context, which are ALLOCSET_SMALL_INITSIZE (1KB) in size, which means about 1GB of memory, which is more than my VM likes. 

set work_mem = '130MB' does coax the planner into a GroupAggregate plan, which is faster, but due to the the hash agg executor code not giving any regard to work_mem. If I set work_mem to 140MB (which is more realistic for this VM), it does cause the same swapping problems to occur.  Probably setting aggtransspace for this aggregate to 1024 would help the costing problem, but it would also cause hashagg to be a less chosen option during planning. 

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: PATCH: postpone building buckets to the end of Hash (in HashJoin)
Next
From: Robert Haas
Date:
Subject: Re: Combining Aggregates