Re: Trouble with hashagg spill I/O pattern and costing - Mailing list pgsql-hackers

From: Tomas Vondra
Subject: Re: Trouble with hashagg spill I/O pattern and costing
Date: 2020-05-21
Msg-id: 20200521134122.3rpmdolam4fgc5pv@development
In response to: Trouble with hashagg spill I/O pattern and costing (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: Trouble with hashagg spill I/O pattern and costing (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
On Tue, May 19, 2020 at 05:12:02PM +0200, Tomas Vondra wrote:
>
> ...
>
>The problem is that the hashagg plan runs in ~1400 seconds, while the
>groupagg only takes ~360. And per explain analyze, the difference really
>is in the aggregation - if we subtract the seqscan, the sort+groupagg
>takes about 310s:
>
>    ->  GroupAggregate  (cost=41772791.17..43305665.51 rows=6206695 width=36) (actual time=283378.004..335611.192 rows=6398981 loops=1)
>          Group Key: lineitem_1.l_partkey
>          ->  Sort  (cost=41772791.17..42252715.81 rows=191969856 width=9) (actual time=283377.977..306182.393 rows=191969841 loops=1)
>                Sort Key: lineitem_1.l_partkey
>                Sort Method: external merge  Disk: 3569544kB
>                ->  Seq Scan on lineitem lineitem_1  (cost=0.00..5519079.56 rows=191969856 width=9) (actual time=0.019..28253.076 rows=192000551 loops=1)
>
>while the hashagg takes ~1330s:
>
>    ->  HashAggregate  (cost=13977751.34..15945557.39 rows=6206695 width=36) (actual time=202952.170..1354546.897 rows=6400000 loops=1)
>          Group Key: lineitem_1.l_partkey
>          Planned Partitions: 128
>          Peak Memory Usage: 4249 kB
>          Disk Usage: 26321840 kB
>          HashAgg Batches: 16512
>          ->  Seq Scan on lineitem lineitem_1  (cost=0.00..5519079.56 rows=191969856 width=9) (actual time=0.007..22205.617 rows=192000551 loops=1)
>
>And that's while only writing 26GB, compared to 35GB in the sorted plan,
>and with cost being ~16M vs. ~43M (so roughly inverse).
>

I've noticed I've actually made a mistake here - it's not 26GB vs. 35GB
in hash vs. sort, it's 26GB vs. 3.5GB. That is, the sort-based plan
writes out *way less* data to the temp file.
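
For reference, both plans come from the same aggregate query. Judging by
the Output lists in the verbose plans below, it's roughly this shape
(my reconstruction, not the exact statement; it resembles the inner
aggregate of TPC-H Q17), and one can flip between the two plans by
toggling enable_hashagg:

   -- approximate query shape, reconstructed from the verbose output
   SELECT l_partkey, 0.2 * avg(l_quantity)
     FROM lineitem
    GROUP BY l_partkey;

   -- force the sort + GroupAggregate plan instead of the hashagg
   SET enable_hashagg = off;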

The reason is revealed by explain verbose:

   ->  GroupAggregate
         Output: lineitem_1.l_partkey, (0.2 * avg(lineitem_1.l_quantity))
         Group Key: lineitem_1.l_partkey
         ->  Sort
               Output: lineitem_1.l_partkey, lineitem_1.l_quantity
               Sort Key: lineitem_1.l_partkey
               ->  Seq Scan on public.lineitem lineitem_1
                     Output: lineitem_1.l_partkey, lineitem_1.l_quantity

   ->  HashAggregate
         Output: lineitem_1.l_partkey, (0.2 * avg(lineitem_1.l_quantity))
         Group Key: lineitem_1.l_partkey
         ->  Seq Scan on public.lineitem lineitem_1
               Output: lineitem_1.l_orderkey, lineitem_1.l_partkey,
                       lineitem_1.l_suppkey, lineitem_1.l_linenumber,
                       lineitem_1.l_quantity, lineitem_1.l_extendedprice,
                       lineitem_1.l_discount, lineitem_1.l_tax,
                       lineitem_1.l_returnflag, lineitem_1.l_linestatus,
                       lineitem_1.l_shipdate, lineitem_1.l_commitdate,
                       lineitem_1.l_receiptdate, lineitem_1.l_shipinstruct,
                       lineitem_1.l_shipmode, lineitem_1.l_comment

It seems that in the hashagg case we're not applying the projection in
the seqscan, forcing us to serialize way too much data (essentially the
whole lineitem table).
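
A back-of-the-envelope check (my arithmetic, using the row counts and
disk usage from the plans above) is consistent with that:

   26321840 kB / ~192M rows = ~140 B/row   (about a full lineitem tuple)
    3569544 kB / ~192M rows =  ~19 B/row   (width 9 plus, presumably,
                                            per-tuple overhead)

That is, the hashagg spill writes roughly a whole heap tuple per row,
while the sort only spills the projected (l_partkey, l_quantity) pair.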

It's probably still worth tweaking the I/O pattern, though.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


