Re: Trouble with hashagg spill I/O pattern and costing - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: Trouble with hashagg spill I/O pattern and costing
Date:
Msg-id: 20200525220407.c7t5iqg272bb54nc@development
In response to: Re: Trouble with hashagg spill I/O pattern and costing (Jeff Davis <pgsql@j-davis.com>)
List: pgsql-hackers
On Mon, May 25, 2020 at 11:36:42AM -0700, Jeff Davis wrote:
>On Mon, 2020-05-25 at 04:10 +0200, Tomas Vondra wrote:
>>    algorithm    master    prealloc    tlist    prealloc-tlist
>>    ------------------------------------------------------------
>>    hash           1365         437      368               213
>>    sort            226         214      224               215
>>
>> The sort row simply means "enable_hashagg = off" and AFAIK the
>> patches should not have a lot of influence here - the prealloc
>> does, but it's fairly negligible.
>
>I also saw a small speedup from the prealloc patch for Sort. I wrote it
>off initially, but I'm wondering if there's something going on there.
>Perhaps drawing K elements from the minheap at once is better for
>caching? If so, that's good news, because it means the prealloc list is
>a win-win.
>

True.

>>    ->  Finalize HashAggregate
>>          Group Key: lineitem_1.l_partkey
>>          ->  Gather
>>                Workers Planned: 2
>>                ->  Partial HashAggregate
>>                      Group Key: lineitem_1.l_partkey
>>                      ->  Parallel Seq Scan on lineitem lineitem_1
>> (20 rows)
>
>Although each worker here only gets half the tuples, it will get
>(approximately) all of the *groups*. This may partly explain why the
>planner moves away from this plan when there are more workers: the
>number of hashagg batches doesn't go down much with more workers.
>
>It also might be interesting to know the estimate for the number of
>groups relative to the size of the table. If those two are close, it
>might look to the planner like processing the whole input in each
>worker isn't much worse than processing all of the groups in each
>worker.
>

Hmmm, yeah. The number of groups per worker is another moving part.

But isn't the number of groups per worker pretty much the same (and
equal to the total number of groups) in all plans? I mean, all the
plans (both hash and sort based) have this:

  ->  Finalize HashAggregate  (cost=18313351.98..18654393.43 rows=14949762 width=36)
        Group Key: lineitem_1.l_partkey
        Planned Partitions: 64
        ->  Gather  (cost=14993231.96..17967638.74 rows=14949762 width=36)
              Workers Planned: 1
              ->  Partial HashAggregate  (cost=14992231.96..16471662.54 rows=14949762 width=36)
                    Group Key: lineitem_1.l_partkey
                    Planned Partitions: 64
                    ->  Parallel Seq Scan on lineitem lineitem_1  (cost=0.00..11083534.91 rows=264715991 width=9)

I think it's rather that we actually expect this number of rows from
each worker, so the total cost is

    parallel_tuple_cost * num_of_workers * num_of_groups

But we just ignore this inherent cost when picking the number of
workers, don't we? Because we don't know how many rows will be produced
and passed to the Gather in the end.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
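
A back-of-the-envelope illustration of the "each worker gets
(approximately) all of the groups" point above. This sketch is not from
the original thread: the ~450M total row count is an assumption derived
loosely from the Parallel Seq Scan estimate, and it assumes rows are
dealt to the participating processes roughly at random. A group with
~30 rows is then entirely absent from one process's half with
probability about (1/2)^30, i.e. essentially never:

#include <stdio.h>
#include <math.h>

int main(void)
{
    double total_rows = 450e6;      /* rough lineitem size, an assumption */
    double groups     = 14949762;   /* the planner's group estimate above */
    double processes  = 2;          /* two shares, per the "half the
                                     * tuples" wording above */

    double rows_per_group = total_rows / groups;    /* ~30 */

    /* probability a given group never shows up in one process's share,
     * if rows are distributed uniformly at random */
    double p_missing = pow(1.0 - 1.0 / processes, rows_per_group);

    printf("rows per group     ~ %.1f\n", rows_per_group);
    printf("P(group unseen)    ~ %.2g\n", p_missing);
    printf("groups per process ~ %.0f of %.0f\n",
           groups * (1.0 - p_missing), groups);
    return 0;
}

So each process should indeed produce nearly all ~15M groups, which is
consistent with the observation that the number of hashagg batches
barely drops as workers are added.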
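
For reference, the Gather costs in the plan above line up exactly with
charging parallel_setup_cost once plus parallel_tuple_cost per row on
top of the Partial HashAggregate. A minimal sketch of that arithmetic
(not from the original message; it assumes the stock defaults
parallel_setup_cost = 1000 and parallel_tuple_cost = 0.1, and mimics the
shape of cost_gather() in costsize.c):

#include <stdio.h>

int main(void)
{
    /* stock defaults for the two cost GUCs (an assumption; a tuned
     * configuration would change the numbers) */
    const double parallel_setup_cost = 1000.0;
    const double parallel_tuple_cost = 0.1;

    /* Partial HashAggregate: cost=14992231.96..16471662.54 rows=14949762 */
    const double sub_startup = 14992231.96;
    const double sub_total   = 16471662.54;
    const double rows        = 14949762;

    /* Gather pays the setup cost once, plus a per-tuple toll on every
     * row the workers send up -- the "inherent cost" discussed above */
    double startup = sub_startup + parallel_setup_cost;
    double total   = sub_total + parallel_setup_cost
                     + parallel_tuple_cost * rows;

    printf("Gather  (cost=%.2f..%.2f)\n", startup, total);
    return 0;
}

This prints "Gather  (cost=14993231.96..17967638.74)", matching the
plan: roughly 1.5M cost units are charged just for funneling the ~15M
expected group rows through the Gather.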