Re: Extremely slow HashAggregate in simple UNION query - Mailing list pgsql-performance

From Felix Geisendörfer
Subject Re: Extremely slow HashAggregate in simple UNION query
Msg-id 87FEADEA-C74D-470B-9EC8-4C1CED040C25@felixge.de
In response to Re: Extremely slow HashAggregate in simple UNION query  (Andres Freund <andres@anarazel.de>)
Responses Re: Extremely slow HashAggregate in simple UNION query  (Andres Freund <andres@anarazel.de>)
List pgsql-performance
Hi,

> On 20. Aug 2019, at 19:32, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2019-08-20 17:11:58 +0200, Felix Geisendörfer wrote:
>>
>> HashAggregate  (cost=80020.01..100020.01 rows=2000000 width=8) (actual time=19.349..23.123 rows=1 loops=1)
>
> FWIW, that's not a mis-estimate I'm getting on master ;).  Obviously
> that doesn't actually address your concern...

I suppose this is thanks to the new optimizer support functions
mentioned by Michael and Pavel?

Of course I'm very excited about those improvements, but yeah, my
real query is mis-estimating things for totally different reasons not
involving any SRFs.

>> I'm certainly a novice when it comes to PostgreSQL internals, but I'm
>> wondering if this could be fixed by taking a more dynamic approach for
>> allocating HashAggregate hash tables?
>
> Under-sizing the hashtable just out of caution will add overhead to
> a lot more common cases. That requires copying data around during
> growth, which is far far from free. Or you can use hashtables that don't
> need to copy, but they're also considerably slower in the more common
> cases.

How does PostgreSQL currently handle the case where the initial hash
table is under-sized due to the planner having underestimated things?
Are the growth costs getting amortized by using an exponential growth
function?
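
To make the amortization question concrete, here is a toy sketch of what I
mean. It is not PostgreSQL's code, and I have not checked how the executor's
hash table actually grows; the function name and parameters are made up for
illustration. It only counts how many entries would have to be copied if a
table that starts too small is doubled every time it fills up:

```python
def copies_for_growth(n_items, initial_capacity=1, factor=2):
    """Count entry copies caused by resizing while inserting n_items.

    Hypothetical model: when the table is full, a new table that is
    `factor` times larger is allocated and every existing entry is moved.
    """
    capacity = initial_capacity
    size = 0
    copies = 0
    for _ in range(n_items):
        if size == capacity:
            # Table is full: grow it and move all current entries over.
            copies += size
            capacity *= factor
        size += 1
    return copies, capacity


if __name__ == "__main__":
    n = 1_000_000
    copies, capacity = copies_for_growth(n)
    # With doubling, total copies stay below 2 * n, so the growth cost
    # per inserted entry is bounded by a constant.
    print(n, copies, capacity)
```

In this model the copying work stays proportional to the number of entries
inserted, which is why I would expect an under-sized start to cost a constant
factor rather than something worse. Whether that constant is acceptable in the
common case is of course the point you raised.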

Anyway, I can accept that my situation is an edge case that doesn't justify
making things more complicated.

>> 3. Somehow EXPLAIN gets confused by this and only ends up tracking 23ms of the query execution instead of 45ms [5].
>
> Well, there's plenty work that's not attributed to nodes. IIRC we don't
> track executor startup/shutdown overhead on a per-node basis.

I didn't know that, thanks for clarifying :).

