Re: Default setting for enable_hashagg_disk - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Default setting for enable_hashagg_disk
Date
Msg-id CA+TgmobVAQH8USLTwuDGv9Ry9p5ay84uFWLa0oqDXOt9Diygtw@mail.gmail.com
Whole thread Raw
In response to Re: Default setting for enable_hashagg_disk  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Tue, Jul 14, 2020 at 6:49 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Maybe I missed your point here. The problem is not so much that we'll
> get HashAggs that spill -- there is nothing intrinsically wrong with
> that. While it's true that the I/O pattern is not as sequential as a
> similar group agg + sort, that doesn't seem like the really important
> factor here. The really important factor is that in-memory HashAggs
> can be blazingly fast relative to *any* alternative strategy -- be it
> a HashAgg that spills, or a group aggregate + sort that doesn't spill,
> whatever. We're mostly concerned about keeping the one available fast
> strategy than we are about getting a new, generally slow strategy.

I don't know; it depends. Like, if the less-sequential I/O pattern
that is caused by a HashAgg is not really any slower than a
Sort+GroupAgg, then whatever. The planner might as well try a HashAgg
- because it will be fast if it stays in memory - and if it doesn't
work out, we've lost little by trying. But if a Sort+GroupAgg is
noticeably faster than a HashAgg that ends up spilling, then there is
a potential regression. I thought we had evidence that this was a real
problem, but if that's not the case, then I think we're fine as far as
v13 goes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Floris Van Nee
Date:
Subject: RE: [PATCH] Keeps tracking the uniqueness with UniqueKey
Next
From: Pavel Borisov
Date:
Subject: Re: [PATCH] fix GIN index search sometimes losing results