Re: Default setting for enable_hashagg_disk - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Default setting for enable_hashagg_disk
Date	June 25, 2020 18:16:23
Msg-id	20200625181623.b6g4goulga5fcabg@alap3.anarazel.de
In response to	Re: Default setting for enable_hashagg_disk (Jeff Davis <pgsql@j-davis.com>)
Responses	Re: Default setting for enable_hashagg_disk
List	pgsql-hackers

Hi,

On 2020-06-25 10:44:42 -0700, Jeff Davis wrote:
> There are only two possible paths: HashAgg and Sort+Group, and we need
> to pick one. If the planner expects one to spill, it is likely to
> expect the other to spill. If one spills in the executor, then the
> other is likely to spill, too. (I'm ignoring the case with a lot of
> tuples and few groups because that doesn't seem relevant.)

There's also ordered index scan + Group, which will often be vastly
better than Sort+Group but still slower than HashAgg.

> Imagine that there was only one path available to choose. Would you
> suggest the same thing, that unexpected spills can exceed work_mem but
> expected spills can't?

I'm not saying what I propose is perfect, but I've yet to hear a better
proposal. Given that there *are* different ways to implement
aggregation, and that we use expected costs to choose, I think the
assumed costs are relevant.

Greetings,

Andres Freund
