Re: Default setting for enable_hashagg_disk - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Default setting for enable_hashagg_disk
Msg-id 8fd82a9e6c00b73191d18940b2da6a94a30477af.camel@j-davis.com
In response to Re: Default setting for enable_hashagg_disk  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Default setting for enable_hashagg_disk  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Thu, 2020-06-25 at 13:17 -0400, Bruce Momjian wrote:
> Frankly, if it took me this long to get my head around this, I am
> unclear how many people will understand this tuning feature enough to
> actually use it.

The way I think about it is that v13 HashAgg is much more consistent
with the way we do everything else: the planner costs it (including any
spilling that is expected), and the executor executes it (including any
spilling that is required to obey work_mem).
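
To make the executor half of that concrete, here is a toy sketch of hash
aggregation that spills instead of exceeding a memory budget. It is not
PostgreSQL code and is not meant to describe the actual v13 implementation
in detail. The function name hash_agg_count, the max_groups parameter
(a stand-in for work_mem, counted in groups rather than bytes), and the
in-memory lists that stand in for on-disk partition files are all
assumptions made for illustration.

```python
from collections import defaultdict


def hash_agg_count(rows, max_groups, num_partitions=4, _depth=0):
    """Count rows per key while holding at most max_groups groups at once.

    Toy model only: max_groups plays the role of a memory limit, and the
    'spilled' lists play the role of partition files on disk.
    """
    if max_groups < 1:
        raise ValueError("max_groups must be at least 1")

    table = {}                    # the in-memory hash table
    spilled = defaultdict(list)   # partition number -> deferred rows

    for key in rows:
        if key in table:
            # Group already in memory: keep aggregating in place.
            table[key] += 1
        elif len(table) < max_groups:
            # Room for a new group.
            table[key] = 1
        else:
            # Table is full: defer the row to a partition chosen by hash,
            # so all rows for a given key land in the same partition.
            spilled[hash((key, _depth)) % num_partitions].append(key)

    # Groups in the table are complete, because a key is either always
    # in the table or always spilled during one pass.
    result = dict(table)

    # Each partition is strictly smaller than the input, so this ends.
    for part in spilled.values():
        result.update(
            hash_agg_count(part, max_groups, num_partitions, _depth + 1)
        )
    return result


rows = ["a", "b", "a", "c", "b", "a", "d"]
counts = hash_agg_count(rows, max_groups=2)
```

The result is the same whatever the limit is; only the amount of deferred
work changes, which is the part a cost model would need to account for.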

In earlier versions, HashAgg was weird. If we add GUCs to get that
weird behavior back, then the GUCs will necessarily be weird, and
therefore hard to document.

I would feel more comfortable with some kind of GUC escape hatch (or
two). GROUP BY is just too common, and I don't think we can ignore the
potential for users experiencing a regression of some type (even if, in
principle, the v13 version is better).

If we have the GUCs there, then at least if someone comes to the
mailing list with a problem, we can offer them a temporary solution,
and have time to try to avoid the problem in a future release (tweaking
estimates, cost model, defaults, etc.).

One idea is to have undocumented GUCs. That way we don't have to
support them forever, and we are more likely to hear problem reports.

Regards,
    Jeff Davis




