Re: Default setting for enable_hashagg_disk - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Default setting for enable_hashagg_disk
Date
Msg-id CAH2-Wzmt+b6=TQmavHKbsiD9g4R+iFaMEK89P88Bq5w=vgdnTQ@mail.gmail.com
Whole thread Raw
In response to Re: Default setting for enable_hashagg_disk  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Default setting for enable_hashagg_disk  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Mon, Jun 29, 2020 at 2:22 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Can you give and example of what you mean by being too permissive or too
> conservative? Do you mean the possibility of unlimited memory usage in
> v12, and strict enforcement in v13?

Yes -- that's all I meant.

> IMO enforcing the work_mem limit (in v13) is right in principle, but I
> do understand the concerns about unexpected regressions compared to v12.

Yeah. Both of these two things are true at the same time.

> I don't understand what you mean by "less memory" when the whole issue
> is significantly exceeding work_mem?

I was just reiterating what I said a few times already: Not using an
in-memory hash aggregate when the amount of memory required is high
but not prohibitively high is penny wise, pound foolish. It's easy to
imagine this actually using up more memory when an entire workload is
considered. This observation does not apply to a system that only ever
has one active connection at a time, but who cares about that?

> I don't think the OOM is the only negative performance here - using a
> lot of memory also forces eviction of data from page cache (although
> writing a lot of temporary files may have similar effect).

True.

> I agree with this, and I'm mostly OK with having hash_mem. In fact, from
> the proposals in this thread I like it the most - as long as it's used
> both during planning and execution. It's a pretty clear solution.

Great.

It's not trivial to write the patch, since there are a few tricky
cases. For example, maybe there is some subtlety in nodeAgg.c with
AGG_MIXED cases. Is there somebody else that knows that code better
than I do that wants to have a go at writing a patch?

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Default setting for enable_hashagg_disk
Next
From: Peter Geoghegan
Date:
Subject: Re: track_planning causing performance regression