From: Peter Geoghegan
Subject: Re: Default setting for enable_hashagg_disk
Msg-id: CAH2-Wzn6M7FwwgXe-mSnmCuMuF9OnFZW=m0X6JMOEYt4VfJTgQ@mail.gmail.com
In response to: Re: Default setting for enable_hashagg_disk (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
On Sat, Jun 27, 2020 at 3:41 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Well, there are multiple ideas discussed in this (sub)thread, one of
> them being a per-query memory limit. That requires deciding how much
> memory different nodes should get, which I think would need to be
> cost-based.

A design like that probably makes sense. But it's way out of scope for
Postgres 13, and not something that should be discussed further on
this thread IMV.

> Of course, a simpler scheme like this would not require that. And maybe
> introducing hash_mem is a good idea - I'm not particularly opposed to
> that, actually. But I think we should not introduce separate memory
> limits for each node type, which was also mentioned earlier.

I had imagined that hash_mem would apply only to hash join and hash
aggregate. It would be a GUC that either represents a multiple of
work_mem or an absolute work_mem-style value in kilobytes.
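
For concreteness, something like this (a sketch only -- hash_mem does
not exist, so the name, units, and syntax here are assumptions, not a
shipped GUC):

  -- Hypothetical variant 1: an absolute, work_mem-style limit that
  -- applies only to hash-based nodes
  SET hash_mem = '256MB';

  -- Hypothetical variant 2: a multiplier applied to work_mem, so that
  -- hash-based nodes get 2x whatever work_mem happens to be
  SET work_mem = '64MB';
  SET hash_mem = 2.0;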

> The problem of course is that hash_mem does not really solve the issue
> discussed at the beginning of this thread, i.e. regressions due to
> underestimates and unexpected spilling at execution time.

Like Andres, I suspect that that's a smaller problem in practice. A
hash aggregate that spills often has performance characteristics
somewhat like a group aggregate + sort anyway. I'm worried about
cases where an *in-memory* hash aggregate is naturally far, far faster
than the other strategies, and yet we can't use it -- despite the fact
that Postgres 12 could "safely" do so. (It probably doesn't matter
whether the slow plan you get in Postgres 13 is a hash aggregate that
spills or something else -- this is not really a costing problem.)
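
To make the comparison concrete (a sketch only -- the table and column
names are made up, and the plan you actually get depends on the data
and on your settings):

  SET work_mem = '4MB';
  EXPLAIN (ANALYZE, COSTS OFF)
  SELECT col, count(*) FROM big_table GROUP BY col;

On v12, a hash aggregate whose hash table turns out to be bigger than
work_mem (because the planner underestimated the number of groups)
simply keeps building it in memory; on v13 it spills to disk once it
reaches work_mem, or the planner avoids hash aggregation altogether.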

Besides, hash_mem *can* solve that problem to some extent. Other cases
(cases where the estimates are so bad that hash_mem won't help) seem
like less of a concern to me. To some extent, that's the price you pay
to avoid the risk of an OOM.

> The thread is getting a rather confusing mix of proposals for how to
> fix that for v13 and proposals for how to improve our configuration
> of memory limits :-(

As I said to Amit in my last message, I think that all of the ideas
that are worth pursuing involve giving hash aggregate nodes license to
use more memory than other nodes. One variant involves doing so only
at execution time, while the hash_mem idea involves formalizing and
documenting that hash-based nodes are special -- and taking that into
account during both planning and execution.

> Interesting. What is not entirely clear to me is how these databases
> decide how much each node should get during planning. With the
> separate work_mem-like settings it's fairly obvious, but how do they
> do that with a global limit (either per-instance or per-query)?

I don't know, but that seems like a much more advanced way of
approaching the problem. It isn't in scope here.

Perhaps I'm not considering some unintended consequence of the planner
giving hash-based nodes extra memory "for free" in the common case
where hash_mem exceeds work_mem (by 2x, say). But my guess is that it
won't be a significant problem, or bias the planner in some obviously
undesirable way.
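
Simulating that by hand today looks something like this (a sketch only
-- the table is made up, and varying work_mem is just a stand-in for a
real hash_mem setting):

  SET work_mem = '32MB';
  EXPLAIN SELECT customer_id, sum(amount)
  FROM orders GROUP BY customer_id;
  -- If the estimated hash table doesn't fit in 32MB, the planner may
  -- fall back to a sort + group aggregate.

  SET work_mem = '64MB';
  EXPLAIN SELECT customer_id, sum(amount)
  FROM orders GROUP BY customer_id;
  -- With twice the budget ("hash_mem = 2x work_mem"), the same query
  -- can be planned as an in-memory hash aggregate.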

-- 
Peter Geoghegan


