Re: Default setting for enable_hashagg_disk - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Default setting for enable_hashagg_disk
Date
Msg-id CAH2-WzkwrhhdVudaYNvzrZQcGd4skahT6cuYMdOPagh=TyHFyA@mail.gmail.com
Whole thread Raw
In response to Re: Default setting for enable_hashagg_disk  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Default setting for enable_hashagg_disk  ("David G. Johnston" <david.g.johnston@gmail.com>)
Re: Default setting for enable_hashagg_disk  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Jul 13, 2020 at 7:25 AM David Rowley <dgrowleyml@gmail.com> wrote:
> I think it would be good if we could try to move towards getting
> consensus here rather than reiterating our arguments over and over.

+1

> Updated summary:
> * For hash_mem = Tomas [7], Justin [16]
> * For hash_mem_multiplier with a default > 1.0 = DavidG [21]
> * For hash_mem_multiplier with default = 1.0 =  PeterG [15][0], Tom [20][24]
> * hash_mem out of scope for PG13 = Bruce [8], Andres [9]
> * hashagg_mem default to -1 meaning use work_mem = DavidR [23] (2nd preference)
> * Escape hatch that can be removed later when we get something better
> = Jeff [11], DavidR [12], Pavel [13], Andres [14], Justin [1]
> * Add enable_hashagg_spill = Tom [2] (I'm unclear on this proposal.
> Does it affect the planner or executor or both?) (updated opinion in
> [20])
> * Maybe do nothing until we see how things go during beta = Bruce [3], Amit [10]
> * Just let users set work_mem = Stephen [21], Alvaro [4] (Alvaro
> changed his mind after Andres pointed out that changes other nodes in
> the plan too [25])
> * Swap enable_hashagg for a GUC that specifies when spilling should
> occur. -1 means work_mem = Robert [17], Amit [18]
> * hash_mem does not solve the problem = Tomas [6] (changed his mind in [7])

I don't think that hashagg_mem needs to be considered here, because
you were the only one that spoke out in favor of that idea, and it's
your second preference in any case (maybe Tom was in favor of such a
thing at one point, but he clearly favors hash_mem/hash_mem_multiplier
now so it hardly matters). I don't think that hashagg_mem represents a
meaningful compromise between the escape hatch and
hash_mem/hash_mem_multiplier in any case. (I would *prefer* the escape
hatch to hashagg_mem, since at least the escape hatch is an "honest"
escape hatch.)

ISTM that there are three basic approaches to resolving this open item
that remain:

1. Do nothing.

2. Add an escape hatch.

3. Add hash_mem/hash_mem_multiplier.

Many people (e.g., Tom, Jeff, you, Andres, myself) have clearly
indicated that doing nothing is simply a non-starter. It's not just
that it doesn't get a lot of votes -- it's something that is strongly
opposed. We can rule it out right away.

This is where it gets harder. Many of us have views that are won't
easily fit into buckets. For example, even though I myself proposed
hash_mem/hash_mem_multiplier, I've said that I can live with the
escape hatch. Similarly, Jeff favors the escape hatch, but has said
that he can live with hash_mem/hash_mem_multiplier. And, Andres said
to me privately that he thinks that hash_mem could be a good idea,
even though he opposes it now due to release management
considerations.

Even still, I think that it's possible to divide people into two camps
on this without grossly misrepresenting anybody.

Primarily in favor of escape hatch:

Jeff,
DavidR,
Pavel,
Andres,
Robert ??,
Amit ??

Primarily in favor of hash_mem/hash_mem_multiplier:

PeterG,
Tom,
Alvaro,
Tomas,
Justin,
DavidG,
Jonathan Katz

There are clear problems with this summary, including for example the
fact that Robert weighed in before the hash_mem/hash_mem_multiplier
proposal was even on the table. What he actually said about it [1]
seems closer to hash_mem, so I feel that putting him in that bucket is
a conservative assumption on my part. Same goes for Amit, who warmed
to the idea of hash_mem_multiplier recently. (Though I probably got
some detail wrong, in which case please correct me.)

ISTM that there is a majority of opinion in favor of
hash_mem/hash_mem_multiplier. If you assume that I have this wrong,
and that we're simply deadlocked, then it becomes a matter for the
RMT. I strongly doubt that that changes the overall outcome, since
this year's RMT members happen to all be in favor of the
hash_mem/hash_mem_multiplier proposal on an individual basis.

[1] https://www.postgresql.org/message-id/CA+TgmobyV9+T-Wjx-cTPdQuRCgt1THz1mL3v1NXC4m4G-H6Rcw@mail.gmail.com
-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Fabrízio de Royes Mello
Date:
Subject: Re: pg_dump bug for extension owned tables
Next
From: Tom Lane
Date:
Subject: Re: Proposal: Automatic partition creation