Re: Default setting for enable_hashagg_disk - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Default setting for enable_hashagg_disk
Date
Msg-id 20200712123648.js76j6ablk5nbxpo@development
In response to Re: Default setting for enable_hashagg_disk  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers
On Sat, Jul 11, 2020 at 10:26:22PM -0700, David G. Johnston wrote:
>On Saturday, July 11, 2020, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> "David G. Johnston" <david.g.johnston@gmail.com> writes:
>> > On Sat, Jul 11, 2020 at 5:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> >> It seems like a lot of the disagreement here is focused on Peter's
>> >> proposal to make hash_mem_multiplier default to 2.0.  But it doesn't
>> >> seem to me that that's a critical element of the proposal.  Why not just
>> >> make it default to 1.0, thus keeping the default behavior identical
>> >> to what it is now?
>>
>> > If we don't default it to something other than 1.0 we might as well just
>> > make it memory units and let people decide precisely what they want to
>> > use instead of adding the complexity of a multiplier.
>>
>> Not sure how that follows?  The advantage of a multiplier is that it
>> tracks whatever people might do to work_mem automatically.
>
>I was thinking that setting -1 would basically do that.
>

I think Tom meant that the multiplier would automatically track any
changes to work_mem and adjust hash_mem accordingly. With -1 (and
the GUC in units) you could only keep it exactly equal to work_mem; as
soon as you set it to anything else, you'd have to update both values
every time you change work_mem.
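
To make the difference concrete, here is a rough Python sketch of the two
schemes. It is only an illustration, not PostgreSQL code: the function names
are made up, sizes are in kB, and the "units" variant assumes the -1
convention discussed in this thread.

```python
MB = 1024  # sizes below are expressed in kB


def hash_mem_with_multiplier(work_mem_kb: int, multiplier: float) -> int:
    """Multiplier scheme: the hash limit follows work_mem automatically."""
    return int(work_mem_kb * multiplier)


def hash_mem_with_units(work_mem_kb: int, hash_mem_kb: int = -1) -> int:
    """Units scheme: -1 means "use work_mem", anything else is absolute."""
    return work_mem_kb if hash_mem_kb == -1 else hash_mem_kb


if __name__ == "__main__":
    for work_mem_kb in (32 * MB, 64 * MB):
        scaled = hash_mem_with_multiplier(work_mem_kb, 1.5)
        fixed = hash_mem_with_units(work_mem_kb, 48 * MB)
        default = hash_mem_with_units(work_mem_kb)
        print(f"work_mem={work_mem_kb // MB}MB "
              f"multiplier(1.5)={scaled // MB}MB "
              f"units(48MB)={fixed // MB}MB "
              f"units(-1)={default // MB}MB")
```

With the multiplier, raising work_mem from 32MB to 64MB moves the hash limit
along with it; with an explicit value in units the hash limit stays put until
someone changes it too.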

>>   In general
>> I'd view work_mem as the base value that people twiddle to control
>> executor memory consumption.  Having to also twiddle this other value
>> doesn't seem especially user-friendly.
>
>
>I’ll admit I don’t have a feel for what is or is not user-friendly when
>setting these GUCs in a session to override the global defaults.  But as
>far as the global defaults I say it’s a wash between (32mb, -1) -> (32mb,
>48mb) and (32mb, 1.0) -> (32mb, 1.5)
>
>If you want 96mb for the session/query hash setting it to 96mb is
>invariant, while setting it to 3.0 means it can change in the future if the
>system work_mem changes.  Knowing the multiplier is 1.5 and choosing 64mb
>for work_mem in the session is possible but also mutable and has
>side-effects.  If the user is going to set both values to make it invariant
>we are back to it being a wash.
>
>I don’t believe using a multiplier will promote better comprehension for
>why this setting exists compared to “-1 means use work_mem but you can
>override a subset if you want.”
>
>Is having a session level memory setting be mutable something we want to
>introduce?
>
>Is it more user-friendly?
>

I still think it should be in simple units, TBH. We already have a
somewhat similar situation with cost parameters, where we often say that
seq_page_cost = 1.0 is the baseline for the other cost parameters, yet
we have not coded those as multipliers.

>> >> If we find that's a poor default, we can always change it later;
>> >> but it seems to me that the evidence for a higher default is
>> >> a bit thin at this point.
>>
>> > So "your default is 1.0 unless you installed the new database on or after
>> > 13.4 in which case it's 2.0"?
>>
>> What else would be new?  See e.g. 848ae330a.  (Note I'm not suggesting
>> that we'd change it in a minor release.)
>>
>
>Minor release update is what I had thought, and to an extent was making
>possible by not using the multiplier upfront.
>
>I agree options are wide open come v14 and beyond.
>
>David J.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 


