Re: Default setting for enable_hashagg_disk - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Default setting for enable_hashagg_disk
Date
Msg-id 20200625195607.GA16766@momjian.us
Whole thread Raw
In response to Re: Default setting for enable_hashagg_disk  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Default setting for enable_hashagg_disk  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Thu, Jun 25, 2020 at 03:24:42PM -0400, Bruce Momjian wrote:
> On Thu, Jun 25, 2020 at 11:10:58AM -0700, Jeff Davis wrote:
> > On Wed, 2020-06-24 at 13:29 -0400, Tom Lane wrote:
> > > If we feel we need something to let people have the v12 behavior
> > > back, let's have
> > > (1) enable_hashagg on/off --- controls planner, same as it ever was
> > > (2) enable_hashagg_spill on/off --- controls executor by disabling
> > > spill
> > > 
> > > But I'm not really convinced that we need (2).
> > 
> > If we're not going to have a planner GUC, one alternative is to just
> > penalize the disk costs of HashAgg for a release or two. It would only
> > affect the cost of HashAgg paths that are expected to spill, which
> > weren't even generated in previous releases.
> > 
> > In other words, multiply the disk costs by enough that the planner will
> > usually not choose HashAgg if expected to spill unless the average
> > group size is quite large (i.e. there are a lot more tuples than
> > groups, but still enough groups to spill).
> 
> Well, the big question is whether this costing is actually more accurate
> than what we have now.  What I am hearing is that spilling hash agg is
> expensive, so whatever we can do to reflect the actual costs seems like
> a win.  If it can be done, it certainly seems better than a cost setting
> few people will use.

It is my understanding that spill of sorts is mostly read sequentially,
while hash reads are random.  Is that right?  Is that not being costed
properly?

That doesn't fix the misestimation case, but increasing work mem does
allow pre-PG 13 behavior there.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Open Item: Should non-text EXPLAIN always show properties?
Next
From: Tom Lane
Date:
Subject: Re: should libpq also require TLSv1.2 by default?