Re: Default setting for enable_hashagg_disk - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Default setting for enable_hashagg_disk
Date
Msg-id 1e75737369e1b6182ebaef7d6bbe098ad620bbb6.camel@j-davis.com
Whole thread Raw
In response to Re: Default setting for enable_hashagg_disk  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, 2020-06-24 at 15:28 -0400, Robert Haas wrote:
> On Wed, Jun 24, 2020 at 3:14 PM Andres Freund <andres@anarazel.de>
> wrote:
> > FWIW, my gut feeling is that we'll end up have to separate the
> > "execution time" spilling from using plain work mem, because it'll
> > trigger spilling too often. E.g. if the plan isn't expected to
> > spill,
> > only spill at 10 x work_mem or something like that.  Or we'll need
> > better management of temp file data when there's plenty memory
> > available.

...

> I think that's actually pretty appealing. Separating the memory we
> plan to use from the memory we're willing to use before spilling
> seems
> like a good idea in general, and I think we should probably also do
> it
> in other places - like sorts.

I'm trying to make sense of this. Let's say there are two GUCs:
planner_work_mem=16MB and executor_work_mem=32MB.

And let's say a query comes along and generates a HashAgg path, and the
planner (correctly) thinks if you put all the groups in memory at once,
it would be 24MB. Then the planner, using planner_work_mem, would think
spilling was necessary, and generate a cost that involves spilling.

Then it's going to generate a Sort+Group path, as well. And perhaps it
estimates that sorting all of the tuples in memory would also take
24MB, so it generates a cost that involves spilling to disk.

But it has to choose one of them. We've penalized plans at risk of
spilling to disk, but what's the point? The planner needs to choose one
of them, and both are at risk of spilling to disk.

Regards,
    Jeff Davis





pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Why forbid "INSERT INTO t () VALUES ();"
Next
From: Jeff Davis
Date:
Subject: Re: Default setting for enable_hashagg_disk