Re: Default setting for enable_hashagg_disk

From: Jeff Davis
Subject: Re: Default setting for enable_hashagg_disk
Msg-id: 9dfde718f4994c14d88ab269b311b884cf5fa6bb.camel@j-davis.com
In response to: Re: Default setting for enable_hashagg_disk (Andres Freund <andres@anarazel.de>)
Responses: Re: Default setting for enable_hashagg_disk (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Thu, 2020-06-25 at 09:37 -0700, Andres Freund wrote:
> > Let's say you have work_mem=32MB and a query that's expected to use
> > 16MB of memory. In reality, it uses 64MB of memory. So you are
> > saying this query would get to use all 64MB of memory, right?
> >
> > But then you run ANALYZE. Now the query is (correctly) expected to
> > use 64MB of memory. Are you saying this query, executed again with
> > better stats, would only get to use 32MB of memory, and therefore
> > run slower?
> 
> Yes. I think that's ok, because it was taken into account from a
> costing perspective in the second case.

What do you mean by "taken into account"?

There are only two possible paths: HashAgg and Sort+Group, and we need
to pick one. If the planner expects one to spill, it is likely to
expect the other to spill. If one spills in the executor, then the
other is likely to spill, too. (I'm ignoring the case with a lot of
tuples and few groups because that doesn't seem relevant.)
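
For illustration, the two paths could be compared like this (a sketch;
the table "t" and column "k" are hypothetical):

SET work_mem = '32MB';

-- Path 1: HashAgg, which may spill to disk at execution time.
SET enable_hashagg = on;
EXPLAIN (ANALYZE) SELECT k, count(*) FROM t GROUP BY k;

-- Path 2: Sort+Group.
SET enable_hashagg = off;
EXPLAIN (ANALYZE) SELECT k, count(*) FROM t GROUP BY k;

With enough distinct values of k, both plans should report disk use in
the EXPLAIN ANALYZE output.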

Imagine that there were only one path available to choose from. Would
you suggest the same thing: that unexpected spills can exceed work_mem,
but expected spills can't?
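
To make that concrete, the scenario from the quoted text might look
like this (a sketch; the table "t" and column "k" are hypothetical):

SET work_mem = '32MB';

-- With stale statistics, the planner expects the hash table to fit in
-- ~16MB and picks HashAgg; at execution it actually needs ~64MB.
EXPLAIN (ANALYZE) SELECT k, count(*) FROM t GROUP BY k;

-- After refreshing statistics, the same query is (correctly) expected
-- to need ~64MB, which is more than work_mem.
ANALYZE t;
EXPLAIN (ANALYZE) SELECT k, count(*) FROM t GROUP BY k;

The question is whether the second run should be limited to 32MB and
spill, when the first run was allowed to use all 64MB.
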
Regards,
    Jeff Davis
