On Sat, 11 Jul 2020 at 14:02, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Fri, Jul 10, 2020 at 6:19 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > If we have to have a new GUC, my preference would be hashagg_mem,
> > where -1 means use work_mem and a value between 64 and MAX_KILOBYTES
> > would mean use that value. We'd need some sort of check hook to
> > disallow 0-63. I really am just failing to comprehend why we're
> > contemplating changing the behaviour of Hash Join here.
>
> I don't understand why parititonwise hash join consumes work_mem in
> the way it does. I assume that the reason is something like "because
> that behavior was the easiest to explain", or perhaps "people that use
> partitioning ought to be able to tune their database well". Or even
> "this design avoids an epic pgsql-hackers thread, because of course
> every hash table should get its own work_mem".
hmm yeah. It's unfortunate, but I'm not sure how I'd have implemented
it differently. The problem is made worse by the fact that we'll only
release the memory for the hash table during ExecEndHashJoin(). If the
planner had some ability to provide the executor with knowledge that
the node would never be rescanned, then the executor could release the
memory for the hash table after the join is complete. For now, we'll
need to live with the fact that an Append containing many children
doing hash joins will mean holding onto all that memory until the
executor is shutdown :-(
There's room to make improvements there, for sure, but not for PG13.
> > Of course, I
> > understand that that node type also uses a hash table, but why does
> > that give it the right to be involved in a change that we're making to
> > try and give users the ability to avoid possible regressions with Hash
> > Agg?
>
> It doesn't, exactly. The idea of hash_mem came from similar settings
> in another database system that you'll have heard of, that affect all
> nodes that use a hash table. I read about this long ago, and thought
> that it might make sense to do something similar as a way to improving
> work_mem
It sounds interesting, but it also sounds like a new feature
post-beta. Perhaps it's better we minimise the scope of the change to
be a minimal fix just for the behaviour we predict some users might
not like.
David