On Sat, 28 Dec 2024 at 08:14, James Hunter <james.hunter.pg@gmail.com> wrote:
> 2. We use this backend_work_mem to "adjust" work_mem values used by
> the executor. (I don't care about the optimizer right now -- optimizer
> just does its best to predict what will happen at runtime.)
While I do want to see improvements in this area, I think "don't care
about the optimizer" is going to cause performance issues. The
problem is that the optimizer takes into account what work_mem is set
to when calculating the costs of work_mem-consuming node types. See
costsize.c for usages of "work_mem". If you reduce the amount of
memory a given node is allowed to consume after the costs have
already been calculated, then we may end up in a situation where some
other plan would have been a much better choice.
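To illustrate (the table and data below are just made up for the
example, and the exact plans will vary by version and settings), the
planner can flip between plan shapes purely because of work_mem:

  create table t (a int, b text);
  insert into t select g, md5(g::text) from generate_series(1, 1000000) g;
  analyze t;
  set max_parallel_workers_per_gather = 0;  -- keep the plans simple

  set work_mem = '256MB';
  explain (costs off) select b, count(*) from t group by b;
  -- the hash table is costed as fitting in memory, so this will
  -- likely give you a HashAggregate

  set work_mem = '64kB';
  explain (costs off) select b, count(*) from t group by b;
  -- now the planner expects the hash table to spill, and it may well
  -- prefer Sort + GroupAggregate instead

The second decision is only made because the planner knew about the
64kB limit up front. If the plan is built assuming 256MB and the
executor is then only given 64kB, that cost comparison never gets
redone.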
There's also the problem of what to do when you chop work_mem down so
far that the remaining size is just a pitiful chunk. Currently,
work_mem can't go below 64 kilobytes. You might think it's very
unlikely that it would be chopped down that far, but with
partition-wise join and partition-wise aggregate we could end up
using a separate work_mem per partition, and if you have thousands of
partitions then you might end up reducing work_mem by quite a large
amount.
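To put a (purely made up) number on it: splitting a 256MB per-backend
budget evenly over a partition-wise hash join with 4096 partitions
leaves just 64kB per hash table, which is already at the current
floor.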
I think the best solution to this is the memory grant stuff I talked
about in [1]. That does require figuring out which nodes will consume
their work_mem concurrently, so the infrastructure you talked about
for doing that would be a good step towards it, but it's probably not
the most difficult part of that idea.
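For a rough idea of what "concurrently" means here (big1 and big2 are
just hypothetical tables), the planner might produce something like:

  explain (costs off)
  select * from big1 join big2 using (id) order by big1.b;

   Sort
     Sort Key: big1.b
     ->  Hash Join
           Hash Cond: (big1.id = big2.id)
           ->  Seq Scan on big1
           ->  Hash
                 ->  Seq Scan on big2

The Hash node's hash table stays in memory for the whole join while
the Sort above it collects its input, so both of those are consuming
their work_mem at the same time, and any per-backend budget would
have to be carved up between them.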
I definitely encourage work in this area, but I think what you're
proposing might just be swapping one problem for another.
David
[1] https://www.postgresql.org/message-id/CAApHDvrzacGEA1ZRea2aio_LAi2fQcgoK74bZGfBddg4ymW-Ow@mail.gmail.com