Hello!
Since I have spent a lot of my time in recent weeks chasing short-lived processes eating away big chunks of memory, I am wondering about a decent way to go about this.
The problem I am facing essentially comes down to the fact that work_mem, while it is enforced per hash and sort node, isn't enforced globally per backend or per query.
One common case that causes this problem more frequently than a few years ago is partitionwise join (enable_partitionwise_join). If a lot of partitions are hash joined, we get a lot of hash nodes, each one potentially consuming up to work_mem.
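To give a rough, made-up example: with 64 partitions hash joined pairwise and work_mem = 64MB, a single partitionwise join can produce 64 hash nodes, so the worst case is on the order of 64 * 64MB = 4GB in one backend, before even considering parallel workers.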
While avoiding OOM seems like a big deal to me, my search didn't turn up previous discussions about this on -hackers. There is a good chance I am missing something here, so I'd appreciate any pointers.
The most reasonable solution seems to me to be a data structure per worker that 1. tracks the amount of memory used by certain nodes and 2. offers a callback to let a node spill its contents (almost) completely to disk. I am thinking about hash and sort nodes for now, since they affect memory usage the most.
This would allow a node to trigger spilling of other nodes' contents to disk to avoid exceeding the overall memory budget.
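To make this a bit more concrete, here is a rough sketch of the kind of structure I have in mind. All names are made up, none of this is an existing API, and the details (where the limit comes from, how nodes register) are open questions:

/* Sketch only; assumes the usual postgres.h / c.h definitions (Size etc.). */
typedef struct SpillableNode
{
    void       *node;                           /* the executor node, e.g. a HashState */
    Size        (*memory_used) (void *node);    /* current memory consumption */
    void        (*spill) (void *node);          /* spill (almost) all contents to disk */
} SpillableNode;

typedef struct WorkerMemoryAccounting
{
    Size        total_tracked;  /* sum over all registered nodes */
    Size        limit;          /* hypothetical backend-wide budget */
    int         nnodes;
    SpillableNode nodes[FLEXIBLE_ARRAY_MEMBER];
} WorkerMemoryAccounting;

/*
 * Called by a node before it grows its hash table or sort buffer.  If the
 * backend-wide budget would be exceeded, ask other registered nodes to
 * spill until the request fits (a real version would skip the caller and
 * pick victims more cleverly).
 */
static void
reserve_node_memory(WorkerMemoryAccounting *acct, Size request)
{
    for (int i = 0; i < acct->nnodes; i++)
    {
        Size        before;

        if (acct->total_tracked + request <= acct->limit)
            break;

        before = acct->nodes[i].memory_used(acct->nodes[i].node);
        acct->nodes[i].spill(acct->nodes[i].node);
        acct->total_tracked -= before - acct->nodes[i].memory_used(acct->nodes[i].node);
    }

    /* account for the new allocation */
    acct->total_tracked += request;
}

The idea in this sketch is that the handling is cooperative: nodes only cause spilling at the points where they ask for more memory, which keeps the mechanism simple.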
I'd love to hear your thoughts and suggestions!
Regards
Arne