On Fri, Dec 27, 2024 at 5:48 PM Tomas Vondra <tomas@vondra.me> wrote:
> Whenever I've been thinking about this in the past, it wasn't clear to
> me how would we know when to start adjusting work_mem, because we don't
> know which nodes will actually use work_mem concurrently.

You certainly know the PostgreSQL source code better than I do, but
just looking over nodeHash.c and nodeHashjoin.c, for example, it looks
like we don't call ExecHashTableDestroy() until ExecEndHashJoin() or
ExecReScanHashJoin(). I assume this means that we don't destroy the
hash table (and therefore don't free its work_mem) until we either
(a) end the executor for the entire query plan, or (b) rescan the hash
join.
Does PostgreSQL currently rescan Hash Joins when they are "no longer
needed," to free work_mem early? If so, then I would try to reuse this
existing logic to decide which nodes need work_mem concurrently.
If not, then all nodes that use work_mem actually use it
"concurrently," because we don't free that work_mem until we call
ExecutorEnd().
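
To make the "held until the end" case concrete, here is a toy timeline
model. It is plain Python, not PostgreSQL code, and the node names,
sizes, and step numbers are made up for illustration. It compares peak
memory when every node keeps its hash table until the query ends with
a hypothetical policy that frees each table once the node is no longer
needed.

```python
# Toy model only: each node is (name, mem_kb, build_at, done_at).
# build_at = step at which the node builds its hash table.
# done_at  = last step at which the node is logically still needed.
# All figures below are invented for illustration.

def peak_memory(nodes, free_early):
    """Return the peak total memory (kB) across the toy timeline."""
    last_step = max(done_at for _, _, _, done_at in nodes)
    peak = 0
    for step in range(last_step + 1):
        in_use = 0
        for _, mem_kb, build_at, done_at in nodes:
            if step < build_at:
                continue  # table not built yet
            if free_early and step > done_at:
                continue  # hypothetical early free
            # Without early freeing, the table is held until the end.
            in_use += mem_kb
        peak = max(peak, in_use)
    return peak


nodes = [
    ("hash_a", 4096, 0, 1),
    ("hash_b", 4096, 2, 3),
    ("hash_c", 4096, 4, 5),
]

held_until_end = peak_memory(nodes, free_early=False)
freed_early = peak_memory(nodes, free_early=True)
print(held_until_end, freed_early)
```

In this made-up plan, holding everything until the end means the peak
is the sum of all three tables, even though no two nodes are ever
"needed" at the same step.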
Or, is the problem that one query might generate and execute a second
query, recursively? (And the second might generate and execute a third
query, etc.?) For example, the first query might call a function that
starts a new portal and executes a second query, and so on. Is this
what you're thinking about? If so, I would model this pattern as each
level of recursion taking up, logically, a "new connection."
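
As a rough sketch of what I mean by that accounting (again a Python
toy, with a hypothetical Backend class and method names that do not
correspond to anything in PostgreSQL): each real connection tracks how
deeply its queries nest, and each nesting level is counted as one
logical connection.

```python
class Backend:
    """Toy stand-in for one real connection; not a PostgreSQL structure."""

    def __init__(self):
        self.depth = 0      # current nesting depth of running queries
        self.max_depth = 0  # deepest nesting observed so far

    def run_query(self, nested_queries=0):
        # Entering a query takes up one logical connection.
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)
        if nested_queries > 0:
            # Stand-in for a function that opens a new portal and
            # executes another query inside the current one.
            self.run_query(nested_queries - 1)
        self.depth -= 1


def logical_connections(backends):
    """Count each level of recursion as its own logical connection."""
    return sum(b.max_depth for b in backends)


plain = Backend()
plain.run_query()                    # one query, no nesting

nested = Backend()
nested.run_query(nested_queries=2)   # query -> query -> query

total = logical_connections([plain, nested])
print(total)
```

Here two real connections are accounted for as four logical ones,
because the second one nests three levels deep.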
Thanks,
James