Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators
Date
Msg-id 136be49cf948899b2e51a416235d51506fa57c52.camel@j-davis.com
Whole thread Raw
In response to Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators  (James Hunter <james.hunter.pg@gmail.com>)
List pgsql-hackers
On Wed, 2025-02-26 at 13:27 -0800, James Hunter wrote:
> Attaching a new refactoring, which splits the code changes into
> patches by functionality. This refactoring yields 5 patches, each of
> which is relatively localized. I hope that the result will be more
> focused and more feasible to review.

Thank you, yes, this is helpful.

Taking a step back:

My idea was that we'd attach work_mem to each Path node and Plan node
at create time. For example, in create_agg_path() it could do:

  pathnode->path.node_work_mem = work_mem;

And then add to copy_generic_path_info():

  dest->node_work_mem = src->node_work_mem;

(and similarly for other nodes; at least those that care about
work_mem)

Then, at execution time, use node->ps.ss.plan.node_work_mem instead of
work_mem.

Similarly, we could track the node_mem_wanted, which would be the
estimated amount of memory the node would use if unlimited memory were
available. I believe that's already known (or a simple calculation) at
costing time, and we can propagate it from the Path to the Plan the
same way.

(A variant of this approach could carry the values into the PlanState
nodes as well, and the executor would that value instead.)

Extensions like yours could have a GUC like ext.query_work_mem and use
planner_hook to modify plan after standard_planner is done, walking the
tree and adjusting each Plan node's node_work_mem to obey
ext.query_work_mem. Another extension might hook in at path generation
time with set_rel_pathlist_hook or set_join_pathlist_hook to create
paths with lower node_work_mem settings that total up to
ext.query_work_mem. Either way, the node_work_mem settings would end up
being enforced by the executor by tracking memory usage and spilling
when it exceeds node->ps.ss.plan.node_work_mem.


Your patch 0001 looks like it makes it easier to find all the relevant
Plan nodes, so that seems like a reasonable step (didn't look at the
details).

But your patch 0002 and 0003 are entirely different from what I
expected. I just don't understand why we need anything as complex or
specific as ExecAssignWorkMem(). If we just add it at the time the Path
is created, and then propagate it to the plan with
copy_generic_path_info(), that would be a lot less code. What am I
missing?

Regards,
    Jeff Davis




pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: suspicious lockup on widowbird in AdvanceXLInsertBuffer (could it be due to 6a2275b8953?)
Next
From: Michael Paquier
Date:
Subject: Re: Spinlock can be released twice in procsignal.c