Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators - Mailing list pgsql-hackers
From | James Hunter |
---|---|
Subject | Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators |
Date | |
Msg-id | CAJVSvF4KD-jBZxXFsWEZH9XL4bEUXXDjQ==VvzCKAxrNBevhHg@mail.gmail.com Whole thread Raw |
In response to | Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators (Jeff Davis <pgsql@j-davis.com>) |
List | pgsql-hackers |
On Mon, Feb 24, 2025 at 6:54 PM Jeff Davis <pgsql@j-davis.com> wrote: > > On Mon, 2025-02-24 at 12:46 -0800, James Hunter wrote: > > Attached please find the patch set I mentioned, above, in [1]. It > > consists of 4 patches that serve as the building blocks for and a > > prototype of the "query_work_mem" GUC I proposed: > > I didn't look at the details yet. But from: > > https://www.postgresql.org/message-id/CAJVSvF7x_DLj7-JrXvMB4_j%2BjzuvjG_7iXNjx5KmLBTXHPNdGA%40mail.gmail.com > > I expected something much smaller in scope, where we just add a > "plan_work_mem" field to the Plan struct, copy the work_mem global GUC > to that field when we construct a Plan node, and then reference the > plan_work_mem instead of the GUC directly. What you describe is basically Patch 3: it copies the work_mem and/or work_mem * hash_mem_multiplier global GUCs onto a "workmem_limit" field on the Plan struct, and then references that field instead of the GUC. Patch 3 basically consists of a new file that copies these GUCs to the new field, along with small changes to all relevant execution nodes to reach that field instead of the global GUC. Excluding test changes, it adds 281 lines to new file "execWorkmem.c", and modifies 263 other lines across 29 other files; most files have < 5 lines modified. > Can you give a bit more context about why we need so many changes, > including test changes? So Patch 3 is what you describe, above. By itself, this does very little, so Patch 4 serves as a PoC or demo showing how a cloud service provider might use Patch 3's framework to provide better memory management. I don't think Patch 4 needs to go into core PostgreSQL, but I find it helpful in demonstrating how the "workmem" framework could be used. It adds a hook to allow an extension to override the "copy" function added in Patch 3. The hook stuff itself is pretty small. And then, to show how a useful extension could be written using that hook, Patch 4 includes a basic extension that implements the hook. However, the ability to override a field via hook is useful only if that hook has enough information to make an intelligent decision! So, we need Patch 1, which just copies the existing "workmem" *estimates*, from existing planner logic, onto a second Path / Plan field, this one just called "workmem". It could be renamed "workmem_estimate," or anything else -- the important thing is that this field is on the Plan, so the hook can look at it when deciding how much working memory to assign to that Plan node. The "workmem" field added by Patch 1 is analogous to the "cost" field the planner already exposes. Patch 1 adds ~200 lines to explain.c, to display workmem stats; and modifies around 600 lines in costsize.c and createplan.c, to copy the existing "workmem" estimate onto the Path and Plan fields. I could omit populating the Path field, and copy only to the Plan field, but it seemed like a good time to fill in the Path as well, in case future logic wants to make use of it. (None of the other 3 patches use the Path's "workmem" field.) So, Patch 1 is around 900 lines of code, total, but none of the changes are very serious, since they just copy existing estimates onto a field. So that's Patches 1, 3, and 4; and Patch 2 is just some local infrastructure so that Patches 3 and 4 can find all the query's SubPlans, so they can assign working memory to them. In summary: * Patch 1 copies the planner's "workmem" *estimate* to a new field on the Plan; * Patch 2 keeps track of SubPlans, so Patches 3 and 4 can assign workmem to them; * Patch 3 copies the "work_mem" *limit* GUC to a new field on the Plan; and * Patch 4 is a demo / PoC / example of how an extension can override the Plan's "workmem_limit" field, so it doesn't need to go into core PostgreSQL. We don't need test changes, but the code is pretty easy to test, so, for completeness, I added a new unit test, which is basically just a collection of queries copied from existing unit tests, displayed using the new "EXPLAIN (work_mem on)" option. And the "test changes" in Patch 3 just add "work mem limit" to the output -- it's the same test as Patch 1, just showing more information thanks to Patch 3. Overall, the changes made to PostgreSQL are minimal, but they are spread across multiple files... because (a) costing is spread across multiple files (costsize.c, pathnode.c, and createplan.c -- and also subselect.c, etc.); and (b) query execution is spread across multiple files (e.g., nodeSort.c, nodeHash.c, etc.). Every time we cost a Path or Plan that uses workmem, Patch 1 needs to copy the Path/Plan's workmem estimate onto the new field. And every time an exec node uses workmem, Patch 3 needs to read the workmem limit off the new field. So looking at the number of files touched overstates the size of the patches. Thanks, James
pgsql-hackers by date: