Re: generic plans and "initial" pruning - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: generic plans and "initial" pruning |
Date | |
Msg-id | CA+HiwqEDbf=+s73hAF0PigWORRx+YWwbCQtuuWtHzc3ko_DGpw@mail.gmail.com Whole thread Raw |
In response to | Re: generic plans and "initial" pruning (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: generic plans and "initial" pruning
|
List | pgsql-hackers |
On Thu, Jul 6, 2023 at 11:29 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Mon, Jul 3, 2023 at 10:27 PM Daniel Gustafsson <daniel@yesql.se> wrote: > > > On 8 Jun 2023, at 16:23, Amit Langote <amitlangote09@gmail.com> wrote: > > > Here is a new version. > > > > The local planstate variable in the hunk below is shadowing the function > > parameter planstate which cause a compiler warning: > > Thanks Daniel for the heads up. > > Attached new version fixes that and contains a few other notable > changes. Before going into the details of those changes, let me > reiterate in broad strokes what the patch is trying to do. > > The idea is to move the locking of some tables referenced in a cached > (generic) plan from plancache/GetCachedPlan() to the > executor/ExecutorStart(). Specifically, the locking of inheritance > child tables. Why? Because partition pruning with "initial pruning > steps" contained in the Append/MergeAppend nodes may eliminate some > child tables that need not have been locked to begin with, though the > pruning can only occur during ExecutorStart(). > > After applying this patch, GetCachedPlan() only locks the tables that > are directly mentioned in the query to ensure that the > analyzed-rewritten-but-unplanned query tree backing a given CachedPlan > is still valid (cf RevalidateCachedQuery()), but not the tables in the > CachedPlan that would have been added by the planner. Tables in a > CachePlan that would not be locked currently only include the > inheritance child tables / partitions of the tables mentioned in the > query. This means that the plan trees in a given CachedPlan returned > by GetCachedPlan() are only partially valid and are subject to > invalidation because concurrent sessions can possibly modify the child > tables referenced in them before ExecutorStart() gets around to > locking them. If the concurrent modifications do happen, > ExecutorStart() is now equipped to detect them by way of noticing that > the CachedPlan is invalidated and inform the caller to discard and > recreate the CachedPlan. This entails changing all the call sites of > ExecutorStart() that pass it a plan tree from a CachedPlan to > implement the replan-and-retry-execution loop. > > Given the above, ExecutorStart(), which has not needed so far to take > any locks (except on indexes mentioned in IndexScans), now needs to > lock child tables if executing a cached plan which contains them. In > the previous versions, the patch used a flag passed in > EState.es_top_eflags to signal ExecGetRangeTableRelation() to lock the > table. The flag would be set in ExecInitAppend() and > ExecInitMergeAppend() for the duration of the loop that initializes > child subplans with the assumption that that's where the child tables > would be opened. But not all child subplans of Append/MergeAppend > scan child tables (think UNION ALL queries), so this approach can > result in redundant locking. Worse, I needed to invent > PlannedStmt.elidedAppendChildRelations to separately track child > tables whose Scan nodes' parent Append/MergeAppend would be removed by > setrefs.c in some cases. > > So, this new patch uses a flag in the RangeTblEntry itself to denote > if the table is a child table instead of the above roundabout way. > ExecGetRangeTableRelation() can simply look at the RTE to decide > whether to take a lock or not. I considered adding a new bool field, > but noticed we already have inFromCl to track if a given RTE is for > table/entity directly mentioned in the query or for something added > behind-the-scenes into the range table as the field's description in > parsenodes.h says. RTEs for child tables are added behind-the-scenes > by the planner and it makes perfect sense to me to mark their inFromCl > as false. I can't find anything that relies on the current behavior > of inFromCl being set to the same value as the root inheritance parent > (true). Patch 0002 makes this change for child RTEs. > > A few other notes: > > * A parallel worker does ExecutorStart() without access to the > CachedPlan that the leader may have gotten its plan tree from. This > means that parallel workers do not have the ability to detect plan > tree invalidations. I think that's fine, because if the leader would > have been able to launch workers at all, it would also have gotten all > the locks to protect the (portion of) the plan tree that the workers > would be executing. I had an off-list discussion about this with > Robert and he mentioned his concern that each parallel worker would > have its own view of which child subplans of a parallel Append are > "valid" that depends on the result of its own evaluation of initial > pruning. So, there may be race conditions whereby a worker may try > to execute plan nodes that are no longer valid, for example, if the > partition a worker considers valid is not viewed as such by the leader > and thus not locked. I shared my thoughts as to why that sounds > unlikely at [1], though maybe I'm a bit too optimistic? > > * For multi-query portals, you can't now do ExecutorStart() > immediately followed by ExecutorRun() for each query in the portal, > because ExecutorStart() may now fail to start a plan if it gets > invalidated. So PortalStart() now does ExecutorStart()s for all > queries and remembers the QueryDescs for PortalRun() then to do > ExecutorRun()s using. A consequence of this is that > CommandCounterIncrement() now must be done between the > ExecutorStart()s of the individual plans in PortalStart() and not > between the ExecutorRun()s in PortalRunMulti(). make check-world > passes with this new arrangement, though I'm not entirely confident > that there are no problems lurking. In an absolutely brown-paper-bag moment, I realized that I had not updated src/backend/executor/README to reflect the changes to the executor's control flow that this patch makes. That is, after scrapping the old design back in January whose details *were* reflected in the patches before that redesign. Anyway, the attached fixes that. Tom, do you think you have bandwidth in the near future to give this another look? I think I've addressed the comments that you had given back in April, though as mentioned in the previous message, there may still be some funny-looking aspects still remaining. In any case, I have no intention of pressing ahead with the patch without another committer having had a chance to sign off on it. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: