Home > mailing lists

Re: generic plans and "initial" pruning - Mailing list pgsql-hackers

From	Amit Langote
Subject	Re: generic plans and "initial" pruning
Date	July 13, 2023 12:58:38
Msg-id	CA+HiwqEDbf=+s73hAF0PigWORRx+YWwbCQtuuWtHzc3ko_DGpw@mail.gmail.com Whole thread Raw
In response to	Re: generic plans and "initial" pruning (Amit Langote <amitlangote09@gmail.com>)
Responses	Re: generic plans and "initial" pruning
List	pgsql-hackers

Tree view

On Thu, Jul 6, 2023 at 11:29 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Mon, Jul 3, 2023 at 10:27 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> > > On 8 Jun 2023, at 16:23, Amit Langote <amitlangote09@gmail.com> wrote:
> > > Here is a new version.
> >
> > The local planstate variable in the hunk below is shadowing the function
> > parameter planstate which cause a compiler warning:
>
> Thanks Daniel for the heads up.
>
> Attached new version fixes that and contains a few other notable
> changes.  Before going into the details of those changes, let me
> reiterate in broad strokes what the patch is trying to do.
>
> The idea is to move the locking of some tables referenced in a cached
> (generic) plan from plancache/GetCachedPlan() to the
> executor/ExecutorStart().  Specifically, the locking of inheritance
> child tables.  Why?  Because partition pruning with "initial pruning
> steps" contained in the Append/MergeAppend nodes may eliminate some
> child tables that need not have been locked to begin with, though the
> pruning can only occur during ExecutorStart().
>
> After applying this patch, GetCachedPlan() only locks the tables that
> are directly mentioned in the query to ensure that the
> analyzed-rewritten-but-unplanned query tree backing a given CachedPlan
> is still valid (cf RevalidateCachedQuery()), but not the tables in the
> CachedPlan that would have been added by the planner.  Tables in a
> CachePlan that would not be locked currently only include the
> inheritance child tables / partitions of the tables mentioned in the
> query.  This means that the plan trees in a given CachedPlan returned
> by GetCachedPlan() are only partially valid and are subject to
> invalidation because concurrent sessions can possibly modify the child
> tables referenced in them before ExecutorStart() gets around to
> locking them.  If the concurrent modifications do happen,
> ExecutorStart() is now equipped to detect them by way of noticing that
> the CachedPlan is invalidated and inform the caller to discard and
> recreate the CachedPlan.  This entails changing all the call sites of
> ExecutorStart() that pass it a plan tree from a CachedPlan to
> implement the replan-and-retry-execution loop.
>
> Given the above, ExecutorStart(), which has not needed so far to take
> any locks (except on indexes mentioned in IndexScans), now needs to
> lock child tables if executing a cached plan which contains them.  In
> the previous versions, the patch used a flag passed in
> EState.es_top_eflags to signal ExecGetRangeTableRelation() to lock the
> table.  The flag would be set in ExecInitAppend() and
> ExecInitMergeAppend() for the duration of the loop that initializes
> child subplans with the assumption that that's where the child tables
> would be opened.  But not all child subplans of Append/MergeAppend
> scan child tables (think UNION ALL queries), so this approach can
> result in redundant locking.  Worse, I needed to invent
> PlannedStmt.elidedAppendChildRelations to separately track child
> tables whose Scan nodes' parent Append/MergeAppend would be removed by
> setrefs.c in some cases.
>
> So, this new patch uses a flag in the RangeTblEntry itself to denote
> if the table is a child table instead of the above roundabout way.
> ExecGetRangeTableRelation() can simply look at the RTE to decide
> whether to take a lock or not.  I considered adding a new bool field,
> but noticed we already have inFromCl to track if a given RTE is for
> table/entity directly mentioned in the query or for something added
> behind-the-scenes into the range table as the field's description in
> parsenodes.h says.  RTEs for child tables are added behind-the-scenes
> by the planner and it makes perfect sense to me to mark their inFromCl
> as false.  I can't find anything that relies on the current behavior
> of inFromCl being set to the same value as the root inheritance parent
> (true).  Patch 0002 makes this change for child RTEs.
>
> A few other notes:
>
> * A parallel worker does ExecutorStart() without access to the
> CachedPlan that the leader may have gotten its plan tree from.  This
> means that parallel workers do not have the ability to detect plan
> tree invalidations.  I think that's fine, because if the leader would
> have been able to launch workers at all, it would also have gotten all
> the locks to protect the (portion of) the plan tree that the workers
> would be executing.  I had an off-list discussion about this with
> Robert and he mentioned his concern that each parallel worker would
> have its own view of which child subplans of a parallel Append are
> "valid" that depends on the result of its own evaluation of initial
> pruning.   So, there may be race conditions whereby a worker may try
> to execute plan nodes that are no longer valid, for example, if the
> partition a worker considers valid is not viewed as such by the leader
> and thus not locked.  I shared my thoughts as to why that sounds
> unlikely at [1], though maybe I'm a bit too optimistic?
>
> * For multi-query portals, you can't now do ExecutorStart()
> immediately followed by ExecutorRun() for each query in the portal,
> because ExecutorStart() may now fail to start a plan if it gets
> invalidated.   So PortalStart() now does ExecutorStart()s for all
> queries and remembers the QueryDescs for PortalRun() then to do
> ExecutorRun()s using.  A consequence of this is that
> CommandCounterIncrement() now must be done between the
> ExecutorStart()s of the individual plans in PortalStart() and not
> between the ExecutorRun()s in PortalRunMulti().  make check-world
> passes with this new arrangement, though I'm not entirely confident
> that there are no problems lurking.

In an absolutely brown-paper-bag moment, I realized that I had not
updated src/backend/executor/README to reflect the changes to the
executor's control flow that this patch makes.   That is, after
scrapping the old design back in January whose details *were*
reflected in the patches before that redesign.

Anyway, the attached fixes that.

Tom, do you think you have bandwidth in the near future to give this
another look?  I think I've addressed the comments that you had given
back in April, though as mentioned in the previous message, there may
still be some funny-looking aspects still remaining.  In any case, I
have no intention of pressing ahead with the patch without another
committer having had a chance to sign off on it.


--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

From: Matthias van de Meent
Date: 13 July 2023, 12:02:36
Subject: Potential memory leak in contrib/intarray's g_intbig_compress

From: Amit Kapila
Date: 13 July 2023, 13:06:49
Subject: Re: doc: clarify the limitation for logical replication when REPILICA IDENTITY is FULL

Re: generic plans and "initial" pruning - Mailing list pgsql-hackers

Attachment

Previous

Next