Re: generic plans and "initial" pruning - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: generic plans and "initial" pruning |
Date | |
Msg-id | CA+HiwqGe0W2C+SrKcxrk-r4JjO0sCfL581p1M2bzr_LSrzGn+g@mail.gmail.com Whole thread Raw |
In response to | Re: generic plans and "initial" pruning (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: generic plans and "initial" pruning
|
List | pgsql-hackers |
On Thu, Feb 2, 2023 at 11:49 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Fri, Jan 27, 2023 at 4:01 PM Amit Langote <amitlangote09@gmail.com> wrote: > > I didn't actually go with calling the plancache on every lock taken on > > a relation, that is, in ExecGetRangeTableRelation(). One thing about > > doing it that way that I didn't quite like (or didn't see a clean > > enough way to code) is the need to complicate the ExecInitNode() > > traversal for handling the abrupt suspension of the ongoing setup of > > the PlanState tree. > > OK, I gave this one more try and attached is what I came up with. > > This adds a ExecPlanStillValid(), which is called right after anything > that may in turn call ExecGetRangeTableRelation() which has been > taught to lock a relation if EXEC_FLAG_GET_LOCKS has been passed in > EState.es_top_eflags. That includes all ExecInitNode() calls, and a > few other functions that call ExecGetRangeTableRelation() directly, > such as ExecOpenScanRelation(). If ExecPlanStillValid() returns > false, that is, if EState.es_cachedplan is found to have been > invalidated after a lock being taken by ExecGetRangeTableRelation(), > whatever funcion called it must return immediately and so must its > caller and so on. ExecEndPlan() seems to be able to clean up after a > partially finished attempt of initializing a PlanState tree in this > way. Maybe my preliminary testing didn't catch cases where pointers > to resources that are normally put into the nodes of a PlanState tree > are now left dangling, because a partially built PlanState tree is not > accessible to ExecEndPlan; QueryDesc.planstate would remain NULL in > such cases. Maybe there's only es_tupleTable and es_relations that > needs to be explicitly released and the rest is taken care of by > resetting the ExecutorState context. In the attached updated patch, I've made the functions that check ExecPlanStillValid() to return NULL (if returning something) instead of returning partially initialized structs. Those partially initialized structs were not being subsequently looked at anyway. > On testing, I'm afraid we're going to need something like > src/test/modules/delay_execution to test that concurrent changes to > relation(s) in PlannedStmt.relationOids that occur somewhere between > RevalidateCachedQuery() and InitPlan() result in the latter to be > aborted and that it is handled correctly. It seems like it is only > the locking of partitions (that are not present in an unplanned Query > and thus not protected by AcquirePlannerLocks()) that can trigger > replanning of a CachedPlan, so any tests we write should involve > partitions. Should this try to test as many plan shapes as possible > though given the uncertainty around ExecEndPlan() robustness or should > manual auditing suffice to be sure that nothing's broken? I've added a test case under src/modules/delay_execution by adding a new ExecutorStart_hook that works similarly as delay_execution_planner(). The test works by allowing a concurrent session to drop an object being referenced in a cached plan being initialized while the ExecutorStart_hook waits to get an advisory lock. The concurrent drop of the referenced object is detected during ExecInitNode() and thus triggers replanning of the cached plan. I also fixed a bug in the ExplainExecuteQuery() while testing and some comments. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: