Re: why partition pruning doesn't work? - Mailing list pgsql-hackers
From | Amit Langote
---|---
Subject | Re: why partition pruning doesn't work?
Date |
Msg-id | ab10ce44-18b0-244a-c9c8-67d7020e2ba9@lab.ntt.co.jp
In response to | Re: why partition pruning doesn't work? (Tom Lane <tgl@sss.pgh.pa.us>)
Responses | Re: why partition pruning doesn't work?
List | pgsql-hackers
On 2018/07/16 2:02, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
>> On 2018/06/19 2:05, Tom Lane wrote:
>>> Or maybe what we should do is drop ExecLockNonLeafAppendTables/
>>> ExecOpenAppendPartitionedTables entirely and teach InitPlan to do it.
>
>> Hmm, for InitPlan to do what ExecOpenAppendPartitionedTables does, we'd
>> have to have all the partitioned tables (contained in partitioned_rels
>> fields of all Append/MergeAppend/ModifyTable nodes of a plan) also listed
>> in a global list like rootResultRelations and nonleafResultRelations of
>> PlannedStmt.
>
>> Attached updated patch.
>
> I've been looking at this patch, and while it's not unreasonable on its
> own terms, I'm growing increasingly distressed at the ad-hoc and rather
> duplicative nature of the data structures that have gotten stuck into
> plan trees thanks to partitioning (the rootResultRelations and
> nonleafResultRelations lists being prime examples).
>
> It struck me this morning that a whole lot of the complication here is
> solely due to needing to identify the right type of relation lock to take
> during executor startup, and that THAT WORK IS TOTALLY USELESS.

I agree.  IIUC, that's also the reason why PlannedStmt had to grow a
resultRelations field in the first place.

> In every case, we must already hold a suitable lock before we ever get
> to the executor; either one acquired during the parse/plan pipeline, or
> one re-acquired by AcquireExecutorLocks in the case of a cached plan.
> Otherwise it's entirely possible that the plan has been invalidated by
> concurrent DDL --- and it's not the executor's job to detect that and
> re-plan; that *must* have been done upstream.
>
> Moreover, it's important from a deadlock-avoidance standpoint that these
> locks get acquired in the same order as they were acquired during the
> initial parse/plan pipeline.  I think it's reasonable to assume they were
> acquired in RTE order in that stage, so AcquireExecutorLocks is OK.  But,
> if the current logic in the executor gets them in that order, it's both
> non-obvious that it does so and horribly fragile if it does, seeing that
> the responsibility for this is split across InitPlan,
> ExecOpenScanRelation, and ExecLockNonLeafAppendTables.
>
> So I'm thinking that what we really ought to do here is simplify executor
> startup to just open all rels with NoLock, and get rid of any supporting
> data structures that turn out to have no other use.  (David Rowley's
> nearby patch to create a properly hierarchical executor data structure
> for partitioning info is likely to tie into this too, by removing some
> other vestigial uses of those lists.)

I agree it would be nice to be able to get rid of those lists.

> I think that this idea has been discussed in the past, and we felt at
> the time that having the executor take its own locks was a good safety
> measure, and a relatively cheap one since the lock manager is pretty
> good at short-circuiting duplicative lock requests.  But those are
> certainly not free.  Moreover, I'm not sure that this is really a
> safety measure at all: if the executor were really taking any lock
> not already held, it'd be masking a DDL hazard.
>
> To investigate this further, I made the attached not-meant-for-commit
> hack to verify whether InitPlan and related executor startup functions
> were actually taking any not-previously-held locks.  I could only find
> one such case: the parser always opens tables selected FOR UPDATE with
> RowShareLock, but if we end up implementing the resulting row mark
> with ROW_MARK_COPY, the executor is acquiring just AccessShareLock
> (because ExecOpenScanRelation thinks it needs to acquire some lock).
> The patch as presented hacks ExecOpenScanRelation to avoid that, and
> it passes check-world.
>
> What we'd be better off doing, if we go this route, is to install an
> assertion-build-only test that verifies during relation_open(NoLock)
> that some kind of lock is already held on the rel.  That would protect
> not only the executor, but a boatload of existing places that open
> rels with NoLock on the currently-unverified assumption that a lock is
> already held.

+1
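To make that concrete, I imagine the check looking something like the
below (a rough, untested sketch; CheckRelationOidLockedByMe is a made-up
name for a helper that would scan the local lock table, with the last
argument meaning "or any stronger mode"):

```c
/* In relation_open(), where the lock is normally acquired ... */
if (lockmode != NoLock)
    LockRelationOid(relationId, lockmode);
else
{
    /*
     * Caller claims to already hold a lock on the rel; in assert-enabled
     * builds, verify that against the local lock table.  Bootstrap mode
     * takes no locks at all, so skip the check there.
     */
    Assert(IsBootstrapProcessingMode() ||
           CheckRelationOidLockedByMe(relationId, AccessShareLock, true));
}
```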
> I'm also rather strongly tempted to add a field to RangeTblEntry
> that records the appropriate lock strength to take, so that we don't
> have to rely on keeping AcquireExecutorLocks' logic to decide on the
> lock type in sync with whatever the parse/plan pipeline does.  (One
> could then imagine adding assertions in the executor that this field
> shows a lock strength of at least X, in place of actually opening
> the rel with X.)
>
> BTW, there'd be a lot to be said for having InitPlan just open all
> the rels and build an array of Relation pointers that parallels the
> RTE list, rather than doing heap_opens in random places elsewhere.

+1 to this.  Actually I had written such a patch in place of the one I
posted on this thread, but wasn't confident enough that I had found and
fixed all the places in the executor that would use the Relation pointer
from there instead of fetching it themselves.

Thanks,
Amit
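P.S. For concreteness, here is roughly the shape I imagine for the
InitPlan part (an untested sketch; es_relations is a made-up EState
field):

```c
/*
 * Hypothetical InitPlan fragment: open every plain relation in the
 * range table once, with NoLock (a suitable lock must already be
 * held), and remember the Relation pointers in an array indexed by
 * RT index - 1, so the rest of the executor can just look them up.
 */
ListCell   *lc;
int         rti = 0;

estate->es_relations = (Relation *)
    palloc0(list_length(rangeTable) * sizeof(Relation));

foreach(lc, rangeTable)
{
    RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);

    if (rte->rtekind == RTE_RELATION)
        estate->es_relations[rti] = heap_open(rte->relid, NoLock);
    rti++;
}
```

The hard part, as I said, is making sure that every place in the
executor that currently does its own heap_open() is changed to look the
pointer up here instead.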