Re: why partition pruning doesn't work? - Mailing list pgsql-hackers

From: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Subject: Re: why partition pruning doesn't work?
Msg-id: ab10ce44-18b0-244a-c9c8-67d7020e2ba9@lab.ntt.co.jp
In response to: Re: why partition pruning doesn't work? (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

On 2018/07/16 2:02, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
>> On 2018/06/19 2:05, Tom Lane wrote:
>>> Or maybe what we should do is drop ExecLockNonLeafAppendTables/
>>> ExecOpenAppendPartitionedTables entirely and teach InitPlan to do it.
> 
>> Hmm, for InitPlan to do what ExecOpenAppendPartitionedTables does, we'd
>> have to have all the partitioned tables (contained in partitioned_rels
>> fields of all Append/MergeAppend/ModifyTable nodes of a plan) also listed
>> in a global list like rootResultRelations and nonleafResultRelations of
>> PlannedStmt.
> 
>> Attached updated patch.
> 
> I've been looking at this patch, and while it's not unreasonable on its
> own terms, I'm growing increasingly distressed at the ad-hoc and rather
> duplicative nature of the data structures that have gotten stuck into
> plan trees thanks to partitioning (the rootResultRelations and
> nonleafResultRelations lists being prime examples).
> 
> It struck me this morning that a whole lot of the complication here is
> solely due to needing to identify the right type of relation lock to take
> during executor startup, and that THAT WORK IS TOTALLY USELESS.

I agree.  IIUC, that's also the reason why PlannedStmt had to grow a
resultRelations field in the first place.

> In every
> case, we must already hold a suitable lock before we ever get to the
> executor; either one acquired during the parse/plan pipeline, or one
> re-acquired by AcquireExecutorLocks in the case of a cached plan.
> Otherwise it's entirely possible that the plan has been invalidated by
> concurrent DDL --- and it's not the executor's job to detect that and
> re-plan; that *must* have been done upstream.
> 
> Moreover, it's important from a deadlock-avoidance standpoint that these
> locks get acquired in the same order as they were acquired during the
> initial parse/plan pipeline.  I think it's reasonable to assume they were
> acquired in RTE order in that stage, so AcquireExecutorLocks is OK.  But,
> if the current logic in the executor gets them in that order, it's both
> non-obvious that it does so and horribly fragile if it does, seeing that
> the responsibility for this is split across InitPlan,
> ExecOpenScanRelation, and ExecLockNonLeafAppendTables.
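
For reference, the loop in AcquireExecutorLocks is what makes the
RTE-order assumption workable: it simply walks the plan's flat range
table front to back.  A simplified sketch of that logic (abridged; the
real code in plancache.c also handles the release case and consults
nonleafResultRelations, which is one of the lists in question here):

#include "postgres.h"
#include "nodes/parsenodes.h"
#include "nodes/plannodes.h"
#include "storage/lmgr.h"

/*
 * Sketch: re-acquire locks on all plain relations in a plan, in
 * range-table (RTE) order, matching the order in which the parse/plan
 * pipeline originally took them.
 */
static void
acquire_executor_locks_sketch(PlannedStmt *plannedstmt)
{
    int         rt_index = 0;
    ListCell   *lc;

    foreach(lc, plannedstmt->rtable)
    {
        RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
        LOCKMODE    lockmode = AccessShareLock;
        ListCell   *lc2;

        rt_index++;
        if (rte->rtekind != RTE_RELATION)
            continue;

        /* Result rels need RowExclusiveLock ... */
        if (list_member_int(plannedstmt->resultRelations, rt_index))
            lockmode = RowExclusiveLock;
        else
        {
            /* ... and FOR UPDATE/SHARE rels need RowShareLock. */
            foreach(lc2, plannedstmt->rowMarks)
            {
                PlanRowMark *rc = lfirst_node(PlanRowMark, lc2);

                if (rc->rti == (Index) rt_index &&
                    RowMarkRequiresRowShareLock(rc->markType))
                {
                    lockmode = RowShareLock;
                    break;
                }
            }
        }

        LockRelationOid(rte->relid, lockmode);
    }
}
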
> 
> So I'm thinking that what we really ought to do here is simplify executor
> startup to just open all rels with NoLock, and get rid of any supporting
> data structures that turn out to have no other use.  (David Rowley's
> nearby patch to create a properly hierarchical executor data structure for
> partitioning info is likely to tie into this too, by removing some other
> vestigial uses of those lists.)

I agree it would be nice to be able to get rid of those lists.
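
To make that concrete, the executor-side open could then reduce to
something like the following sketch (assuming a suitable lock is
guaranteed to be held upstream; the function name is made up):

#include "postgres.h"
#include "access/heapam.h"
#include "nodes/execnodes.h"
#include "parser/parsetree.h"

/*
 * Sketch: open a scan relation with NoLock, relying on the parse/plan
 * pipeline (or AcquireExecutorLocks, for a cached plan) to have taken
 * the appropriate lock already.
 */
static Relation
exec_open_scan_relation_sketch(EState *estate, Index scanrelid)
{
    Oid         reloid = getrelid(scanrelid, estate->es_range_table);

    /* No lock taken here; one must already be held. */
    return heap_open(reloid, NoLock);
}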

> I think that this idea has been discussed in the past, and we felt at
> the time that having the executor take its own locks was a good safety
> measure, and a relatively cheap one since the lock manager is pretty
> good at short-circuiting duplicative lock requests.  But those are
> certainly not free.  Moreover, I'm not sure that this is really a
> safety measure at all: if the executor were really taking any lock
> not already held, it'd be masking a DDL hazard.
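
(FWIW, "not free" seems right: even when the lock is already held, the
fast path still pays for a backend-local hash lookup on every request.
Roughly, paraphrasing the relevant bit of LockAcquireExtended in
lock.c:)

    /* Look up the lock in the backend-local table first. */
    locallock = (LOCALLOCK *) hash_search(LockMethodLocalHash,
                                          (void *) &localtag,
                                          HASH_ENTER, &found);

    /* If we already hold it, re-grant locally, skipping shared memory. */
    if (locallock->nLocks > 0)
    {
        GrantLockLocal(locallock, owner);
        return LOCKACQUIRE_ALREADY_HELD;
    }
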
> 
> To investigate this further, I made the attached not-meant-for-commit
> hack to verify whether InitPlan and related executor startup functions
> were actually taking any not-previously-held locks.  I could only find
> one such case: the parser always opens tables selected FOR UPDATE with
> RowShareLock, but if we end up implementing the resulting row mark
> with ROW_MARK_COPY, the executor is acquiring just AccessShareLock
> (because ExecOpenScanRelation thinks it needs to acquire some lock).
> The patch as presented hacks ExecOpenScanRelation to avoid that, and
> it passes check-world.
> 
> What we'd be better off doing, if we go this route, is to install an
> assertion-build-only test that verifies during relation_open(NoLock)
> that some kind of lock is already held on the rel.  That would protect
> not only the executor, but a boatload of existing places that open
> rels with NoLock on the currently-unverified assumption that a lock is
> already held.

+1
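
Such a check could be pretty cheap in assert-enabled builds.  A sketch
of what the helper might look like (hypothetical; it loops over
LockHeldByMe to implement "this mode or any stronger one"):

#include "postgres.h"
#include "storage/lock.h"
#include "utils/rel.h"

/*
 * Sketch: does this backend already hold a lock of the given mode, or
 * any stronger one, on the relation?  Meant to back an assertion like
 *     Assert(lockmode != NoLock ||
 *            relation_locked_by_me_sketch(r, AccessShareLock));
 * in relation_open().
 */
static bool
relation_locked_by_me_sketch(Relation rel, LOCKMODE minmode)
{
    LOCKTAG     tag;
    LOCKMODE    mode;

    SET_LOCKTAG_RELATION(tag,
                         rel->rd_lockInfo.lockRelId.dbId,
                         rel->rd_lockInfo.lockRelId.relId);

    for (mode = minmode; mode <= AccessExclusiveLock; mode++)
    {
        if (LockHeldByMe(&tag, mode))
            return true;
    }
    return false;
}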

> I'm also rather strongly tempted to add a field to RangeTblEntry
> that records the appropriate lock strength to take, so that we don't
> have to rely on keeping AcquireExecutorLocks' logic to decide on the
> lock type in sync with whatever the parse/plan pipeline does.  (One
> could then imagine adding assertions in the executor that this field
> shows a lock strength of at least X, in place of actually opening
> the rel with X.)
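
Recording it in the RTE might look like this (hypothetical field name;
fragments, not a complete patch):

    /* In RangeTblEntry (parsenodes.h): remember what the parse/plan
     * pipeline locked, so downstream code need not re-derive it. */
    int         rellockmode;    /* lock mode that query requires on rel */

    /* AcquireExecutorLocks could then simply do: */
    if (rte->rtekind == RTE_RELATION)
        LockRelationOid(rte->relid, rte->rellockmode);

    /* ... and the executor could assert that rellockmode is at least
     * some expected strength, instead of taking a lock itself. */
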
> 
> BTW, there'd be a lot to be said for having InitPlan just open all
> the rels and build an array of Relation pointers that parallels the
> RTE list, rather than doing heap_opens in random places elsewhere.

+1 to this.  Actually, I had written such a patch in place of the one I
posted on this thread, but I wasn't confident that I had found all the
places in the executor that should use the Relation pointer from that
array instead of fetching it themselves.
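
The shape I had in mind was roughly the following (the es_relations
field name is made up):

#include "postgres.h"
#include "access/heapam.h"
#include "nodes/execnodes.h"
#include "nodes/parsenodes.h"

/*
 * Sketch: have InitPlan open every plain relation exactly once and
 * stash the Relation pointers in an array paralleling the range table
 * (indexed by RTE index - 1), so the rest of the executor looks them
 * up there instead of calling heap_open itself.
 */
static void
init_plan_open_relations_sketch(EState *estate)
{
    int         nrels = list_length(estate->es_range_table);
    Relation   *rels = (Relation *) palloc0(nrels * sizeof(Relation));
    int         i = 0;
    ListCell   *lc;

    foreach(lc, estate->es_range_table)
    {
        RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);

        if (rte->rtekind == RTE_RELATION)
            rels[i] = heap_open(rte->relid, NoLock);    /* lock held upstream */
        i++;
    }

    /* hypothetical EState field: estate->es_relations = rels; */
}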

Thanks,
Amit


