Home > mailing lists

Re: why partition pruning doesn't work? - Mailing list pgsql-hackers

From	Amit Langote
Subject	Re: why partition pruning doesn't work?
Date	June 15, 2018 15:30:25
Msg-id	2506aae4-d2b1-aefe-31e2-42ff1cd6c1e5@lab.ntt.co.jp Whole thread Raw
In response to	Re: why partition pruning doesn't work? (David Rowley <david.rowley@2ndquadrant.com>)
List	pgsql-hackers

Tree view

On 2018/06/14 20:23, David Rowley wrote:
> On 14 June 2018 at 19:17, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> I had sent a patch to try to get rid of the open(NoLock) there a couple of
>> months ago [1].  The idea was to both lock and open the relation in
>> ExecNonLeafAppendTables, which is the first time all partitioned tables in
>> a given Append node are locked for execution.  Also, the patch makes it a
>> responsibility of ExecEndAppend to release the relcache pins, so the
>> recently added ExecDestroyPartitionPruneState would not be needed.
> 
> Robert and I briefly discussed something more complete a while back.
> Not much detail was talked about, but the basic idea was to store the
> Relation somewhere in the planner an executor that we could lookup by
> rel index rather than having to relation_open all the time.
> 
> I just had a very quick look around and thought maybe RangeTblEntry
> might be a good spot to store the Relation and current lock level that
> we hold on that relation. This means that we can use
> PlannerInfo->simple_rte_array in the planner and
> EState->es_range_table in the executor. The latter of those would be
> much nicer if we built an array instead of keeping the list (can
> probably build that during InitPlan()).  We could then get hold of a
> Relation object when needed without having to do relation_open(...,
> NoLock).
> 
> Code which currently looks like:
> 
> reloid = getrelid(scanrelid, estate->es_range_table);
> rel = heap_open(reloid, lockmode);
> 
> In places like ExecOpenScanRelation, could be replaced with some
> wrapper function call like: rte_open_rel(estate, scanrelid, lockmode);
> 
> This could also be used to ERROR out if rte_open_rel() was done
> initially with NoLock. Right now there are so many places that we do
> relation_open(..., NoLock) and write a comment /* Lock already taken
> by ... */, which we hope is always true.
> 
> If the Relation is already populated in the RangeTblEntry then the
> function would just return that, otherwise, we'd just look inside the
> RangeTblEntry for the relation Oid and do a heap_open on it, and store
> the lockmode that's held. We could then also consider getting of a
> bunch of fields like boundinfo and nparts from RelOptInfo.
> 
> We could also perhaps do a WARNING about lock upgrade hazards in
> there, at least maybe in debug builds.
> 
> However, I only spent about 10 mins looking into this, there may be
> some giant holes in the idea.  It would need much more research.

Will also need to consider caching of plans, that is of PlannedStmts,
which in turn contain RangeTblEntry nodes.  The idea that we can open a
relation at the beginning or even before planning and keep using the
Relation pointer all the way to the end of execution of a query seems hard
to realize.  As others pointed out, we need to think of planning and
execution as separate phases and will have to open/close relations separately.

Thanks,
Amit

pgsql-hackers by date:

From: Masahiko Sawada
Date: 15 June 2018, 15:11:08
Subject: Re: Possible bug in logical replication.

From: Andrew Dunstan
Date: 15 June 2018, 17:06:50
Subject: Re: automating perl compile time checking

Re: why partition pruning doesn't work? - Mailing list pgsql-hackers

Previous

Next