Re: generic plans and "initial" pruning - Mailing list pgsql-hackers

From Amit Langote
Subject Re: generic plans and "initial" pruning
Date
Msg-id CA+HiwqFhQ8tLMLQAWYRWiBQUrWSLM8qhVzJ4B5jWr=orUXfSyA@mail.gmail.com
Whole thread Raw
In response to Re: generic plans and "initial" pruning  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: generic plans and "initial" pruning
List pgsql-hackers
On Fri, Jan 20, 2023 at 4:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I spent some time re-reading this whole thread, and the more I read
> the less happy I got.

Thanks a lot for your time on this.

>  We are adding a lot of complexity and introducing
> coding hazards that will surely bite somebody someday.  And after awhile
> I had what felt like an epiphany: the whole problem arises because the
> system is wrongly factored.  We should get rid of AcquireExecutorLocks
> altogether, allowing the plancache to hand back a generic plan that
> it's not certain of the validity of, and instead integrate the
> responsibility for acquiring locks into executor startup.  It'd have
> to be optional there, since we don't need new locks in the case of
> executing a just-planned plan; but we can easily add another eflags
> bit (EXEC_FLAG_GET_LOCKS or so).  Then there has to be a convention
> whereby the ExecInitNode traversal can return an indicator that
> "we failed because the plan is stale, please make a new plan".

Interesting.  The current implementation relies on
PlanCacheRelCallback() marking a generic CachedPlan as invalid, so
perhaps there will have to be some sharing of state between the
plancache and the executor for this to work?

> There are a couple reasons why this feels like a good idea:
>
> * There's no need for worry about keeping the locking decisions in sync
> with what executor startup does.
>
> * We don't need to add the overhead proposed in the current patch to
> pass forward data about what got locked/pruned.  While that overhead
> is hopefully less expensive than the locks it saved acquiring, it's
> still overhead (and in some cases the patch will fail to save acquiring
> any locks, making it certainly a net negative).
>
> * In a successfully built execution state tree, there will simply
> not be any nodes corresponding to pruned-away, never-locked subplans.
> As long as code like EXPLAIN follows the state tree and doesn't poke
> into plan nodes that have no matching state, it's secure against the
> sort of problems that Robert worried about upthread.

I think this is true with the patch as proposed too, but I was still a
bit worried about what an ExecutorStart_hook may be doing with an
uninitialized plan tree.  Maybe we're mandating that the hook must
call standard_ExecutorStart() and only work with the finished
PlanState tree?

> While I've not attempted to write any code for this, I can also
> think of a few issues that'd have to be resolved:
>
> * We'd be pushing the responsibility for looping back and re-planning
> out to fairly high-level calling code.  There are only half a dozen
> callers of GetCachedPlan, so there's not that many places to be
> touched; but in some of those places the subsequent executor-start call
> is not close by, so that the necessary refactoring might be pretty
> painful.  I doubt there's anything insurmountable, but we'd definitely
> be changing some fundamental APIs.

Yeah.  I suppose mostly the same place that the current patch is
touching to pass around the PartitionPruneResult nodes.

> * In some cases (views, at least) we need to acquire lock on relations
> that aren't directly reflected anywhere in the plan tree.  So there'd
> have to be a separate mechanism for getting those locks and rechecking
> validity afterward.  A list of relevant relation OIDs might be enough
> for that.

Hmm, a list of only the OIDs wouldn't preserve the lock mode, so maybe
a list or bitmapset of the RTIs, something along the lines of
PlannedStmt.minLockRelids in the patch?

It perhaps even makes sense to make a special list in PlannedStmt for
only the views?

> * We currently do ExecCheckPermissions() before initializing the
> plan state tree.  It won't do to check permissions on relations we
> haven't yet locked, so that responsibility would have to be moved.
> Maybe that could also be integrated into the initialization recursion?
> Not sure.

Ah, I remember mentioning moving that into ExecGetRangeTableRelation()
[1], but I guess that misses relations that are not referenced in the
plan tree, such as views.  Though maybe that's not a problem if we
track views separately as mentioned above.

> * In the existing usage of AcquireExecutorLocks, if we do decide
> that the plan is stale then we are able to release all the locks
> we got before we go off and replan.  I'm not certain if that behavior
> needs to be preserved, but if it does then that would require some
> additional bookkeeping in the executor.

I think maybe we'll want to continue to release the existing locks,
because if we don't, it's possible we may keep some locks uselessly if
replanning might lock a different set of relations.

> * This approach is optimizing on the assumption that we usually
> won't need to replan, because if we do then we might waste a fair
> amount of executor startup overhead before discovering we have
> to throw all that state away.  I think that's clearly the right
> way to bet, but perhaps somebody else has a different view.

Not sure if you'd like, because it would still keep the
PartitionPruneResult business, but this will be less of a problem if
we do the initial pruning at the beginning of InitPlan(), followed by
locking, before doing anything else.  We would have initialized the
QueryDesc and the EState, but only minimally.  That also keeps the
PartitionPruneResult business local to the executor.

Would you like me to hack up a PoC or are you already on that?

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CA%2BHiwqG7ZruBmmih3wPsBZ4s0H2EhywrnXEduckY5Hr3fWzPWA%40mail.gmail.com



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Replace PROC_QUEUE / SHM_QUEUE with ilist.h
Next
From: songjinzhou
Date:
Subject: Re: Re: Support plpgsql multi-range in conditional control