Re: generic plans and "initial" pruning - Mailing list pgsql-hackers

From Tom Lane
Subject Re: generic plans and "initial" pruning
Date
Msg-id 4191508.1674157166@sss.pgh.pa.us
Whole thread Raw
In response to Re: generic plans and "initial" pruning  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: generic plans and "initial" pruning
Re: generic plans and "initial" pruning
List pgsql-hackers
I spent some time re-reading this whole thread, and the more I read
the less happy I got.  We are adding a lot of complexity and introducing
coding hazards that will surely bite somebody someday.  And after awhile
I had what felt like an epiphany: the whole problem arises because the
system is wrongly factored.  We should get rid of AcquireExecutorLocks
altogether, allowing the plancache to hand back a generic plan that
it's not certain of the validity of, and instead integrate the
responsibility for acquiring locks into executor startup.  It'd have
to be optional there, since we don't need new locks in the case of
executing a just-planned plan; but we can easily add another eflags
bit (EXEC_FLAG_GET_LOCKS or so).  Then there has to be a convention
whereby the ExecInitNode traversal can return an indicator that
"we failed because the plan is stale, please make a new plan".

There are a couple reasons why this feels like a good idea:

* There's no need for worry about keeping the locking decisions in sync
with what executor startup does.

* We don't need to add the overhead proposed in the current patch to
pass forward data about what got locked/pruned.  While that overhead
is hopefully less expensive than the locks it saved acquiring, it's
still overhead (and in some cases the patch will fail to save acquiring
any locks, making it certainly a net negative).

* In a successfully built execution state tree, there will simply
not be any nodes corresponding to pruned-away, never-locked subplans.
As long as code like EXPLAIN follows the state tree and doesn't poke
into plan nodes that have no matching state, it's secure against the
sort of problems that Robert worried about upthread.

While I've not attempted to write any code for this, I can also
think of a few issues that'd have to be resolved:

* We'd be pushing the responsibility for looping back and re-planning
out to fairly high-level calling code.  There are only half a dozen
callers of GetCachedPlan, so there's not that many places to be
touched; but in some of those places the subsequent executor-start call
is not close by, so that the necessary refactoring might be pretty
painful.  I doubt there's anything insurmountable, but we'd definitely
be changing some fundamental APIs.

* In some cases (views, at least) we need to acquire lock on relations
that aren't directly reflected anywhere in the plan tree.  So there'd
have to be a separate mechanism for getting those locks and rechecking
validity afterward.  A list of relevant relation OIDs might be enough
for that.

* We currently do ExecCheckPermissions() before initializing the
plan state tree.  It won't do to check permissions on relations we
haven't yet locked, so that responsibility would have to be moved.
Maybe that could also be integrated into the initialization recursion?
Not sure.

* In the existing usage of AcquireExecutorLocks, if we do decide
that the plan is stale then we are able to release all the locks
we got before we go off and replan.  I'm not certain if that behavior
needs to be preserved, but if it does then that would require some
additional bookkeeping in the executor.

* This approach is optimizing on the assumption that we usually
won't need to replan, because if we do then we might waste a fair
amount of executor startup overhead before discovering we have
to throw all that state away.  I think that's clearly the right
way to bet, but perhaps somebody else has a different view.

Thoughts?

            regards, tom lane



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Extracting cross-version-upgrade knowledge from buildfarm client
Next
From: Nathan Bossart
Date:
Subject: Re: almost-super-user problems that we haven't fixed yet