Re: generic plans and "initial" pruning - Mailing list pgsql-hackers

From Robert Haas
Subject Re: generic plans and "initial" pruning
Date
Msg-id CA+Tgmoakey5yO+TTp0Gntz5MdnMLhd+Ei-+HWuCszHJeVU_YVA@mail.gmail.com
Whole thread Raw
In response to Re: generic plans and "initial" pruning  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: generic plans and "initial" pruning  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
On Mon, Mar 14, 2022 at 3:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ... like EXPLAIN, for example?

Exactly! I think that's the foremost example, but extension modules
like auto_explain or even third-party extensions are also a risk. I
think there was some discussion of this previously.

> If "pruning" means physical removal from the plan tree, then it's
> probably all right.  However, it looks to me like that doesn't
> actually happen, or at least doesn't happen till much later, so
> there's room for worry about a disconnect between what plancache.c
> has verified and what executor startup will try to touch.  As you
> say, in the absence of any bugs, that's not a problem ... but if
> there are such bugs, tracking them down would be really hard.

Surgery on the plan would violate the general principle that plans are
read only once constructed. I think the idea ought to be to pass a
secondary data structure around with the plan that defines which parts
you must ignore. Any code that fails to use that other data structure
in the appropriate manner gets defined to be buggy and has to be fixed
by making it follow the new rules.

> What I am skeptical about is that this work actually accomplishes
> anything under real-world conditions.  That's because if pruning would
> save enough to make skipping the lock-acquisition phase worth the
> trouble, the plan cache is almost certainly going to decide it should
> be using a custom plan not a generic plan.  Now if we had a better
> cost model (or, indeed, any model at all) for run-time pruning effects
> then maybe that situation could be improved.  I think we'd be better
> served to worry about that end of it before we spend more time making
> the executor even less predictable.

I don't agree with that analysis, because setting plan_cache_mode is
not uncommon. Even if that GUC didn't exist, I'm pretty sure there are
cases where the planner naturally falls into a generic plan anyway,
even though pruning is happening. But as it is, the GUC does exist,
and people use it. Consequently, while I'd love to see something done
about the costing side of things, I do not accept that all other
improvements should wait for that to happen.

> Also, while I've not spent much time at all reading this patch,
> it seems rather desperately undercommented, and a lot of the
> new names are unintelligible.  In particular, I suspect that the
> patch is significantly redesigning when/where run-time pruning
> happens (unless it's just letting that be run twice); but I don't
> see any documentation or name changes suggesting where that
> responsibility is now.

I am sympathetic to that concern. I spent a while staring at a
baffling comment in 0001 only to discover it had just been moved from
elsewhere. I really don't feel that things in this are as clear as
they could be -- although I hasten to add that I respect the people
who have done work in this area previously and am grateful for what
they did. It's been a huge benefit to the project in spite of the
bumps in the road. Moreover, this isn't the only code in PostgreSQL
that needs improvement, or the worst. That said, I do think there are
problems. I don't yet have a position on whether this patch is making
that better or worse.

That said, I believe that the core idea of the patch is to optionally
perform pruning before we acquire locks or spin up the main executor
and then remember the decisions we made. If once the main executor is
spun up we already made those decisions, then we must stick with what
we decided. If not, we make those pruning decisions at the same point
we do currently - more or less on demand, at the point when we'd need
to know whether to descend that branch of the plan tree or not. I
think this scheme comes about because there are a couple of different
interfaces to the parameterized query stuff, and in some code paths we
have the values early enough to use them for pre-pruning, and in
others we don't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: generic plans and "initial" pruning
Next
From: Robert Haas
Date:
Subject: Re: Query about time zone patterns in to_char