Home > mailing lists

Re: plan shape work - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: plan shape work
Date	September 25 16:10:54
Msg-id	CA+TgmobF3V0h_7o-4_LrSFLMXaETS_cAq4PyW+0Fxc6EWOKgLA@mail.gmail.com Whole thread Raw
In response to	Re: plan shape work (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: plan shape work
List	pgsql-hackers

Tree view

On Wed, Sep 24, 2025 at 6:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't think so.  We do not have a nice story on marking Node fields
> const: it's very unclear for example what consequences that ought to
> have for copyObject().  Maybe somebody will tackle that issue someday,
> but it's not something to touch casually in a patch with other
> objectives.  So I don't think we can make the plan_name fields const.
> The best solution I think is to make choose_plan_name() take a const
> string and return a non-const one.  The attached v10-0001 is just like
> your v9-0001 except for doing the const stuff this way.  I chose to
> fix the impedance mismatch within choose_plan_name() by having it
> pstrdup when it wants to just return the "name" string, but you could
> make a case for holding your nose and just casting away const there.

Yeah, these are the kinds of trade-offs I never know how to make. I
have a little difficulty believing that it's worth a pstrdup() to have
the benefit of marking things const, but maybe it is, and the
difference is probably microscopic either way. It's not like we're
going to be faced with many planning problems involving gigantic
numbers of subqueries. If this were a hot code path we'd want to think
harder.

> > - You (Tom) also asked why not print InitPlan/SubPlan wherever we
> > refer to subplans, so this version restores that behavior.
>
> Thanks.  I'm good with the output now (modulo the bug described
> below).  Someone could potentially argue that this exposes more
> of the internals than we really ought to, such as the difference
> between expr and multiexpr SubLinks, but I'm okay with that.

I actually thought it looked kind of nice exposing some of that
detail, but it's certainly a judgement call.

> Aside from the const issue, something I don't really like at the
> coding level is the use of an "allroots" list.  One reason is that
> it's partially redundant with the adjacent "subroots" list, but
> a bigger one is that we have transient roots that shouldn't be
> in there.  An example here is pull_up_simple_subquery: it builds
> a clone of the query's PlannerInfo to help it use various
> infrastructure along the way to flattening the subquery, but
> that clone is not referenced anymore after the function exits.
> You were putting that into allroots, which seems to me to be
> a fundamental error, even more so because it went in with the
> same plan_name as the root it was cloned from.
>
> I think a better idea is to keep a list of just the subplan
> names that we've assigned so far.  That has a far clearer
> charter, plus it can be updated immediately by choose_plan_name()
> instead of relying on the caller to do the right thing later.
> I coded this up, and was rather surprised to find that it changed
> some regression outputs.  On investigation, that's because
> build_minmax_path() was actually doing the wrong thing later:
> it was putting the wrong root into allroots, so that "minmax_1"
> never became assigned and could be re-used later.

Ooph, that's embarrassing. I think the reason that I ended up making a
list of the roots themselves rather than the strings is that I was
thinking that everything in this data structure would need to be a
node, and I didn't want to cons up String nodes for every list
element. Then later I marked that structure member read_write_ignore
and never stopped to think that maybe then we didn't need nodes at
all. So, in short, I think this is a great solution and thanks a ton
for putting in the legwork to figure it out.

> I also observed that SS_process_ctes() was not using
> choose_plan_name() but simply assigning the user-written CTE
> name.  I believe it's possible to use the same CTE name in
> different parts of a query tree, so this fails to achieve
> the stated purpose of making the names unique.

Ouch, that's another good catch. Also, even if the CTEs had to be
unique among themselves, there could be a collision between a CTE name
and a generated name.

> I'm still a little bit uncomfortable about whether
> it's okay for pull_up_simple_subquery() to just do
>
> +       subroot->plan_name = root->plan_name;
>
> rather than giving some other name to the transient subroot.
> I think it's okay because we are not making any meaningful planning
> decisions during the life of the subroot, just seeing if we can
> transform the subquery into a form that allows it to be pulled up.
> But you might think differently.  Perhaps a potential compromise
> is to set the transient subroot's plan_name to NULL instead?

I don't like NULL. Imagine that some meaningful planning decision does
get made during the life of the subroot, and imagine further that some
hook is called by means of which some extension can influence that
decision. What plan_name do we want that extension to see? I think
there's some argument for letting it see the plan name of the parent
plan into which we're thinking about inlining the subquery, which I
believe is the effect of the current coding, and there's perhaps also
an argument for having a wholly new plan name there just to really
identify that particular decision clearly, but if we put NULL, that's
the name we use for the topmost query level. If we changed things so
that the top query level gets called "main" or some other constant
string, then it would make sense to use NULL as a sentinel value here
to mean "undefined," but I kind of think we probably don't want to go
there.

> Anyway, v10-0002 is a delta patch to use a list of subplan
> names instead of "allroots", and there are a couple of trivial
> cosmetic changes too.

OK, I'll try out these versions and let you know what I find out. Thanks much.

--
Robert Haas
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Bertrand Drouvot
Date: 25 September, 16:01:55
Subject: Re: Report bytes and transactions actually sent downtream

From: Daniil Davydov
Date: 25 September, 16:15:50
Subject: Re: Fix bug with accessing to temporary tables of other sessions

Re: plan shape work - Mailing list pgsql-hackers

Previous

Next