Re: plan shape work - Mailing list pgsql-hackers
| From | Robert Haas |
|---|---|
| Subject | Re: plan shape work |
| Date | |
| Msg-id | CA+TgmoZ9yas=7tHLG86qJJnLc+5ORuOM02V+07knB4j5O1_usA@mail.gmail.com Whole thread Raw |
| In response to | Re: plan shape work (Tom Lane <tgl@sss.pgh.pa.us>) |
| List | pgsql-hackers |
On Wed, Sep 24, 2025 at 6:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Aside from the const issue, something I don't really like at the > coding level is the use of an "allroots" list. One reason is that > it's partially redundant with the adjacent "subroots" list, but > a bigger one is that we have transient roots that shouldn't be > in there. An example here is pull_up_simple_subquery: it builds > a clone of the query's PlannerInfo to help it use various > infrastructure along the way to flattening the subquery, but > that clone is not referenced anymore after the function exits. > You were putting that into allroots, which seems to me to be > a fundamental error, even more so because it went in with the > same plan_name as the root it was cloned from. > > I think a better idea is to keep a list of just the subplan > names that we've assigned so far. That has a far clearer > charter, plus it can be updated immediately by choose_plan_name() > instead of relying on the caller to do the right thing later. > I coded this up, and was rather surprised to find that it changed > some regression outputs. On investigation, that's because > build_minmax_path() was actually doing the wrong thing later: > it was putting the wrong root into allroots, so that "minmax_1" > never became assigned and could be re-used later. Today, I discovered a disadvantage of the change from allroots to subplanNames. The concern that you raise about transient roots still seems entirely valid to me. However, the allroots list - if correctly constructed to exclude such transient roots - seems to me to have potential utility for extensions that subplanNames doesn't. What I wanted to do was use planner_shutdown_hook to copy some information from each PlannerInfo to the extension_state field of the PlannedStmt and, without allroots, there's no easy way to find all of them. I do think there are other ways to solve that problem. For instance, we could add a hook that's called when we invoke subquery_planner, similar to the way that planner_hook can be used to intercept calls to planner(). The new hook could then do whatever it likes at the end of each call to subquery_planner(). That has the disadvantage of making each call to subquery_planner() a little more expensive, but it might turn out that such a hook has other utility. For what I was trying to do today, I can probably even solve it using set_join_pathlist_hook. I wanted to capture the relid sets of all baserels and joinrels where rel->unique_rel is non-NULL, and I think that I could do that by having set_join_pathlist_hook add notice when save_jointype is JOIN_UNIQUE_INNER or JOIN_UNIQUE_OUTER, and then add innerrel or outerrel as appropriate to a list hanging off an object attached using Get/SetPlannerGlobalExtensionState, making sure not to add the same one more than once. But that only works in this specific case, and it's pretty indirect even for that. So I'm wondering if we ought to step back and rethink a bit. subplanNames ensures that we assign a different name to every PlannerInfo, but it doesn't give you any help finding the PlannerInfo given the name, or enumerating all of the PlannerInfo objects that exist. If we went back to allroots, and just fixed the problem of temporary entries creeping into the list, the core code wouldn't really notice the difference, but I think extensions would have an easier time. Thoughts? -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: