Re: Reduce "Var IS [NOT] NULL" quals during constant folding - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Reduce "Var IS [NOT] NULL" quals during constant folding |
Date | |
Msg-id | CA+TgmoZxYG+caOw2iA2NPVM65mb3KUT6v9+wEDKSJzqyHbUG2w@mail.gmail.com Whole thread Raw |
In response to | Re: Reduce "Var IS [NOT] NULL" quals during constant folding (Richard Guo <guofenglinux@gmail.com>) |
Responses |
Re: Reduce "Var IS [NOT] NULL" quals during constant folding
Re: Reduce "Var IS [NOT] NULL" quals during constant folding |
List | pgsql-hackers |
On Sun, Apr 6, 2025 at 10:59 PM Richard Guo <guofenglinux@gmail.com> wrote: > That's correct. As I mentioned earlier, I believe attnotnull isn't > the only piece of information we need to gather early on. My general > idea is to separate the collection of catalog information into two > phases: > > * Phase 1 occurs at an early stage and collects the small portion of > catalog information that is needed for constant folding, setting the > inh flag for a relation, or expanding virtual generated columns. All > these happen very early in the planner, before partition pruning. > > * Phase 2 collects the majority of the catalog information and occurs > when building the RelOptInfos, like what get_relation_info does. OK. Maybe I shouldn't be worrying about the table_open() / table_close() here, because I see that you are right that has_subclass() is nearby, which admittedly does not involve opening the relation, but it does involve fetching from the syscache a tuple that we wouldn't need to fetch if we had a Relation available at that point. And I see now that expand_virtual_generated_columns() is also in that area and works similar to your proposed function in that it just opens and closes all the relations. Perhaps that's just the way we do things at this very early stage of the planner? But I feel like it might make sense to do some reorganization of this code, though, so that it more resembles the phase 1 and phase 2 as you describe them. Both expand_virtual_generated_columns() and collect_relation_attrs() care about exactly those relations with rte->rtekind == RTE_RELATION, and even if we have to open and close all of those relations once to do this processing, perhaps we can avoid doing it twice, and maybe avoid the has_subclass() call along the way? Maybe we can hoist a loop over parse->rtable up into subquery_planner and then have it call a virtual-column expansion function and a non-null collection function once per RTE_RELATION entry? A related point that I'm noticing is that you record the not-NULL information in the RangeTblEntry. I wonder whether that's going to be a problem, because I think of the RangeTblEntry as a parse-time structure and the RelOptInfo as a plan-time structure, meaning that we shouldn't scribble on the former and that we should record any plan-time details we need in the latter. I understand that the problem is precisely that the RelOptInfo isn't yet available, but I'm not sure that makes it OK to use the RangeTblEntry instead. > I'm confused here. AFAICS, we don't open the relation before fetching > relhassubclass, according to the code that sets the inh flag in > subquery_planner. Additionally, I do not see we pass down the > Relation object to get_relation_info. In get_relation_info, we call > table_open to obtain the Relation object, use it to retrieve the > catalog information, and then call table_close to close the Relation. You're right. I don't know what I was thinking. > IIUC, you're concerned about calling table_open/table_close to > retrieve catalog information. Do you know of a better way to retrieve > catalog information without calling table_open/table_close? I find > the table_open/table_close pattern is quite common in the current > code. In addition to get_relation_info(), I've also seen it in > get_relation_constraints(), get_relation_data_width(), and others. You're also right about this. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: