Re: Reduce "Var IS [NOT] NULL" quals during constant folding - Mailing list pgsql-hackers
From | Richard Guo |
---|---|
Subject | Re: Reduce "Var IS [NOT] NULL" quals during constant folding |
Date | |
Msg-id | CAMbWs4_soFRubFd64SX-GC+MA6RV=pZrwfn3oq89LNKoDuLjAg@mail.gmail.com Whole thread Raw |
In response to | Re: Reduce "Var IS [NOT] NULL" quals during constant folding (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Fri, Apr 11, 2025 at 4:45 AM Robert Haas <robertmhaas@gmail.com> wrote: > OK. Maybe I shouldn't be worrying about the table_open() / > table_close() here, because I see that you are right that > has_subclass() is nearby, which admittedly does not involve opening > the relation, but it does involve fetching from the syscache a tuple > that we wouldn't need to fetch if we had a Relation available at that > point. And I see now that expand_virtual_generated_columns() is also > in that area and works similar to your proposed function in that it > just opens and closes all the relations. Perhaps that's just the way > we do things at this very early stage of the planner? But I feel like > it might make sense to do some reorganization of this code, though, so > that it more resembles the phase 1 and phase 2 as you describe them. > Both expand_virtual_generated_columns() and collect_relation_attrs() > care about exactly those relations with rte->rtekind == RTE_RELATION, > and even if we have to open and close all of those relations once to > do this processing, perhaps we can avoid doing it twice, and maybe > avoid the has_subclass() call along the way? Yeah, I agree on this. This is the optimization I mentioned upthread in the last paragraph of [1] - retrieving the small portion of catalog information needed at an early stage in one place instead of three. Hopefully, this way we only need one table_open/table_close at the early stage of the planner. > Maybe we can hoist a loop > over parse->rtable up into subquery_planner and then have it call a > virtual-column expansion function and a non-null collection function > once per RTE_RELATION entry? Hmm, I'm afraid there might be some difficulty with this approach. The virtual-column expansion needs to be done after sublink pull-up to ensure that the virtual-column references within the SubLinks that should be transformed into joins can get expanded, while non-null collection needs to be done before sublink pull-up since we might want to leverage the non-null information to convert NOT IN sublinks to anti joins. What I had in mind is that we hoist a loop over parse->rtable before pull_up_sublinks to gather information about which columns of each relation are defined as NOT NULL and which are virtually generated. These information will be used in sublink pull-up and virtual-column expansion. We also call has_subclass for each relation that is maked 'inh' within that loop and clear the inh flag if needed. This seems to require a fair amount of code changes, so I'd like to get some feedback on this approach before proceeding with the implementation. > A related point that I'm noticing is that you record the not-NULL > information in the RangeTblEntry. I wonder whether that's going to be > a problem, because I think of the RangeTblEntry as a parse-time > structure and the RelOptInfo as a plan-time structure, meaning that we > shouldn't scribble on the former and that we should record any > plan-time details we need in the latter. I understand that the problem > is precisely that the RelOptInfo isn't yet available, but I'm not sure > that makes it OK to use the RangeTblEntry instead. Fair point. We should do our best to refrain from scribbling on the parsetree in the planner. I initially went with the hashtable approach as Tom suggested, but later found it quite handy to store the not-null information in the RangeTblEntry, especially since we do something similar with rte->inh. However, I've come to realize that inh may not be a good example to follow after all, so I'll go back to the hashtable method. Thank you for pointing that out before I went too far down the wrong path. [1] https://postgr.es/m/CAMbWs4-DryEm_U-juPn3HwUiwZRXW3jhfX18b_AFgrgihgq4kg@mail.gmail.com Thanks Richard
pgsql-hackers by date: