Thread: Reduce "Var IS [NOT] NULL" quals during constant folding
Currently, we have an optimization that reduces an IS [NOT] NULL qual on a NOT NULL column to constant true or constant false, provided we can prove that the input expression of the NullTest is not nullable by any outer joins. This deduction happens pretty late in planner, during the distribution of quals to relations in query_planner(). As mentioned in [1], doing it at such a late stage has some drawbacks. Ideally, this deduction should happen during constant folding. However, we don't have the per-relation information about which columns are defined as NOT NULL available at that point. That information is collected from catalogs when building RelOptInfos for base or other relations. I'm wondering whether we can collect that information while building the RangeTblEntry for a base or other relation, so that it's available before constant folding. This could also enable other optimizations, such as checking if a NOT IN subquery's output columns and its left-hand expressions are all certainly not NULL, in which case we can convert it to an anti-join. Attached is a draft patch to reduce NullTest on a NOT NULL column in eval_const_expressions. Initially, I planned to get rid of restriction_is_always_true and restriction_is_always_false altogether, since we now perform the reduction of "Var IS [NOT] NULL" quals in eval_const_expressions. However, removing them would prevent us from reducing some IS [NOT] NULL quals that we were previously able to reduce, because (a) the self-join elimination may introduce new IS NOT NULL quals after the constant folding, and (b) if some outer joins are converted to inner joins, previously irreducible NullTest quals may become reducible. So I think maybe we'd better keep restriction_is_always_true and restriction_is_always_false as-is. Any thoughts? [1] https://postgr.es/m/2323997.1740623184@sss.pgh.pa.us Thanks Richard
Attachment
On Fri, Mar 21, 2025 at 6:14 PM Richard Guo <guofenglinux@gmail.com> wrote: > I'm wondering whether we can collect that information while building > the RangeTblEntry for a base or other relation, so that it's available > before constant folding. This could also enable other optimizations, > such as checking if a NOT IN subquery's output columns and its > left-hand expressions are all certainly not NULL, in which case we can > convert it to an anti-join. > > Attached is a draft patch to reduce NullTest on a NOT NULL column in > eval_const_expressions. FWIW, reducing "Var IS [NOT] NULL" quals during constant folding can somewhat influence the decision on join ordering later. For instance, create table t (a int not null, b int); select * from t t1 left join (t t2 left join t t3 on t2.a is not null) on t1.b = t2.b; For this query, "t2.a is not null" is reduced to true during constant folding and then ignored, which leads to us being unable to commute t1/t2 join with t2/t3 join. OTOH, constant-folding NullTest for Vars may enable join orders that were previously impossible. For instance, select * from t t1 left join (t t2 left join t t3 on t2.a is null or t2.b = t3.b) on t1.b = t2.b; Previously the t2/t3 join's clause is not strict for t2 due to the IS NULL qual, which prevents t2/t3 join from commuting with t1/t2 join. Now, the IS NULL qual is removed during constant folding, allowing us to generate a plan with the join order (t1/t2)/t3. Not quite sure if this is something we need to worry about. Thanks Richard
On Fri, Mar 21, 2025 at 10:21 AM Richard Guo <guofenglinux@gmail.com> wrote: > Not quite sure if this is something we need to worry about. I haven't really dug into this but I bet it's not that serious, in the sense that we could probably work around it with more logic if we really need to. However, I'm a bit concerned about the overall premise of the patch set. It feels like it is moving something that really ought to happen at optimization time back to parse time. I have a feeling that's going to break something, although I am not sure right now exactly what. Wouldn't it be better to have this still happen in the planner, but sooner than it does now? -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > However, I'm a bit concerned about the overall premise of the patch > set. It feels like it is moving something that really ought to happen > at optimization time back to parse time. I have a feeling that's going > to break something, although I am not sure right now exactly what. Ugh, no, that is *completely* unworkable. Suppose that the user does CREATE VIEW, and the parse tree recorded for that claims that column X is not-nullable. Then the user drops the not-null constraint, and then asks to execute the view. We'll optimize on the basis of stale information. The way to make this work is what I said before: move the planner's collection of relation information to somewhere a bit earlier in the planner. But not to outside the planner. regards, tom lane
On Fri, Mar 21, 2025 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> However, I'm a bit concerned about the overall premise of the patch
> set. It feels like it is moving something that really ought to happen
> at optimization time back to parse time. I have a feeling that's going
> to break something, although I am not sure right now exactly what.
Ugh, no, that is *completely* unworkable. Suppose that the user
does CREATE VIEW, and the parse tree recorded for that claims that
column X is not-nullable. Then the user drops the not-null
constraint, and then asks to execute the view. We'll optimize on
the basis of stale information.
The way to make this work is what I said before: move the planner's
collection of relation information to somewhere a bit earlier in
the planner. But not to outside the planner.
Reading this reminded me of the existing issue in [1] where we've broken session isolation of temporary relation data. There it feels like we are making decisions in the parser that really belong in the planner since catalog data is needed to determine relpersistence in many cases. If we are looking for a spot "earlier in the planner" to attach relation information, figuring out how to use that to improve matters related to relpersistence seems warranted.
David J.
On Sat, Mar 22, 2025 at 1:12 AM Robert Haas <robertmhaas@gmail.com> wrote: > However, I'm a bit concerned about the overall premise of the patch > set. It feels like it is moving something that really ought to happen > at optimization time back to parse time. I have a feeling that's going > to break something, although I am not sure right now exactly what. > Wouldn't it be better to have this still happen in the planner, but > sooner than it does now? You're right. It's just flat wrong to collect catalog information in the parser and use it in the planner. As Tom pointed out, the catalog information could change in between, which would cause us to use stale data. Yeah, this should still happen in the planner, perhaps before pull_up_sublinks, if we plan to leverage that info to convert NOT IN to anti-join. Thanks Richard
On Sat, Mar 22, 2025 at 2:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Ugh, no, that is *completely* unworkable. Suppose that the user > does CREATE VIEW, and the parse tree recorded for that claims that > column X is not-nullable. Then the user drops the not-null > constraint, and then asks to execute the view. We'll optimize on > the basis of stale information. Thanks for pointing this out. > The way to make this work is what I said before: move the planner's > collection of relation information to somewhere a bit earlier in > the planner. But not to outside the planner. I'm considering moving the collection of attnotnull information before pull_up_sublinks, in hopes of leveraging this info to pull up NOT IN in the future, something like attached. Maybe we could also collect the attgenerated information in the same routine, making life easier for expand_virtual_generated_columns. Another issue I found is that in convert_EXISTS_to_ANY, we pass the parent's root to eval_const_expressions, which can cause problems when reducing "Var IS [NOT] NULL" quals. To fix, v2 patch constructs up a dummy PlannerInfo with "glob" link set to the parent's and "parser" link set to the subquery. I believe these are the only fields used. Thanks Richard
Attachment
On Sun, Mar 23, 2025 at 6:25 PM Richard Guo <guofenglinux@gmail.com> wrote: > On Sat, Mar 22, 2025 at 2:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > The way to make this work is what I said before: move the planner's > > collection of relation information to somewhere a bit earlier in > > the planner. But not to outside the planner. > I'm considering moving the collection of attnotnull information before > pull_up_sublinks, in hopes of leveraging this info to pull up NOT IN > in the future, something like attached. Here is an updated version of the patch with some cosmetic changes and a more readable commit message. I'm wondering if it's good enough to be pushed. Any comments? Thanks Richard
Attachment
Richard Guo <guofenglinux@gmail.com> 于2025年3月26日周三 10:16写道:
On Sun, Mar 23, 2025 at 6:25 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Sat, Mar 22, 2025 at 2:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > The way to make this work is what I said before: move the planner's
> > collection of relation information to somewhere a bit earlier in
> > the planner. But not to outside the planner.
> I'm considering moving the collection of attnotnull information before
> pull_up_sublinks, in hopes of leveraging this info to pull up NOT IN
> in the future, something like attached.
Here is an updated version of the patch with some cosmetic changes and
a more readable commit message. I'm wondering if it's good enough to
be pushed. Any comments?
The comment about notnullattnums in struct RangeTblEntry says that:
* notnullattnums is zero-based set containing attnums of NOT NULL
* columns.
* columns.
But in get_relation_notnullatts():
rte->notnullattnums = bms_add_member(rte->notnullattnums,
i + 1);
i + 1);
The notnullattnums seem to be 1-based.
Thanks,
Tender Wang
On Wed, Mar 26, 2025 at 3:06 PM Tender Wang <tndrwang@gmail.com> wrote: > The comment about notnullattnums in struct RangeTblEntry says that: > * notnullattnums is zero-based set containing attnums of NOT NULL > * columns. > > But in get_relation_notnullatts(): > rte->notnullattnums = bms_add_member(rte->notnullattnums, > i + 1); > > The notnullattnums seem to be 1-based. This corresponds to the attribute numbers in Var nodes; you can consider zero as representing a whole-row Var. Thanks Richard
On Wed, 26 Mar 2025 at 19:31, Richard Guo <guofenglinux@gmail.com> wrote: > > On Wed, Mar 26, 2025 at 3:06 PM Tender Wang <tndrwang@gmail.com> wrote: > > The comment about notnullattnums in struct RangeTblEntry says that: > > * notnullattnums is zero-based set containing attnums of NOT NULL > > * columns. > > > > But in get_relation_notnullatts(): > > rte->notnullattnums = bms_add_member(rte->notnullattnums, > > i + 1); > > > > The notnullattnums seem to be 1-based. > > This corresponds to the attribute numbers in Var nodes; you can > consider zero as representing a whole-row Var. Yeah, and a negative number is a system attribute, which the Bitmapset can't represent... The zero-based comment is meant to inform the reader that they don't need to offset by FirstLowInvalidHeapAttributeNumber when indexing the Bitmapset. If there's some confusion about that then maybe the wording could be improved. I used "zero-based" because I wanted to state what it was and that was the most brief terminology that I could think of to do that. The only other way I thought about was to say that "it's not offset by FirstLowInvalidHeapAttributeNumber", but I thought it was better to say what it is rather than what it isn't. I'm open to suggestions if people are confused about this. David
On Wed, Mar 26, 2025 at 6:45 PM David Rowley <dgrowleyml@gmail.com> wrote: > On Wed, 26 Mar 2025 at 19:31, Richard Guo <guofenglinux@gmail.com> wrote: > > On Wed, Mar 26, 2025 at 3:06 PM Tender Wang <tndrwang@gmail.com> wrote: > > > The comment about notnullattnums in struct RangeTblEntry says that: > > > * notnullattnums is zero-based set containing attnums of NOT NULL > > > * columns. > > > > > > But in get_relation_notnullatts(): > > > rte->notnullattnums = bms_add_member(rte->notnullattnums, > > > i + 1); > > > > > > The notnullattnums seem to be 1-based. > > This corresponds to the attribute numbers in Var nodes; you can > > consider zero as representing a whole-row Var. > Yeah, and a negative number is a system attribute, which the Bitmapset > can't represent... The zero-based comment is meant to inform the > reader that they don't need to offset by > FirstLowInvalidHeapAttributeNumber when indexing the Bitmapset. If > there's some confusion about that then maybe the wording could be > improved. I used "zero-based" because I wanted to state what it was > and that was the most brief terminology that I could think of to do > that. The only other way I thought about was to say that "it's not > offset by FirstLowInvalidHeapAttributeNumber", but I thought it was > better to say what it is rather than what it isn't. > > I'm open to suggestions if people are confused about this. I searched the current terminology used in code and can find "offset by FirstLowInvalidHeapAttributeNumber", but not "not offset by FirstLowInvalidHeapAttributeNumber". I think "zero-based" should be sufficient to indicate that this bitmapset is offset by zero, not by FirstLowInvalidHeapAttributeNumber. So I'm fine to go with "zero-based". I'm planning to push this patch soon, barring any objections. Thanks Richard
Richard Guo <guofenglinux@gmail.com> writes: > I'm planning to push this patch soon, barring any objections. FWIW, I have not reviewed it at all. regards, tom lane
On Thu, Mar 27, 2025 at 10:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Richard Guo <guofenglinux@gmail.com> writes: > > I'm planning to push this patch soon, barring any objections. > FWIW, I have not reviewed it at all. Oh, sorry. I'll hold off on pushing it. Thanks Richard
On Thu, Mar 27, 2025 at 10:08 AM Richard Guo <guofenglinux@gmail.com> wrote: > On Thu, Mar 27, 2025 at 10:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Richard Guo <guofenglinux@gmail.com> writes: > > > I'm planning to push this patch soon, barring any objections. > > > FWIW, I have not reviewed it at all. > > Oh, sorry. I'll hold off on pushing it. As a general point, non-trivial patches should really get some substantive review before they are pushed. Please don't be in a rush to commit. It is very common for the time from when a patch is first posted to commit to be 3-6 months even for committers. Posting a brand-new feature patch on March 21st and then pressing to commit on March 27th is really not something you should be doing. I think it's particularly inappropriate here where you actually got a review that pointed out a serious design problem and then had to change the design. If you didn't get it right on the first try, you shouldn't be too confident that you did it perfectly the second time, either. I took a look at this today and I'm not entirely comfortable with this: + rel = table_open(rte->relid, NoLock); + + /* Record NOT NULL columns for this relation. */ + get_relation_notnullatts(rel, rte); + + table_close(rel, NoLock); As a general principle, I have found that it's usually a sign that something has been designed poorly when you find yourself wanting to open a relation, get exactly one piece of information, and close the relation again. That is why, today, all the information that the planner needs about a particular relation is retrieved by get_relation_info(). We do not just wander around doing random catalog lookups wherever we need some critical detail. This patch increases the number of places where we fetch relation data from 1 to 2, but it's still the case that almost everything happens in get_relation_info(), and there's now just exactly this 1 thing that is done in a different place. That doesn't seem especially nice. I thought the idea was going to be to move get_relation_info() to an earlier stage, not split one thing out of it while leaving everything else the same. -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, Apr 1, 2025 at 1:55 AM Robert Haas <robertmhaas@gmail.com> wrote: > As a general principle, I have found that it's usually a sign that > something has been designed poorly when you find yourself wanting to > open a relation, get exactly one piece of information, and close the > relation again. That is why, today, all the information that the > planner needs about a particular relation is retrieved by > get_relation_info(). We do not just wander around doing random catalog > lookups wherever we need some critical detail. This patch increases > the number of places where we fetch relation data from 1 to 2, but > it's still the case that almost everything happens in > get_relation_info(), and there's now just exactly this 1 thing that is > done in a different place. That doesn't seem especially nice. I > thought the idea was going to be to move get_relation_info() to an > earlier stage, not split one thing out of it while leaving everything > else the same. I initially considered moving get_relation_info() to an earlier stage, where we would collect all the per-relation data by relation OID. Later, when building the RelOptInfos, we could link them to the per-relation-OID data. However, I gave up this idea because I realized it would require retrieving a whole bundle of catalog information that isn't needed until after the RelOptInfos are built, such as max_attr, pages, tuples, reltablespace, parallel_workers, extended statistics, etc. And we may also need to create the IndexOptInfos for the relation's indexes. It seems to me that it's not a trivial task to move get_relation_info() before building the RelOptInfos, and more importantly, it's unnecessary most of the time. In other words, of the many pieces of catalog information we need to retrieve for a relation, only a small portion is needed at an early stage. As far as I can see, this small portion includes: * relhassubclass, which we retrieve immediately after we have finished adding rangetable entries. * attgenerated, which we retrieve in expand_virtual_generated_columns. * attnotnull, as discussed here, which should ideally be retrieved before pull_up_sublinks. My idea is to retrieve only this small portion at an early stage, and still defer collecting the majority of the catalog information until building the RelOptInfos. It might be possible to optimize by retrieving this small portion in one place instead of three, but moving the entire get_relation_info() to an earlier stage doesn't seem like a good idea to me. It could be argued that the separation of catalog information collection isn't very great, but it seems this isn't something new in this patch. So I respectfully disagree with your statement that "all the information that the planner needs about a particular relation is retrieved by get_relation_info()", and that "there's now just exactly this 1 thing that is done in a different place". For instance, relhassubclass and attgenerated are retrieved before expression preprocessing, a relation's constraint expressions are retrieved when setting the relation's size estimates, and more. Thanks Richard
On Tue, Apr 1, 2025 at 2:34 AM Richard Guo <guofenglinux@gmail.com> wrote: > However, I gave up this idea because I realized it would require > retrieving a whole bundle of catalog information that isn't needed > until after the RelOptInfos are built, such as max_attr, pages, > tuples, reltablespace, parallel_workers, extended statistics, etc. Why is that bad? I mean, if we're going to need that information anyway, then gathering it at earlier stage doesn't hurt. Of course, if we move it too early, say before partition pruning, then we might gather information we don't really need and hurt performance. But otherwise it doesn't seem to hurt anything. > And we may also need to create the IndexOptInfos for the relation's > indexes. It seems to me that it's not a trivial task to move > get_relation_info() before building the RelOptInfos, and more > importantly, it's unnecessary most of the time. But again, if the work is going to have to be done anyway, who cares? > It could be argued that the separation of catalog information > collection isn't very great, but it seems this isn't something new in > this patch. So I respectfully disagree with your statement that "all > the information that the planner needs about a particular relation is > retrieved by get_relation_info()", and that "there's now just exactly > this 1 thing that is done in a different place". For instance, > relhassubclass and attgenerated are retrieved before expression > preprocessing, a relation's constraint expressions are retrieved when > setting the relation's size estimates, and more. Nonetheless I think we ought to be trying to consolidate more things into get_relation_info(), not disperse some of the things that are there to other places. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Apr 2, 2025 at 4:34 AM Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Apr 1, 2025 at 2:34 AM Richard Guo <guofenglinux@gmail.com> wrote: > > However, I gave up this idea because I realized it would require > > retrieving a whole bundle of catalog information that isn't needed > > until after the RelOptInfos are built, such as max_attr, pages, > > tuples, reltablespace, parallel_workers, extended statistics, etc. > Why is that bad? I mean, if we're going to need that information > anyway, then gathering it at earlier stage doesn't hurt. Of course, if > we move it too early, say before partition pruning, then we might > gather information we don't really need and hurt performance. But > otherwise it doesn't seem to hurt anything. The attnotnull catalog information being discussed here is intended for use during constant folding (and possibly sublink pull-up), which occurs long before partition pruning. Am I missing something? Additionally, I'm doubtful that the collection of relhassubclass can be moved after partition pruning. How can we determine whether a relation is inheritable without retrieving its relhassubclass information? As for attgenerated, I also doubt that it can be retrieved after partition pruning. It is used to expand virtual generated columns, which can result in new PlaceHolderVars. This means it must be done before deconstruct_jointree, as make_outerjoininfo requires all active PlaceHolderVars to be present in root->placeholder_list. If these pieces of information cannot be retrieved after partition pruning, and for performance reasons we don't want to move the gathering of the majority of the catalog information before partition pruning, then it seems to me that moving get_relation_info() to an earlier stage might not be very meaningful. What do you think? > > It could be argued that the separation of catalog information > > collection isn't very great, but it seems this isn't something new in > > this patch. So I respectfully disagree with your statement that "all > > the information that the planner needs about a particular relation is > > retrieved by get_relation_info()", and that "there's now just exactly > > this 1 thing that is done in a different place". For instance, > > relhassubclass and attgenerated are retrieved before expression > > preprocessing, a relation's constraint expressions are retrieved when > > setting the relation's size estimates, and more. > Nonetheless I think we ought to be trying to consolidate more things > into get_relation_info(), not disperse some of the things that are > there to other places. I agree with the general idea of more consolidation and less dispersion, but squashing all information collection into get_relation_info() seems quite challenging. However, I think we can make at least one optimization, as I mentioned upthread — retrieving the small portion of catalog information needed at an early stage in one place instead of three. Perhaps we could start with moving the retrieval of relhassubclass into collect_relation_attrs() and rename this function to better reflect this change. Thanks Richard
On Tue, Apr 1, 2025 at 10:14 PM Richard Guo <guofenglinux@gmail.com> wrote: > The attnotnull catalog information being discussed here is intended > for use during constant folding (and possibly sublink pull-up), which > occurs long before partition pruning. Am I missing something? Hmm, OK, so you think that we need to gather this information early, so that we can do constant folding correctly, but you don't want to gather everything that get_relation_info() does at this stage, because then we're doing extra work on partitions that might later be pruned. Is that correct? > Additionally, I'm doubtful that the collection of relhassubclass can > be moved after partition pruning. How can we determine whether a > relation is inheritable without retrieving its relhassubclass > information? We can't -- but notice that we open the relation before fetching relhassubclass, and then pass down the Relation object to where get_relation_info() is ultimately called, so that we do not repeatedly open and close the Relation. I don't know if I can say exactly what's going to go wrong if we add an extra table_open()/table_close() as you do in the patch, but I've seen enough performance and correctness problems with such code over the years to make me skeptical. -- Robert Haas EDB: http://www.enterprisedb.com