Re: Tracking notnull attributes inside Var - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: Tracking notnull attributes inside Var |
Date | |
Msg-id | CAExHW5unn3mEBhzaoBurynFR295XGqRm6OqSfA1q5Gi+st7jCA@mail.gmail.com Whole thread Raw |
In response to | Tracking notnull attributes inside Var (Andy Fan <zhihui.fan1213@gmail.com>) |
Responses |
Re: Tracking notnull attributes inside Var
|
List | pgsql-hackers |
On Sun, May 15, 2022 at 8:41 AM Andy Fan <zhihui.fan1213@gmail.com> wrote: > > The var in RelOptInfo->reltarget should have nullable = 0 but the var in > RelOptInfo->baserestrictinfo should have nullable = 1; The beauty of this > are: a). It can distinguish the two situations perfectly b). Whenever we want > to know the nullable attribute of a Var for an expression, it is super easy to > know. In summary, we need to maintain the nullable attribute at 2 different > places. one is the before the filters are executed(baserestrictinfo, joininfo, > ec_list at least). one is after the filters are executed (RelOptInfo.reltarget > only?) Thanks for identifying this. What you have written makes sense and it might open a few optimization opportunities. But let me put down some other thoughts here. You might want to take those into consideration when designing your solution. Do we want to just track nullable and non-nullable. May be we want expand this class to nullable (var may be null), non-nullable (Var is definitely non-NULL), null (Var will be always NULL). But the other way to look at this is along the lines of equivalence classes. Equivalence classes record the expressions which are equal in the final result of the query. The equivalence class members are not equal at all the stages of query execution. But because they are equal in the final result, we can impose that restriction on the lower levels as well. Can we think of nullable in that fashion? If a Var is non-nullable in the final result, we can impose that restriction on the intermediate stages since rows with NULL values for that Var will be filtered out somewhere. Similarly we could argue for null Var. But knowledge that a Var is nullable in the final result does not impose a NULL, non-NULL restriction on the intermediate stages. If we follow this thought process, we don't need to differentiate Var at different stages in query. > > Come to JoinRel, we still need to maintain the 2 different cases as well. > > As for the joinrel.reltarget, currently it looks up the inputrel's reltarget to > get the Var, so it is easy to inherit from Var->nullable from inputrel, but > we need to consider the new changes introduced by current join, > Like new NOT nullable attributes because of join clauses OR new nullable > attributes because of outer join. Everything looks good for now. Yes, if we want to maintain nullness at different stages in the query. > > The hard part is RelOptInfo.joininfo & root->eq_classes. All of them uses > the shared RestrictInfo, and it is unclear which Var->nullable should be used in > them. To not provide a wrong answer, I think we can assume nullable=-1 (unknown) > and let the upper layer decides what to do (do we have known use cases to use > the nullable attribute here?). I think what applies to baserestrictinfo and reltarget also applies to joininfo and join's reltarget. There will be three stages - join clauses, join quals and reltarget. In EQs the Vars in RestrictInfo will come from joininfo but EQ member Vars will derive their nullable-ness from corresponding reltarget. I can be wrong though. > > More considerations about this strategy: > 1. We might use more memory for different var copies, the only known cases > RelOptInfo->reltarget for now. When a Var is copied the whole expression tree needs to be copied. That might be more memory than just copies of Var nodes. > 2. _equalVar() has more complex semantics: shall we consider nulls or not. This is interesting. It might have impact on set_plan_references and planner's ability to search and match expressions. But if we take the approach I have suggested earlier, this question will not arise. -- Best Wishes, Ashutosh Bapat
pgsql-hackers by date: