Home > mailing lists

Re: Tracking notnull attributes inside Var - Mailing list pgsql-hackers

From	Andy Fan
Subject	Re: Tracking notnull attributes inside Var
Date	May 20, 2022 08:18:32
Msg-id	CAKU4AWr=0rm=GoALn3dSwYT7zKFE0VHSt-sAnL1ECCPBV16MQQ@mail.gmail.com Whole thread Raw
In response to	Re: Tracking notnull attributes inside Var (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List	pgsql-hackers

Tree view

Hi Ashutosh:

Nice to see you again!

On Tue, May 17, 2022 at 8:50 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:

On Sun, May 15, 2022 at 8:41 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:

>
> The var in RelOptInfo->reltarget should have nullable = 0 but the var in
> RelOptInfo->baserestrictinfo should have nullable = 1; The beauty of this
> are: a). It can distinguish the two situations perfectly b). Whenever we want
> to know the nullable attribute of a Var for an expression, it is super easy to
> know. In summary, we need to maintain the nullable attribute at 2 different
> places. one is the before the filters are executed(baserestrictinfo, joininfo,
> ec_list at least). one is after the filters are executed (RelOptInfo.reltarget
> only?)

Thanks for identifying this. What you have written makes sense and it
might open a few optimization opportunities. But let me put down some
other thoughts here. You might want to take those into consideration
when designing your solution.

Thanks.

Do we want to just track nullable and non-nullable. May be we want
expand this class to nullable (var may be null), non-nullable (Var is
definitely non-NULL), null (Var will be always NULL).

Currently it doesn't support "Var will be always NULL" . Do you have any

use cases for this? and I can't think of too many cases where we can get

such information except something like "SELECT a FROM t WHERE a

IS NULL".

But the other way to look at this is along the lines of equivalence
classes. Equivalence classes record the expressions which are equal in
the final result of the query. The equivalence class members are not
equal at all the stages of query execution. But because they are
equal in the final result, we can impose that restriction on the lower
levels as well. Can we think of nullable in that fashion? If a Var is
non-nullable in the final result, we can impose that restriction on
the intermediate stages since rows with NULL values for that Var will
be filtered out somewhere. Similarly we could argue for null Var. But
knowledge that a Var is nullable in the final result does not impose a
NULL, non-NULL restriction on the intermediate stages. If we follow
this thought process, we don't need to differentiate Var at different
stages in query.

I agree this is an option. If so we need to track it under the PlannerInfo

struct but it would not be as fine-grained as my previous. Without

intermediate information, We can't know if a UnqiueKey contains multiple

NULLs, this would not be an issue for the "MARK Distinct as no-op" case,

but I'm not sure it is OK for other UniqueKey user cases. So my current idea

is I still prefer to maintain the intermediate information, unless we are sure it

costs too much or it is too complex to implement which I don't think so for now

at least. So if you have time to look at the attached patch, that would be super

great as well.

Best Regards

Andy Fan

pgsql-hackers by date:

From: Andy Fan
Date: 20 May 2022, 07:42:41
Subject: Re: Tracking notnull attributes inside Var

From: Tom Lane
Date: 20 May 2022, 08:25:10
Subject: Re: 15beta1 test failure on mips in isolation/expected/stats

Re: Tracking notnull attributes inside Var - Mailing list pgsql-hackers

Previous

Next