Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series) - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)
Date
Msg-id CAKU4AWobtkZV0SNHdVJ=VSpaLZK5tEBjPWmph9ZpMt9CQyiW_g@mail.gmail.com
Whole thread Raw
In response to Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)
List pgsql-hackers
Thank you all, friends!

On Fri, Feb 12, 2021 at 9:02 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 10 Feb 2021 at 16:18, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> v1-0001-Introduce-notnullattrs-field-in-RelOptInfo-to-ind.patch
>
> Introduce notnullattrs field in RelOptInfo to indicate which attr are not null
> in current query. The not null is judged by checking pg_attribute and query's
> restrictinfo. The info is only maintained at Base RelOptInfo and Partition's
> RelOptInfo level so far.
>
>
> Any thoughts?

I'm not that happy with what exactly the definition is of
RelOptInfo.notnullattrs.

The comment for the field says:
+ /* Not null attrs, start from -FirstLowInvalidHeapAttributeNumber */

So you could expect someone to assume that these are a Bitmapset of
attnums for all columns in the relation marked as NOT NULL.  However,
that's not true since you use find_nonnullable_vars() to chase down
quals that filter out NULL values and you mark those too.


The comment might be unclear,  but the behavior is on purpose. I want
to find more cases which can make the attr NOT NULL, all of them are
useful for UniqueKey stuff.  

 
The reason I don't really like this is that it really depends where
you want to use RelOptInfo.notnullattrs.  If someone wants to use it
to optimise something before the base quals are evaluated then they
might be unhappy that they found some NULLs.


Do you mean the notnullattrs is not set correctly before the base quals are
evaluated?  I think we have lots of data structures which are set just after some
stage.  but notnullattrs is special because it is set at more than 1 stage.  However
I'm doubtful it is unacceptable, Some fields ever change their meaning at different
stages like Var->varno.  If a user has a misunderstanding on it, it probably will find it
at the testing stage.  
 
I think you either need to explain in detail what the field means or
separate out the two meanings somehow.


Agreed.   Besides the not null comes from 2 places (metadata and quals), it also
means it is based on the relation, rather than the RelTarget.  for sample:  A is not
null,  but SELECT  return_null_udf(A)  FROM t,   return_null_udf is NULL.  I think
this is not well documented as well.  How about just change the documents as:

1.  /* Not null attrs, start from -FirstLowInvalidHeapAttributeNumber, the NOT NULL
      * comes from pg_attribtes and quals at different planning stages.
      */

or

2.  /* Not null attrs, start from -FirstLowInvalidHeapAttributeNumber, the NOT NULL
      * comes from pg_attribtes and quals at different planning stages. And it just means
      * the base attr rather than RelOptInfo->reltarget. 
      */

I don't like to separate them into 2 fields because it may make the usage harder a
bit as well. 
--
Best Regards

pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: pg13.2: invalid memory alloc request size NNNN
Next
From: Andy Fan
Date:
Subject: Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)