Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series) - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)
Date
Msg-id CAKU4AWq9NmyufZuUWW=zbmhgOytq-R4KtoB6nC31rsMtNsr0ew@mail.gmail.com
Whole thread Raw
In response to Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)  (Andy Fan <zhihui.fan1213@gmail.com>)
Responses Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)  (Andy Fan <zhihui.fan1213@gmail.com>)
List pgsql-hackers
However the thing becomes complex with the below 2 cases.

1. SELECT * FROM t INNER JOIN j on t.nullable = q.b;
We know t.b will be not null **finally**. But the current plan may something
like this:

                QUERY PLAN
------------------------------------------
 Merge Join
   Merge Cond: (t.nullable = j.something)
   ->  Sort
         Sort Key: t.nullable
         ->  Seq Scan on t
   ->  Sort
         Sort Key: j.something
         ->  Seq Scan on j
(8 rows)

which means the Path "Seq Scan on t" still contains some null values. At least,
we should not assume t.nullable is "not nullable" the base relation stage.

2. SELECT t.a FROM j LEFT JOIN t ON t.b = t.a;
Even the t.a is not null by definition, but it may have null **finally** due to
the outer join.

The above 2 cases have been addressed by defining the notnullattrs on
every RelOptInfo, and maintaining them on every join. However,  per offline
discussion with David, IIUC,  there is a more case to think about. 

CREATE TABLE t (a INT, b INT);
SELECT * FROM t WHERE a = 1 and b = 2;

We know b is not null after we evaluate the qual b = 2,  but it may still
nullable when we just evaluate a = 1;  

I prefer to not handle it by saying the semantics of notnullattrs is correct
after we evaluate all the quals on its RelOptInfo. 

 
It would be good to agree on the correct representation for Vars that
cannot produce NULLs so that we don't shut the door on classes of
optimisation that require something other than what you need for your
case.


Looks we have to maintain not null on the general RelOptInfo level rather than Base
RelOptInfo.  But I don't want to teach Var about the notnull so far.  The reasons are: 1).
We need to maintain the Planner version and Parser version due to the VIEW case. 
2). We have to ignore the extra part  for equal(Var, Var) . 3). Var is usually shared among
different RelOptInfo. which means we have to maintain different copies for this purpose IIUC.

I assume we want to know if a Var is nullable with a function like. 
is_var_notnullable(Var *var,  Relids relids).  If so, we can define the data as below:

struct RelOptInfo {

Bitmapset** notnullattrs;
..
}; 

After this we can implement the function as: 

/*
 * is_var_notnullable
 *   Check if the var is nullable for a given RelOptIno after
 * all the quals on it have been evaluated. 
 * 
 * var is the var to check,  relids is the ids of a RelOptInfo
 * we will check on. 
 */
bool
is_var_notnullable(Var* var, Relids relids)
{
  RelOptInfo *rel = find_rel_by_relids(reldis);
  return bms_is_member(var->varattno, rel->notnullattrs[var->varno]);
}

Do you think this is a reasonable solution? 

 
bool
is_var_notnullable(Var* var, Relids relids)
{
  RelOptInfo *rel = find_rel_by_relids(reldis);
  return bms_is_member(var->varattno, rel->notnullattrs[var->varno]);
}

Probably we can make some hackers to reduce the notnullattrs's memory usage
overhead.

--
Best Regards

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Next
From: James Coleman
Date:
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays