teaching Var about NOT NULL - Mailing list pgsql-hackers

From David Rowley
Subject teaching Var about NOT NULL
Date
Msg-id CAApHDvpcj9q+7fM6Hu1ACU3W4r9aPmbvLk-x9oA_irdDCbH-YA@mail.gmail.com
Whole thread Raw
Responses Re: teaching Var about NOT NULL  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

While hacking away at implementing join removal support for ANTI-JOINs I realised that I couldn't just replace the join with a WHERE false condition... Let me explain...

With a query such as:

SELECT * FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = b.id);

Where a.b_id has a foreign key on b(id)

I'm working on a join removal patch which will turn this into:

SELECT * FROM a WHERE b_id IS NULL;

This seemed like a bit of a shame since with my test tables b_id is defined NOT NULL, but there seemed to be no way to tell if I needed to add a WHERE FALSE or a IS NULL check to the WHERE clause.

I quickly put together the attached patch which adds a "knownnotnull" bool field to Var which we can set to true when we're completely sure that the Var cannot contain any NULL values. I'm populating this from pg_attribute.attnotnull where I can and setting it to false where I can't see another way to tell for sure that nulls cannot exist.

The only use of knownnotnull that I've added to the patch is to turn a query such as:

SELECT * FROM a WHERE b_id IS NULL; 

To not scan the table, since id is defined as NOT NULL.

postgres=# alter table a alter column b_id drop not null;
ALTER TABLE
postgres=# explain select id from a where b_id is null;
                    QUERY PLAN
---------------------------------------------------
 Seq Scan on a  (cost=0.00..31.40 rows=11 width=4)
   Filter: (b_id IS NULL)
 Planning time: 0.340 ms
(3 rows)


postgres=# alter table a alter column b_id set not null;
ALTER TABLE
postgres=# explain select id from a where b_id is null;
                       QUERY PLAN
--------------------------------------------------------
 Result  (cost=0.00..31.40 rows=1 width=4)
   One-Time Filter: false
   ->  Seq Scan on a  (cost=0.00..31.40 rows=1 width=4)
 Planning time: 0.402 ms
(4 rows)

Having this extra flag could likely help optimise NOT IN(SELECT notnullcol FROM table) to allow this to become an ANTI-JOIN. It will also help join optimise join removals a little more.

The patch is just a few minutes old and there's no regression tests yet. I'd rather have some feedback before I proceed with it.

Regards

David Rowley
Attachment

pgsql-hackers by date:

Previous
From: Marc Mamin
Date:
Subject: Re: "pivot aggregation" with a patched intarray
Next
From: Euler Taveira
Date:
Subject: Re: Changeset Extraction v7.6.1