Re: Allowing join removals for more join types - Mailing list pgsql-hackers

From David Rowley
Subject Re: Allowing join removals for more join types
Date
Msg-id CAApHDvp6+DL32VZWvJMnjNvB+XyB6H7zK2U2stNqrRp_yAek4g@mail.gmail.com
Whole thread Raw
In response to Re: Allowing join removals for more join types  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Allowing join removals for more join types  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, May 28, 2014 at 8:39 PM, David Rowley <dgrowleyml@gmail.com> wrote:
I've been working on adding join removal for join types other than left outer joins. 

The attached patch allows join removals for both sub queries with left joins and also semi joins where a foreign key can prove the existence of the record.

My longer term plan is to include inner joins too, but now that I have something to show for semi joins, I thought this would be a good time to post the patch just in case anyone can see any show stopper's with using foreign keys this way.

So with the attached you can do:

CREATE TABLE b (id INT NOT NULL PRIMARY KEY);
CREATE TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT NOT NULL REFERENCES b(id));

EXPLAIN (COSTS OFF)
SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
  QUERY PLAN
---------------
 Seq Scan on a
(1 row)

I think anti joins could use the same infrastructure but I'm not quite sure yet how to go about replacing the join with something like WHERE false.

I do think semi and anti joins are a far less useful case for join removals as inner joins are, but if we're already loading the foreign key constraints at plan time, then it seems like something that might be worth while checking.

Oh, quite likely the code that loads the foreign key constraints needs more work and probably included in the rel cache, but I don't want to go and to that until I get some feedback on the work so far.

Any comments are welcome.


The attached patch fixes a problem with SEMI join removal where I was missing adding a WHERE col IS NOT NULL check after a successful join removal. This filter is required to keep the query equivalent as the semi join would have filtered out the rows with a NULL join condition columns on the left side of the join.

In the attached I've also added support for ANTI joins, where the join can be removed it is replaced with a WHERE col IS NULL on the relation on the left side of the join. This is required as the only possible columns that could have matched would be NULL valued columns that are part of the foreign key.
 
I'm not quite there with inner joins yet. I'm still getting my head around just where the join quals are actually stored.

This area of the code is quite new to me, so I'm not quite sure I'm going about things in the correct way.
To make my intentions clean with this patch I've marked the file name with WIP.

Comments are welcome.

Regards

David Rowley 
Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: pg_sleep() doesn't work well with recovery conflict interrupts.
Next
From: Robert Haas
Date:
Subject: Re: Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?