Re: Patch to support SEMI and ANTI join removal - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Patch to support SEMI and ANTI join removal
Date
Msg-id 53FC8E69.9060301@vmware.com
Whole thread Raw
In response to Re: Patch to support SEMI and ANTI join removal  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Patch to support SEMI and ANTI join removal
Re: Patch to support SEMI and ANTI join removal
List pgsql-hackers
On 08/26/2014 03:28 PM, David Rowley wrote:
> Any ideas or feedback on this would be welcome

Before someone spends time reviewing this patch, are you sure this is 
worth the effort? It seems like very narrow use case to me. I understand 
removing LEFT and INNER joins, but the case for SEMI and ANTI joins 
seems a lot thinner. Unnecessary LEFT and INNER joins can easily creep 
into a query when views are used, for example, but I can't imagine that 
happening for a SEMI or ANTI join. Maybe I'm lacking imagination. If 
someone has run into a query in the wild that would benefit from this, 
please raise your hand.

If I understood correctly, you're planning to work on INNER join removal 
too. How much of the code in this patch is also required for INNER join 
removal, and how much is specific to SEMI and ANTI joins?

Just so everyone is on the same page on what kind of queries this helps 
with, here are some examples from the added regression tests:

> -- Test join removals for semi and anti joins
> CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, val INT);
> CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
> -- should remove semi join to b
> EXPLAIN (COSTS OFF)
> SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
>           QUERY PLAN
> ------------------------------
>  Seq Scan on a
>    Filter: (b_id IS NOT NULL)
> (2 rows)
>
> -- should remove semi join to b
> EXPLAIN (COSTS OFF)
> SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
>           QUERY PLAN
> ------------------------------
>  Seq Scan on a
>    Filter: (b_id IS NOT NULL)
> (2 rows)
>
> -- should remove anti join to b
> EXPLAIN (COSTS OFF)
> SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
>         QUERY PLAN
> --------------------------
>  Seq Scan on a
>    Filter: (b_id IS NULL)
> (2 rows)

- Heikki



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: a7ae1dc has broken the windows builds
Next
From: Alvaro Herrera
Date:
Subject: Re: replicating DROP commands across servers