Re: NOT IN subquery optimization - Mailing list pgsql-hackers

From Li, Zheng
Subject Re: NOT IN subquery optimization
Date
Msg-id B5035CAF-AFE6-41D2-A71C-7726E462C896@amazon.com
Whole thread Raw
In response to Re: NOT IN subquery optimization  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: NOT IN subquery optimization  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
I'm totally fine with setting the target to PG13.

--
I'm interested to know how this works without testing for inner
nullability.  If any of the inner side's join exprs are NULL then no
records can match. What do you propose to work around that?
--

We still check for inner side's nullability, when it is nullable we
append a "var is NULL" to the anti join condition. So every outer
tuple is going to evaluate to true on the join condition when there
is indeed a null entry in the inner. 
Actually I think the nested loop anti join can end early in this case,
but I haven't find a way to do it properly, this may be one other reason
why we need a new join type for NOT IN.

e.g.
explain select count(*) from s where u not in (select n from l);
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Aggregate  (cost=2892.88..2892.89 rows=1 width=8)
   ->  Nested Loop Anti Join  (cost=258.87..2892.88 rows=1 width=0)
         ->  Seq Scan on s  (cost=0.00..1.11 rows=11 width=4)
         ->  Bitmap Heap Scan on l  (cost=258.87..262.88 rows=1 width=4)
               Recheck Cond: ((s.u = n) OR (n IS NULL))
               ->  BitmapOr  (cost=258.87..258.87 rows=1 width=0)
                     ->  Bitmap Index Scan on l_n  (cost=0.00..4.43 rows=1 width=0)
                           Index Cond: (s.u = n)
                     ->  Bitmap Index Scan on l_n  (cost=0.00..4.43 rows=1 width=0)
                           Index Cond: (n IS NULL)

Zheng


pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: NOT IN subquery optimization
Next
From: David Rowley
Date:
Subject: Re: NOT IN subquery optimization