Re: NOT IN subquery optimization - Mailing list pgsql-hackers

From David Rowley
Subject Re: NOT IN subquery optimization
Date
Msg-id CAKJS1f8q4S+5Z7WSRDWJd__SwqMr12JdWKXTDo35ptzneRvZnw@mail.gmail.com
Whole thread Raw
In response to Re: NOT IN subquery optimization  ("Li, Zheng" <zhelli@amazon.com>)
Responses Re: NOT IN subquery optimization
List pgsql-hackers
On Sat, 2 Mar 2019 at 12:39, Li, Zheng <zhelli@amazon.com> wrote:
> However, if s.a is nullable, we would do this transformation:
>     select count(*) from big b where not exists(select 1 from small s
>     where s.a = b.a or s.a is null);

I understand you're keen to make this work, but you're assuming again
that forcing the planner into a nested loop plan is going to be a win
over the current behaviour. It may well be in some cases, but it's
very simple to show cases where it's a significant regression.

Using the same tables from earlier, and again with master:

alter table small alter column a drop not null;
select * from big where a not in(select a from small);
Time: 430.283 ms

Here's what you're proposing:

select * from big b where not exists(select 1 from small s where s.a =
b.a or s.a is null);
Time: 37419.646 ms (00:37.420)

about 80 times slower. Making "small" a little less small would likely
see that gap grow even further.

I think you're fighting a losing battle here with adding OR quals to
the join condition. This transformation happens so early in planning
that you really can't cost it out either.  I think the only way that
could be made to work satisfactorily would be with some execution
level support for it.  Short of that, you're left with just adding
checks that either side of the join cannot produce NULL values...
That's what I've proposed in [1].

[1] https://www.postgresql.org/message-id/CAKJS1f_OA5VeZx8A8H8mkj3uqEgOtmHBGCUA6%2BxqgmUJ6JQURw%40mail.gmail.com

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: VACUUM can finish an interrupted nbtree page split -- is that okay?
Next
From: Michael Paquier
Date:
Subject: Re: Tighten error control for OpenTransientFile/CloseTransientFile