Home > mailing lists

Re: BUG #4949: NOT IN is prohibitive slower than the rewrite for medium to large sets - Mailing list pgsql-bugs

From	Ole Tange
Subject	Re: BUG #4949: NOT IN is prohibitive slower than the rewrite for medium to large sets
Date	July 28, 2009 21:09:08
Msg-id	ce534faa0907281357l7908a946y84380448af6899f7@mail.gmail.com Whole thread Raw
In response to	Re: BUG #4949: NOT IN is prohibitive slower than the rewrite for medium to large sets (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: BUG #4949: NOT IN is prohibitive slower than the rewrite for medium to large sets
List	pgsql-bugs

Tree view

On Tue, Jul 28, 2009 at 3:47 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> "Ole Tange" <postgresql.org@tange.dk> writes:
>> (modulo NULLs which seem to always cause problems in NOT INs).
>
>> Because it can be rewritten, NOT IN should never be much slower than the
>> rewritten solution, as PostgreSQL should simply rewrite NOT IN to the
>> above.
>
> Let's see, you understand that the rewrite violates the SQL standard
> semantics of NOT IN, but you think we should do it anyway?

Thanks for your kind reply.

Apparently my bug report was not quite clear. My rewrite example was
simply to show a way that could do a marvelous speedup on medium to
large sets. The correct dealing with NULL I am sure can be handled
just as efficiently.

As the performance of NOT IN is crippling for medium to large sets, I
suggest the way NOT IN is done should be similar to this:

SELECT foo FROM a WHERE a.key NOT IN (SELECT key FROM b);

is executed similar to this pseudo code (which deals with NULL):

CREATE TEMPORARY TABLE c AS SELECT key FROM a;
DELETE FROM c USING b WHERE c.key = b.key;
IF (SELECT count(*) FROM b WHERE b.key IS NULL LIMIT 1) = 0:
  -- there were no NULLs in b
  SELECT foo FROM a,c WHERE a.key = c.key;
ELSE
  -- there were NULLs in b, so just give an empty set back
  SELECT foo FROM a WHERE 1=2;
END IF

I know I can just rewrite my own code to do just that - and that is a
workaround for me. But the code would be more readable if I can simply
write NOT IN and expect it to perform just as well.

/Ole

pgsql-bugs by date:

From: Pavel Stehule
Date: 28 July 2009, 19:02:30
Subject: Re: fix: plpgsql: return query and dropped columns problem

From: "Jim Michaels"
Date: 28 July 2009, 22:57:53
Subject: BUG #4951: installation dir wrong for libpq compilation

Re: BUG #4949: NOT IN is prohibitive slower than the rewrite for medium to large sets - Mailing list pgsql-bugs

Previous

Next