Re: High cost of ... where ... not in (select ...) - Mailing list pgsql-performance

From Robert Haas
Subject Re: High cost of ... where ... not in (select ...)
Date
Msg-id 603c8f070906161730o4b577483o2a90c73549c3990e@mail.gmail.com
Whole thread Raw
In response to Re: High cost of ... where ... not in (select ...)  (Aaron Turner <synfinatic@gmail.com>)
Responses Re: High cost of ... where ... not in (select ...)
List pgsql-performance
On Tue, Jun 16, 2009 at 7:39 PM, Aaron Turner<synfinatic@gmail.com> wrote:
> On Tue, Jun 16, 2009 at 2:37 PM, Alvaro
> Herrera<alvherre@commandprompt.com> wrote:
>> Aaron Turner escribió:
>>> I'm trying to figure out how to optimize this query (yes, I ran vacuum/analyze):
>>>
>>> musecurity=# explain DELETE FROM muapp.pcap_store WHERE pcap_storeid
>>> NOT IN (SELECT pcap_storeid FROM muapp.pcap_store_log);
>>
>> What PG version is this?
>
> Doh, just realized I didn't reply back to list.   It's version 8.3.3.
>
> Also, pcap_storeid is unique in pcap_store_log

Speaking as one who has dealt with this frustration more than once,
you can typically get better performance with something like:

DELETE FROM muapp.pcap_store AS x
FROM muapp.pcap_store a
LEFT JOIN muapp.pcap_store_log b ON a.pcap_store_id = b.pcap_storeid
WHERE x.pcap_storeid = a.pcap_storeid AND b.pcap_storeid IS NULL

This is emphatically lame, but there you have it.  It's first of all
lame that we can't do a better job optimizing NOT-IN, at least when
the expression within the subselect is known to be not-null, and it's
secondly lame that the syntax of DELETE doesn't permit a LEFT JOIN
without a self-JOIN.

</rant>

...Robert

pgsql-performance by date:

Previous
From: Aaron Turner
Date:
Subject: Re: High cost of ... where ... not in (select ...)
Next
From: Aaron Turner
Date:
Subject: Re: High cost of ... where ... not in (select ...)