Re: High cost of ... where ... not in (select ...) - Mailing list pgsql-performance

From Aaron Turner
Subject Re: High cost of ... where ... not in (select ...)
Date
Msg-id 1ca1c1410906161846v73e20df0p1d46126ec8ffd747@mail.gmail.com
Whole thread Raw
In response to Re: High cost of ... where ... not in (select ...)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
On Tue, Jun 16, 2009 at 6:36 PM, Robert Haas<robertmhaas@gmail.com> wrote:
> On Tue, Jun 16, 2009 at 9:23 PM, Aaron Turner<synfinatic@gmail.com> wrote:

>>  DELETE FROM muapp.pcap_store AS x
>>        USING muapp.pcap_store AS a
>>        LEFT JOIN muapp.pcap_store_log b ON a.pcap_storeid =
>> b.pcap_storeid WHERE x.pcap_storeid = a.pcap_storeid AND
>> b.pcap_storeid IS NULL;
>>
>> Is that right?
>
> Woops, yes, I think that's it.
>
> (but I don't guarantee that it won't blow up your entire universe, so
> test it carefully first)

Yeah, doing that now... taking a bit longer then I expected (took
~5min on rather slow hardware- everything is on a pair of 10K RAID1
drives), but the result seems correct.

                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Hash Join  (cost=19229.08..29478.99 rows=106492 width=6)
   Hash Cond: (x.pcap_storeid = a.pcap_storeid)
   ->  Seq Scan on pcap_store x  (cost=0.00..5617.84 rows=212984 width=10)
   ->  Hash  (cost=17533.93..17533.93 rows=106492 width=4)
         ->  Hash Left Join  (cost=6371.19..17533.93 rows=106492 width=4)
               Hash Cond: (a.pcap_storeid = b.pcap_storeid)
               Filter: (b.pcap_storeid IS NULL)
               ->  Seq Scan on pcap_store a  (cost=0.00..5617.84
rows=212984 width=4)
               ->  Hash  (cost=3099.75..3099.75 rows=205475 width=4)
                     ->  Seq Scan on pcap_store_log b
(cost=0.00..3099.75 rows=205475 width=4)

I know the costs are just relative, but I assumed
cost=19229.08..29478.99 isn't 5 minutes of effort even on crappy
hardware.  Honestly, not complaining, 5 minutes is acceptable for this
query (it's a one time thing) just surprised is all.

Thanks for the help!

--
Aaron Turner
http://synfin.net/
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin

pgsql-performance by date:

Previous
From: Robert Haas
Date:
Subject: Re: High cost of ... where ... not in (select ...)
Next
From: Aaron Turner
Date:
Subject: Re: High cost of ... where ... not in (select ...)