Joe Conway <mail@joeconway.com> wrote:
> "IN (subselect)" is notoriously slow (in fact it is an FAQ). Can you
> rewrite this as:
>
...
Stephan Szabo <sszabo@megazone23.bigpanda.com> wrote:
> Per FAQ suggestion, try something like
...
Thanks alot, below are the results on your suggestions, quite an
dramatic differance (but this is another box, faster, and running 7.3b2
so the 45 minutes doesn't hold here, but it took more than 10 minutes
before i stopped the original query).
Is this an todo item, or should every user figure this out (yeah i know
i should have read the FAQ when it went so totally bad).
The NOT IN it seems quite natural here, but then again, i don't think as
the db as you do :)
mag=> \timing
Timing is on.
mag=> explain analyze select count(gid) from bs where not exists (
select * from z2test where z2test.x=bs.gid );Aggregate (cost=129182.18..129182.18 rows=1 width=9) (actual
time=590.90..590.90 rows=1 loops=1) -> Seq Scan on bs (cost=0.00..129150.46 rows=12688 width=9) (actual
time=42.57..590.46 rows=524 loops=1) Filter: (NOT (subplan)) SubPlan -> Index Scan using
z2temp_x_idxon z2test (cost=0.00..5.07
rows=1 width=9) (actual time=0.02..0.02 rows=1 loops=25376) Index Cond: (x = $0)Total runtime: 591.01
msec
Time: 592.25 ms
mag=> EXPLAIN analyze select count(b.gid) from bs b left join z2test z
on z.x = b.gid where z.x IS NULL;Aggregate (cost=1703.65..1703.65 rows=1 width=18) (actual
time=370.31..370.31 rows=1 loops=1) -> Hash Join (cost=346.61..1640.21 rows=25376 width=18) (actual
time=75.45..369.91 rows=524 loops=1) Hash Cond: ("outer".gid = "inner".x) Filter: ("inner".x IS NULL)
-> Seq Scan on bs b (cost=0.00..595.76 rows=25376 width=9)
(actual time=0.01..34.20 rows=25376 loops=1) -> Hash (cost=298.29..298.29 rows=19329 width=9) (actual
time=43.82..43.82 rows=0 loops=1) -> Seq Scan on z2test z (cost=0.00..298.29 rows=19329
width=9) (actual time=0.02..22.69 rows=19329 loops=1)Total runtime: 370.42 msec
Time: 371.90 ms
mag=>
Magnus