Thread: about the subselect query

about the subselect query

From
Oskar Liljeblad
Date:
Thanks to everyone who replied to my email about a
subselect query that was rather slow (1-3mins). The query is:
  SELECT *    FROM items    WHERE package IN      (SELECT package         FROM items           WHERE ...user search
expression...        GROUP BY package)
 

The fastest query I could come up with that is equivalent
to the one above is
   SELECT DISTINCT i1.*     FROM items i1, items i2     WHERE i1.package = i2.package       AND ...user search
expression...

which finishes in about 3s. EXPLAIN gives the following info on
the query (if "search expression" is a regexp on non-indexed
column i1.performer):
 Unique  (cost=1334.61 rows=2 width=151)   ->  Sort  (cost=1334.61 rows=2 width=151)         ->  Nested Loop
(cost=1334.61rows=2 width=151)               ->  Seq Scan on items i2  (cost=1332.56 rows=1 width=12)               ->
IndexScan using items_packages on items i1        (cost=2.05 rows=12805 width=139)
 

But it is still faster if I run N+1 queries - first one SELECT
query to list packages, then another SELECT for each row in 
the result.

Oskar Liljeblad (osk@hem.passagen.se)