Home > mailing lists

Re: [SQL] Subselect performance - Mailing list pgsql-sql

From	Tom Lane
Subject	Re: [SQL] Subselect performance
Date	September 21, 1999 10:17:33
Msg-id	21555.937923332@sss.pgh.pa.us Whole thread Raw
In response to	Subselect performance (Daniel Lopez <ridruejo@atm9.com.dtu.dk>)
Responses	Re: [SQL] Subselect performance
List	pgsql-sql

Tree view

Daniel Lopez <ridruejo@atm9.com.dtu.dk> writes:
> $list = select d from c
> select b from a where b in ( $list )
> is  5 seconds

> select b from a where b in (select d from c) 
> is 3 minutes!!  (although it should be at least as fast as a)!

Not necessarily.  Your first example is depending on the fact that
the "list" (number of values selected from c) is short.  Try it with
10000 or 100000 values from c, if you want to see the backend crash ;-)

The second case works OK even if the sub-select result is large, because
it re-executes the sub-select for each row from a (essentially,
rescanning c to see if b is matched).  However, this means the runtime
is proportional to the product of the number of rows in a and c.  Ugh.

Try rewriting your query as a join:
select a.b from a, c where a.b = c.d

(you might want "select distinct" here, if b can match many rows from d).
If the system can't figure out anything better than a nested-loop join,
then it'll probably end up taking about the same amount of time, but
this form at least allows the possibility of a smarter join method.

I believe we have a TODO list item to perform this sort of
transformation automatically when the sub-select is of a form that
allows it.  We need to get the left/outer join stuff working first
in order to have an exact match of the behavior.
        regards, tom lane

pgsql-sql by date:

From: "Mike Field"
Date: 21 September 1999, 09:16:31
Subject: Search Engine with PHP

From: "Jörg R. Rudnick"
Date: 21 September 1999, 10:50:34
Subject: SQL Functions on Composite Types - what's the purpose??

Re: [SQL] Subselect performance - Mailing list pgsql-sql

Previous

Next