> On Tue, 21 Sep 1999, Tom Lane wrote:
>
> > Daniel Lopez <ridruejo@atm9.com.dtu.dk> writes:
> > > $list = select d from c
> > > select b from a where b in ( $list )
> > > is 5 seconds
> >
> > > select b from a where b in (select d from c)
> > > is 3 minutes!! (although it should be at least as fast as a)!
> >
> > Not necessarily. Your first example is depending on the fact that
> > the "list" (number of values selected from c) is short. Try it with
> > 10000 or 100000 values from c, if you want to see the backend crash ;-)
>
> I've encoutered this sort of issue myself where I just wanted the
> sub-select to be performed once. Granted it would not work if you wanted
> to select 10000 or 100000 but what if you have a very larged table a and a
> very small table c (using the example above). As you pointed out,
> currently you're looking at 'a x c' runtime... Ugh indeed; whereas just
> executing the subselect once and cut and pasting that you have an order of
> 'a' runtime...
OK, I am jumping in here, because it seems we have some strange
behavour.
The only subselect problem I know of is that:
select b from a where b in (select d from c)
will execute the subquery just once, but will do a sequential scan for
of the subquery results for each row of 'a' looking for 'b' that is in
the set of result rows.
This is a major performance problem, one that is known, and one that
should be fixed, but I am sounding like a broken record.
The solution is to allow the subquery results to be mergejoined(sorted),
or hashjoined with the outer query.
Am I correct, or is something else going on here?
-- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610)
853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill,
Pennsylvania19026