On Tue, 21 Sep 1999, Bruce Momjian wrote:
> OK, I am jumping in here, because it seems we have some strange
> behaviour.
>
> The only subselect problem I know of is that:
>
> select b from a where b in (select d from c)
>
> will execute the subquery just once, but will do a sequential scan
> of the subquery results for each row of 'a' looking for 'b' that is in
> the set of result rows.
Oh, OK, that's very possible. I was always under the impression (very
possibly misguided) that the reason an "in (select...)" took a long time
was that the sub-select was actually executed for every row in 'a', so
that you ended up doing:
    1 x sequential scan of a
    a x select on c (one sub-select per row of 'a')
whereas if you ran the sub-select independently and cut-and-pasted the
resulting set into the "in (...)", you were in fact just doing:
    1 x sequential scan of a (each row checked against loads of OR conditions)
thereby saving the "a x select" time.
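To make the difference concrete, here is a rough Python sketch. It is only a
toy model and not PostgreSQL code: the function names, the sample lists and
the idea of counting comparisons are all made up for illustration. It
compares scanning the subquery result once per row of 'a' (the behaviour
Bruce describes) with probing a hashed copy of that result.

```python
# Toy model only: counts row comparisons, not real executor cost.

def in_by_rescanning(a_rows, subquery_rows):
    """Subquery result is computed once, then scanned linearly per row of a."""
    comparisons = 0
    matches = []
    for b in a_rows:
        for d in subquery_rows:
            comparisons += 1
            if b == d:
                matches.append(b)
                break  # IN only needs one match
    return matches, comparisons


def in_by_hashing(a_rows, subquery_rows):
    """Subquery result is hashed once, then probed once per row of a."""
    lookup = set(subquery_rows)  # build phase: one pass over the result
    probes = 0
    matches = []
    for b in a_rows:
        probes += 1  # one hash probe per outer row
        if b in lookup:
            matches.append(b)
    return matches, probes


# Hypothetical data: 1000 rows in 'a', 1000 even values from the subquery.
a = list(range(1000))
c = list(range(0, 2000, 2))

scan_matches, scan_cost = in_by_rescanning(a, c)
hash_matches, hash_cost = in_by_hashing(a, c)
print(len(scan_matches), scan_cost, hash_cost)
```

Both functions return the same rows; only the amount of work differs, and
the gap widens as either table grows.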
Bruce, I apologise if I've completely misunderstood what's going on and
your e-mail was all about correcting me. I don't have a good grasp
of seq-scans vs. (nested-)joins vs. hash joins vs. mergejoins etc.
(although any pointers on where to get a crash course in these would be
greatly appreciated).
> This is a major performance problem, one that is known, and one that
> should be fixed, but I am sounding like a broken record.
Yeah, again, I apologise if this has been discussed to death in the past and
I missed it all (or it went over my head ;) )
> The solution is to allow the subquery results to be mergejoined (sorted),
> or hashjoined with the outer query.
erm...
> Am I correct, or is something else going on here?
most probably correct... :)
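For anyone else following along, the "mergejoined (sorted)" idea can be
sketched roughly as below. This is a Python illustration under my own
assumptions (the function name and sample values are invented), not how the
PostgreSQL executor is written: both inputs are sorted, then walked in step
so each side is read only once.

```python
def in_by_merging(a_rows, subquery_rows):
    """Return rows of a whose value appears in the subquery result.

    Sort both sides, then advance through them together. Note the result
    comes back in sorted order, not in the original order of a_rows.
    """
    outer = sorted(a_rows)
    inner = sorted(set(subquery_rows))  # duplicates do not matter for IN
    matches = []
    i = 0
    for b in outer:
        # Skip inner values that are too small to match b or anything after it.
        while i < len(inner) and inner[i] < b:
            i += 1
        if i < len(inner) and inner[i] == b:
            matches.append(b)
    return matches


print(in_by_merging([3, 1, 2, 3, 5], [3, 3, 1, 7]))
```

The cost is dominated by the two sorts, after which the merge itself is a
single pass over each input.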
regards,
S.