Peter Eisentraut <e99re41@DoCS.UU.SE> writes:
>> I dont know too much about the PG internals, but when I used sybase,
>> it would usually execute the sub-select independently and stuff the
>> results into a temp table and then do another query, joining to the
>> results of the sub-select.
> Last time I checked PostgreSQL executes the subquery for each row.
> Apparently it must still be doing that
It did up until last Wednesday. If Brian retries his example with
current sources I think he'll see better performance. But I still
want to poke into exactly why the indexscan implementation seems so
much slower than the prior seqscan+sort implementation; that doesn't
seem right. (And if it is right, why doesn't the optimizer realize it?)
I'll get back to Brian on that.
> and I do suspect that it is right
> in the overall sense because the subquery may have side effects. Consider
> SELECT * FROM t1 WHERE id IN (select nextval('my_sequence'))
> Of course this query makes absolutely no sense whatsoever but perhaps
> there are similar ones where it does.
Interesting example. But since the tuples in t1 are not guaranteed to
be scanned in any particular order, it seems to me that a query that
has side-effects in WHERE inherently has undefined results. If we could
detect side-effect-producing expressions (which we cannot, currently,
and in general I suspect that problem is undecidable) I would argue that
we ought to reject this query. I certainly don't want to constrain the
optimizer by assuming that repeated executions of subqueries can't be
optimized away.
regards, tom lane