On 23 Feb 2002, Andrew McMillan wrote:
> It will eventually return something. My guess is that you probably have
> a lot of records in one or both tables.
>
> Do a "VACUUM ANALYZE" and then take a look at the EXPLAIN ... output for
> the three queries to get a real example of the differences in execution
> plans. Analysis for queries with sub-plans is more complex than the
> normal case, however.
>
> What your query would have been doing (I think) is running the subselect
> for every row in the transactions_detail table. That would probably be
> about the worst possible case you can imagine, hence the bad query time.
>
> Check the archives of -hackers to see more information about problems
> with making IN (...) use an index.
>
>
> > Thanks, #2 worked as I should have tried something like that earlier.
> > Still a bit rusty with my sql queries ...
>
> We've all been there... :-)
>
> Did my first suggestion not work at all? I think that (in this case) #2
> is probably the most efficient, but it's worth understanding the EXISTS
> syntax as you can generally turn an IN (...) into EXISTS (...) and
> sometimes it _is_ the best way.
>
> Cheers,
> Andrew.
>
yes, the exists statement worked but much slower. Both tables are quite
small (~2000 entries) so the delay was surprising.
I assume that's what this result corresponds to:
explain select transaction_id from transactions_detail
where transaction_id IN
(select transaction_id from transactions where enter_date>cast('2002-02-20' as date));
NOTICE: QUERY PLAN:
Seq Scan on transactions_detail (cost=0.00..84701.18 rows=2062 width=2)
SubPlan
-> Seq Scan on transactions (cost=0.00..82.11 rows=625 width=4)
EXPLAIN
So it's doing a sequential scan on the trans_detail against the result of the
SubPlan. ouch.
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Doug Silver
Network Manager
Quantified Systems, Inc
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~