Re: select IN problem - Mailing list pgsql-novice

From Andrew McMillan
Subject Re: select IN problem
Date
Msg-id 1014420996.3232.2530.camel@kant.mcmillan.net.nz
Whole thread Raw
In response to Re: select IN problem  (Doug Silver <dsilver@quantified.com>)
Responses Re: select IN problem
List pgsql-novice
On Sat, 2002-02-23 at 11:15, Doug Silver wrote:
> On 23 Feb 2002, Andrew McMillan wrote:
> >
> > SELECT td.transaction_id FROM transactions_detail td
> > WHERE EXISTS (SELECT transaction_id FROM transactions t
> >    WHERE t.transaction_id = td.transaction_id
> >      AND t.enter_date > CAST('2002-02-02' AS DATE );
> >
> > Could well work better.  The problem you are likely to be encountering
> > is that IN (...) will not use an index.
> >
> > To see the query plans generated by the different SQL, use 'EXPLAIN <sql
> > command>' - it is _well_ worth coming to grips with what EXPLAIN can
> > tell you.
> >
> > You could also be better with a plan that did a simple JOIN and
> > DISTINCT:
> >
> > SELECT DISTINCT td.transaction_id
> > FROM transactions_detail td, transactions t
> > WHERE t.enter_date > '2002-02-02'
> >   AND td.transaction_id = t.transaction_id;
> >
> > Regards,
> >                     Andrew.
> >
>
> Ok, the transactions table does have an index, so that must be the problem
> there, but should it give an error or will it eventually return something?

It will eventually return something.  My guess is that you probably have
a lot of records in one or both tables.

Do a "VACUUM ANALYZE" and then take a look at the EXPLAIN ... output for
the three queries to get a real example of the differences in execution
plans.  Analysis for queries with sub-plans is more complex than the
normal case, however.

What your query would have been doing (I think) is running the subselect
for every row in the transactions_detail table.  That would probably be
about the worst possible case you can imagine, hence the bad query time.

Check the archives of -hackers to see more information about problems
with making IN (...) use an index.


> Thanks, #2 worked as I should have tried something like that earlier.
> Still a bit rusty with my sql queries ...

We've all been there... :-)

Did my first suggestion not work at all?  I think that (in this case) #2
is probably the most efficient, but it's worth understanding the EXISTS
syntax as you can generally turn an IN (...) into EXISTS (...) and
sometimes it _is_ the best way.

Cheers,
                    Andrew.
--
--------------------------------------------------------------------
Andrew @ Catalyst .Net.NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/        PHYS: Level 2, 150-154 Willis St
DDI: +64(4)916-7201    MOB: +64(21)635-694    OFFICE: +64(4)499-2267
       Are you enrolled at http://schoolreunions.co.nz/ yet?


pgsql-novice by date:

Previous
From: Doug Silver
Date:
Subject: Re: select IN problem
Next
From: Doug Silver
Date:
Subject: Re: select IN problem