Thread: Re: Using the IN clauise

Re: Using the IN clauise

From
Date:
My post didn't go through the first time, so here goes again:
>
>
> I thought I read somewhere that it was better to avoid using
> the IN clause (at least when that means doing a subselect)
> for efficiency reasons, but I cannot find it on the website now.
>
> Does anyone know where that is?
>
> And can someone confirm the following:
>
> This:
> SELECT t1.f1
> FROM t1
> WHERE t1.f1 IN (select f1 from t2 AS t2 where t2.f1 = t1.f1
> AND t2.f2 = 'v1')
>
> is generally slower to run then:
> SELECT t1.f1
> FROM t1, t2
> WHERE t1.f1 = t2.f1
>   AND t2.f3 = 'v1'
>
> Thanks
>
> Terry Fielder
> Network Engineer
> Great Gulf Homes / Ashton Woods Homes
> terry@greatgulfhomes.com
>
>

Re: Using the IN clauise

From
Robert Treat
Date:
On Fri, 2002-10-25 at 14:40, terry@ashtonwoodshomes.com wrote:
> My post didn't go through the first time, so here goes again:
> >
> >
> > I thought I read somewhere that it was better to avoid using
> > the IN clause (at least when that means doing a subselect)
> > for efficiency reasons, but I cannot find it on the website now.
> >
> > Does anyone know where that is?
> >

Do a search on the interactive docs for IN or EXISTS, I think it's
mentioned there.  Either way it is generally recommended to use the
EXISTS grammer rather than the IN for performance reasons. When in
doubt, explain analyze is your friend.


> > And can someone confirm the following:
> >
> > This:
> > SELECT t1.f1
> > FROM t1
> > WHERE t1.f1 IN (select f1 from t2 AS t2 where t2.f1 = t1.f1
> > AND t2.f2 = 'v1')
> >
> > is generally slower to run then:
> > SELECT t1.f1
> > FROM t1, t2
> > WHERE t1.f1 = t2.f1
> >   AND t2.f3 = 'v1'
> >

I would believe it to be true, though depending on your table / data
structure I can't say 100% percent true.  Try running explain analyze on
both queries and see what you return.

> > Thanks
> >
> > Terry Fielder
> > Network Engineer
> > Great Gulf Homes / Ashton Woods Homes
> > terry@greatgulfhomes.com
> >

Robert Treat



Re: Using the IN clauise

From
terry@greatgulfhomes.com
Date:
I could not find any doc on this, so:

In case anyone is interested, I did a test, and here are my results:
I created the tables like this:
<cfloop from="1" to="1000000" index="variables.idx">
    <cfquery name="test_insert" datasource="PostgreSQL">
        INSERT INTO test_in_1 (f1, v1) VALUES ('#variables.idx#',
'#Chr(RandRange(65, 90))#')
    </cfquery>
    <cfquery name="test_insert" datasource="PostgreSQL">
        INSERT INTO test_in_2 (f1, v2) VALUES ('#variables.idx#',
'#Chr(RandRange(65, 90))#')
    </cfquery>
</cfloop>

And when I ran the queries (twice to ensure that the first one wasn't worse
because of disk being cached for the second query to run on)

test_in (Records=501, Time=6279ms)
SQL =
SELECT test_in_1.v1
    FROM   test_in_1
    WHERE  test_in_1.f1 IN (SELECT test_in_2.f1 FROM test_in_2 WHERE
test_in_2.v2 = 'A');

test_in (Records=501, Time=113ms)
SQL =
SELECT  test_in_1.v1
    FROM    test_in_1, test_in_2
    WHERE   test_in_1.f1 = test_in_2.f1
        AND test_in_2.v2 = 'A'

test_in (Records=501, Time=5739ms)
SQL =
SELECT test_in_1.v1
    FROM   test_in_1
    WHERE  test_in_1.f1 IN (SELECT test_in_2.f1 FROM test_in_2 WHERE
test_in_2.v2 = 'A');

test_in (Records=501, Time=70ms)
SQL =
SELECT  test_in_1.v1
    FROM    test_in_1, test_in_2
    WHERE   test_in_1.f1 = test_in_2.f1
        AND test_in_2.v2 = 'A'


Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com



> -----Original Message-----
> From: Robert Treat [mailto:xzilla@users.sourceforge.net]
> Sent: Friday, October 25, 2002 2:52 PM
> To: terry@ashtonwoodshomes.com
> Cc: Postgres (E-mail)
> Subject: Re: [GENERAL] Using the IN clauise
>
>
> On Fri, 2002-10-25 at 14:40, terry@ashtonwoodshomes.com wrote:
> > My post didn't go through the first time, so here goes again:
> > >
> > >
> > > I thought I read somewhere that it was better to avoid using
> > > the IN clause (at least when that means doing a subselect)
> > > for efficiency reasons, but I cannot find it on the website now.
> > >
> > > Does anyone know where that is?
> > >
>
> Do a search on the interactive docs for IN or EXISTS, I think it's
> mentioned there.  Either way it is generally recommended to use the
> EXISTS grammer rather than the IN for performance reasons. When in
> doubt, explain analyze is your friend.
>
>
> > > And can someone confirm the following:
> > >
> > > This:
> > > SELECT t1.f1
> > > FROM t1
> > > WHERE t1.f1 IN (select f1 from t2 AS t2 where t2.f1 = t1.f1
> > > AND t2.f2 = 'v1')
> > >
> > > is generally slower to run then:
> > > SELECT t1.f1
> > > FROM t1, t2
> > > WHERE t1.f1 = t2.f1
> > >   AND t2.f3 = 'v1'
> > >
>
> I would believe it to be true, though depending on your table / data
> structure I can't say 100% percent true.  Try running explain
> analyze on
> both queries and see what you return.
>
> > > Thanks
> > >
> > > Terry Fielder
> > > Network Engineer
> > > Great Gulf Homes / Ashton Woods Homes
> > > terry@greatgulfhomes.com
> > >
>
> Robert Treat
>