Re: JOIN vs. WHERE ... IN (subselect) - Mailing list pgsql-sql

From Dean Gibson (DB Administrator)
Subject Re: JOIN vs. WHERE ... IN (subselect)
Date
Msg-id 5.1.0.14.2.20030517080543.02deabc8@imaps.ultimeth.net
Whole thread Raw
In response to Re: JOIN vs. WHERE ... IN (subselect)  (Chris Linstruth <cjl@QNET.COM>)
List pgsql-sql
The word 'subselect' is not a keyword but just an alias;  see the syntax under SELECT in the docs under 'SQL Commands'
(7.3.2):<br/><br /> | ( <i>select</i> ) [ AS ] <i>alias</i> [ ( <i>column_alias_list</i> ) ] <br /><br /><br /> Chris
Linstruthwrote on 2003-05-17 07:45:<br /><blockquote cite="cite" class="cite" type="cite">I was just given the
following. Not your exact query but replacing<br /> WHERE IN with JOIN ... AS SUBSELECT fixed my performance
problems.<br/> This query wnet from 15-30 seconds to under a second.<br /><br /> I have attempted to research exactly
whatMr. Arnold did, and found<br /> nothing regarding JOIN ... AS SUBSELECT in the docs.  Any pointers<br />
appreciated.<br/><br /> On Sun, 11 May 2003, Mathieu Arnold wrote:<br /> ><br /> ><br /> > --Le 11/05/2003
09:48-0700, Chris Linstruth \xe9crivait :<br /> ><br /> > |<br /> > | But this can take 15-30 seconds:<br />
>|<br /> > | SELECT count(radacctid) AS sessions,<br /> > |         sum(acctsessiontime) AS connecttime<br />
>|         FROM radacct<br /> > |         WHERE radacctid IN<br /> > |         (SELECT DISTINCT ON
(acctsessionid)radacctid FROM radacct<br /> > |           WHERE username='cjl'<br /> > |           AND
acctstoptimeIS NOT NULL<br /> > |           AND date_trunc('month', now())=date_trunc('month',<br /> > |
acctstoptime));<br/> > |<br /> > | There are probably many different ways to perform this query.  My<br /> > |
mainproblem is trying to overcome the fact that try as I might,<br /> > | I can't stop the occasional duplicate
accountingrecord from being<br /> > | inserted so I have to weed them out, hence the "DISTINCT ON<br /> > |
(acctsessionid)".<br/> ><br /> > try this :<br /> > SELECT count(radacctid) AS sessions,<br /> >        
sum(acctsessiontime)AS connecttime<br /> >         FROM radacct<br /> >              JOIN (SELECT DISTINCT ON
(acctsessionid)radacctid FROM radacct<br /> >                    WHERE username='cjl'<br /> >                   
ANDacctstoptime IS NOT NULL<br /> >                    AND date_trunc('month', now())=date_trunc('month',<br /> >
acctstoptime))AS subselect USING (radacctid);<br /> ><br /><br /> -- <br /> Chris Linstruth <cjl@qnet.com><br
/>QNET<br /> 1529 East Palmdale Blvd Suite 200<br /> Palmdale, CA 93550<br /> (661) 538-2028<br /><br /><br /> On Fri,
16May 2003, Dean Gibson (DB Administrator) wrote:<br /><br /> > Using PostgreSQL 7.3.2 on Linux.<br /> ><br />
>One of the types of queries that I like to do on our database is:<br /> ><br /> > SELECT * FROM table1<br />
>    WHERE indexed_column1 IN<br /> >     (SELECT column2 FROM table2 WHERE <condition>);<br /> ><br />
>However, in our database table1 is quite large (~1M rows), and the above query takes "forever", EVEN IF table2
CONSISTSOF ONLY ONE ROW!<br /> ><br /> > However, my third-party SQL book says that the above is equivalent
to:<br/> ><br /> > SELECT table1.* FROM table1, table2<br /> >     WHERE indexed_column1 = column2 AND
<condition>;<br/> ><br /> > And indeed, the above query executes virtually instantaneously if
"<condition>"results in a small number of rows from table2.<br /> ><br /> > I'm fairly new to SQL;  are
theyreally equivalent?  If so, is there some reason that the first form is not optimized/transformed into the second
form,or is this a planned enhancement?<br /> ><br /> > -- Dean<br /> ><br /> > ps: If indexed_column1 has
thesame name as column2, then the query:<br /> ><br /> > SELECT table1.* from table1<br /> >     JOIN table2
USING(column )<br /> >     WHERE <condition>;<br /> ><br /> > Also executes quickly.  I just like the
readabilityof the first query above, but as they say, you gotta do what works ...<br /> ><br /> ><br /> >
---------------------------(endof broadcast)---------------------------<br /> > TIP 5: Have you checked our
extensiveFAQ?<br /> ><br /> > <a eudora="autourl"
href="http://www.postgresql.org/docs/faqs/FAQ.html">http://www.postgresql.org/docs/faqs/FAQ.html</a><br/>
></blockquote>

pgsql-sql by date:

Previous
From: Chris Linstruth
Date:
Subject: Re: JOIN vs. WHERE ... IN (subselect)
Next
From: Stephan Szabo
Date:
Subject: Re: JOIN vs. WHERE ... IN (subselect)