Re: JOIN vs. WHERE ... IN (subselect) - Mailing list pgsql-sql
From | Dean Gibson (DB Administrator) |
---|---|
Subject | Re: JOIN vs. WHERE ... IN (subselect) |
Date | |
Msg-id | 5.1.0.14.2.20030517080543.02deabc8@imaps.ultimeth.net Whole thread Raw |
In response to | Re: JOIN vs. WHERE ... IN (subselect) (Chris Linstruth <cjl@QNET.COM>) |
List | pgsql-sql |
The word 'subselect' is not a keyword but just an alias; see the syntax under SELECT in the docs under 'SQL Commands' (7.3.2):<br/><br /> | ( <i>select</i> ) [ AS ] <i>alias</i> [ ( <i>column_alias_list</i> ) ] <br /><br /><br /> Chris Linstruthwrote on 2003-05-17 07:45:<br /><blockquote cite="cite" class="cite" type="cite">I was just given the following. Not your exact query but replacing<br /> WHERE IN with JOIN ... AS SUBSELECT fixed my performance problems.<br/> This query wnet from 15-30 seconds to under a second.<br /><br /> I have attempted to research exactly whatMr. Arnold did, and found<br /> nothing regarding JOIN ... AS SUBSELECT in the docs. Any pointers<br /> appreciated.<br/><br /> On Sun, 11 May 2003, Mathieu Arnold wrote:<br /> ><br /> ><br /> > --Le 11/05/2003 09:48-0700, Chris Linstruth \xe9crivait :<br /> ><br /> > |<br /> > | But this can take 15-30 seconds:<br /> >|<br /> > | SELECT count(radacctid) AS sessions,<br /> > | sum(acctsessiontime) AS connecttime<br /> >| FROM radacct<br /> > | WHERE radacctid IN<br /> > | (SELECT DISTINCT ON (acctsessionid)radacctid FROM radacct<br /> > | WHERE username='cjl'<br /> > | AND acctstoptimeIS NOT NULL<br /> > | AND date_trunc('month', now())=date_trunc('month',<br /> > | acctstoptime));<br/> > |<br /> > | There are probably many different ways to perform this query. My<br /> > | mainproblem is trying to overcome the fact that try as I might,<br /> > | I can't stop the occasional duplicate accountingrecord from being<br /> > | inserted so I have to weed them out, hence the "DISTINCT ON<br /> > | (acctsessionid)".<br/> ><br /> > try this :<br /> > SELECT count(radacctid) AS sessions,<br /> > sum(acctsessiontime)AS connecttime<br /> > FROM radacct<br /> > JOIN (SELECT DISTINCT ON (acctsessionid)radacctid FROM radacct<br /> > WHERE username='cjl'<br /> > ANDacctstoptime IS NOT NULL<br /> > AND date_trunc('month', now())=date_trunc('month',<br /> > acctstoptime))AS subselect USING (radacctid);<br /> ><br /><br /> -- <br /> Chris Linstruth <cjl@qnet.com><br />QNET<br /> 1529 East Palmdale Blvd Suite 200<br /> Palmdale, CA 93550<br /> (661) 538-2028<br /><br /><br /> On Fri, 16May 2003, Dean Gibson (DB Administrator) wrote:<br /><br /> > Using PostgreSQL 7.3.2 on Linux.<br /> ><br /> >One of the types of queries that I like to do on our database is:<br /> ><br /> > SELECT * FROM table1<br /> > WHERE indexed_column1 IN<br /> > (SELECT column2 FROM table2 WHERE <condition>);<br /> ><br /> >However, in our database table1 is quite large (~1M rows), and the above query takes "forever", EVEN IF table2 CONSISTSOF ONLY ONE ROW!<br /> ><br /> > However, my third-party SQL book says that the above is equivalent to:<br/> ><br /> > SELECT table1.* FROM table1, table2<br /> > WHERE indexed_column1 = column2 AND <condition>;<br/> ><br /> > And indeed, the above query executes virtually instantaneously if "<condition>"results in a small number of rows from table2.<br /> ><br /> > I'm fairly new to SQL; are theyreally equivalent? If so, is there some reason that the first form is not optimized/transformed into the second form,or is this a planned enhancement?<br /> ><br /> > -- Dean<br /> ><br /> > ps: If indexed_column1 has thesame name as column2, then the query:<br /> ><br /> > SELECT table1.* from table1<br /> > JOIN table2 USING(column )<br /> > WHERE <condition>;<br /> ><br /> > Also executes quickly. I just like the readabilityof the first query above, but as they say, you gotta do what works ...<br /> ><br /> ><br /> > ---------------------------(endof broadcast)---------------------------<br /> > TIP 5: Have you checked our extensiveFAQ?<br /> ><br /> > <a eudora="autourl" href="http://www.postgresql.org/docs/faqs/FAQ.html">http://www.postgresql.org/docs/faqs/FAQ.html</a><br/> ></blockquote>