Thread: RE: Evaluating Subselects

RE: Evaluating Subselects

From
Jeff Eckermann
Date:
This would be acceptable to both PG and MS databases:

select p.* 
from purchasemaster as p inner join itemmaster as i on p.transactionid =
i.transactionid
where i.itempriority = 2;

> -----Original Message-----
> From:    Josh Berkus [SMTP:josh@agliodbs.com]
> Sent:    Friday, June 29, 2001 10:03 AM
> To:    Kapil Tilwani; pgsql-sql@postgresql.org
> Subject:    Evaluating Subselects
> 
> Kapil,
> 
> > I have a query like
> > select * from purchasemaster where transactionid in (select
> > transactionid from itemmaster where itempriority = 2)
> 
> That's a fine query.  You could even take it to meet your relatives.
> :-)
> 
> > ****I need this query for MS-Access, MS-SQL and Postgres.  Because
> > the application is supposed to be such that for upto 2-3 users, the
> > application would be running on MS-Access or MSDE, while for heavier
> > databases i.e., greater than 4 , the ideal database would be Postgres
> 
> 1. Why?  In what way is MS Access or MSDE better for a 2-3 user
> database?  You may wish to re-evaluate your development strategy.
> 
> 2. You're into a world of hurt, my friend.  MSDE has a number of SQL
> compliance lapses; MS Access is not SQL92 compliant at all, except for
> the simplest SELECT queries. (I say this as someone who gets paid to
> develop MS Access/VB)
>  
> > Only the problem is ...  (correct me if i am wrong, PLEEEEEEEEEEEASE)
> > that for each tuple in purchasemaster the sub-query is re-evaluated
> > (True or False????).  What I am looking at is whether it is possible
> > to execute the query in the sub-query just once, get its value as in
> > "(1,2,5,23,7,90)", etc. in place of the sub-query and accordingly,
> > execute the main query.
> 
> Depends on which database you're talking about.  PostgreSQL, and, in
> theory, MSDE, will only evaluate the IN() expression once (it's
> correlated sub-selects that get evaluated multiple times -- and you
> can't do these in Access at all).
> 
> Access, however, lacks a temporary table space.  SInce it has to
> evaluate the sub-select entirely in RAM, it evaluates it for every row
> in the main table, unless both tables are quite small.  As a result,
> IN() queries run like molasses in Access.  And Access doesn't support
> any alternate subselect structures at all.
> 
> This does bring up an interesting question for Tom and Stephan:
> 
> Which is more efficient, under what circumstances?
> 
> 1. SELECT a.* FROM a WHERE a.1 IN (SELECT b.1 FROM b WHERE b.2 = 0);
> 
> 2. SELECT a.* FROM a WHERE EXISTS (
>     SELECT b.1 FROM b WHERE b.2 = 0 AND b.1 = a.1);
> 
> 3. SELECT a.* FROM a JOIN 
>     (SELECT b.1 FROM b WHERE b.2 = 0) sub_b 
>     ON a.1 = sub_b.1;
> 
> I do a lot of #3 because it's easier to return calculated expressions
> from the sub-query that way, but am I shooting myself in the foot,
> performance-wise?
> 
> -Josh Berkus
> 
> P.S.  Kapil, you might want to consider buying O'Reilly's "SQL in a
> Nutshell" for cross-platform SQL comparisons.
> 
> ______AGLIO DATABASE SOLUTIONS___________________________
>                                        Josh Berkus
>   Complete information technology      josh@agliodbs.com
>    and data management solutions       (415) 565-7293
>   for law firms, small businesses        fax 621-2533
>     and non-profit organizations.      San Francisco
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly