Evaluating Subselects - Mailing list pgsql-sql
From | Josh Berkus |
---|---|
Subject | Evaluating Subselects |
Date | |
Msg-id | web-79070@davinci.ethosmedia.com Whole thread Raw |
In response to | ("Kapil Tilwani" <karan_pg_2@yahoo.com>) |
List | pgsql-sql |
Kapil, > I have a query like > select * from purchasemaster where transactionid in (select > transactionid from itemmaster where itempriority = 2) That's a fine query. You could even take it to meet your relatives. :-) > ****I need this query for MS-Access, MS-SQL and Postgres. Because > the application is supposed to be such that for upto 2-3 users, the > application would be running on MS-Access or MSDE, while for heavier > databases i.e., greater than 4 , the ideal database would be Postgres 1. Why? In what way is MS Access or MSDE better for a 2-3 user database? You may wish to re-evaluate your development strategy. 2. You're into a world of hurt, my friend. MSDE has a number of SQL compliance lapses; MS Access is not SQL92 compliant at all, except for the simplest SELECT queries. (I say this as someone who gets paid to develop MS Access/VB) > Only the problem is ... (correct me if i am wrong, PLEEEEEEEEEEEASE) > that for each tuple in purchasemaster the sub-query is re-evaluated > (True or False????). What I am looking at is whether it is possible > to execute the query in the sub-query just once, get its value as in > "(1,2,5,23,7,90)", etc. in place of the sub-query and accordingly, > execute the main query. Depends on which database you're talking about. PostgreSQL, and, in theory, MSDE, will only evaluate the IN() expression once (it's correlated sub-selects that get evaluated multiple times -- and you can't do these in Access at all). Access, however, lacks a temporary table space. SInce it has to evaluate the sub-select entirely in RAM, it evaluates it for every row in the main table, unless both tables are quite small. As a result, IN() queries run like molasses in Access. And Access doesn't support any alternate subselect structures at all. This does bring up an interesting question for Tom and Stephan: Which is more efficient, under what circumstances? 1. SELECT a.* FROM a WHERE a.1 IN (SELECT b.1 FROM b WHERE b.2 = 0); 2. SELECT a.* FROM a WHERE EXISTS (SELECT b.1 FROM b WHERE b.2 = 0 AND b.1 = a.1); 3. SELECT a.* FROM a JOIN (SELECT b.1 FROM b WHERE b.2 = 0) sub_b ON a.1 = sub_b.1; I do a lot of #3 because it's easier to return calculated expressions from the sub-query that way, but am I shooting myself in the foot, performance-wise? -Josh Berkus P.S. Kapil, you might want to consider buying O'Reilly's "SQL in a Nutshell" for cross-platform SQL comparisons. ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete informationtechnology josh@agliodbs.com and data management solutions (415) 565-7293 for law firms, small businesses fax 621-2533 and non-profit organizations. San Francisco