Home > mailing lists

Evaluating Subselects - Mailing list pgsql-sql

From	Josh Berkus
Subject	Evaluating Subselects
Date	June 29, 2001 11:34:20
Msg-id	web-79070@davinci.ethosmedia.com Whole thread Raw
In response to	("Kapil Tilwani" <karan_pg_2@yahoo.com>)
List	pgsql-sql

Tree view

Kapil,

> I have a query like
> select * from purchasemaster where transactionid in (select
> transactionid from itemmaster where itempriority = 2)

That's a fine query.  You could even take it to meet your relatives.
:-)

> ****I need this query for MS-Access, MS-SQL and Postgres.  Because
> the application is supposed to be such that for upto 2-3 users, the
> application would be running on MS-Access or MSDE, while for heavier
> databases i.e., greater than 4 , the ideal database would be Postgres

1. Why?  In what way is MS Access or MSDE better for a 2-3 user
database?  You may wish to re-evaluate your development strategy.

2. You're into a world of hurt, my friend.  MSDE has a number of SQL
compliance lapses; MS Access is not SQL92 compliant at all, except for
the simplest SELECT queries. (I say this as someone who gets paid to
develop MS Access/VB)
> Only the problem is ...  (correct me if i am wrong, PLEEEEEEEEEEEASE)
> that for each tuple in purchasemaster the sub-query is re-evaluated
> (True or False????).  What I am looking at is whether it is possible
> to execute the query in the sub-query just once, get its value as in
> "(1,2,5,23,7,90)", etc. in place of the sub-query and accordingly,
> execute the main query.

Depends on which database you're talking about.  PostgreSQL, and, in
theory, MSDE, will only evaluate the IN() expression once (it's
correlated sub-selects that get evaluated multiple times -- and you
can't do these in Access at all).

Access, however, lacks a temporary table space.  SInce it has to
evaluate the sub-select entirely in RAM, it evaluates it for every row
in the main table, unless both tables are quite small.  As a result,
IN() queries run like molasses in Access.  And Access doesn't support
any alternate subselect structures at all.

This does bring up an interesting question for Tom and Stephan:

Which is more efficient, under what circumstances?

1. SELECT a.* FROM a WHERE a.1 IN (SELECT b.1 FROM b WHERE b.2 = 0);

2. SELECT a.* FROM a WHERE EXISTS (SELECT b.1 FROM b WHERE b.2 = 0 AND b.1 = a.1);

3. SELECT a.* FROM a JOIN (SELECT b.1 FROM b WHERE b.2 = 0) sub_b ON a.1 = sub_b.1;

I do a lot of #3 because it's easier to return calculated expressions
from the sub-query that way, but am I shooting myself in the foot,
performance-wise?

-Josh Berkus

P.S.  Kapil, you might want to consider buying O'Reilly's "SQL in a
Nutshell" for cross-platform SQL comparisons.

______AGLIO DATABASE SOLUTIONS___________________________                                      Josh Berkus Complete
informationtechnology      josh@agliodbs.com  and data management solutions       (415) 565-7293 for law firms, small
businesses       fax 621-2533   and non-profit organizations.      San Francisco

pgsql-sql by date:

From: "Josh Berkus"
Date: 29 June 2001, 10:43:10
Subject: Re: day difference

From: Tom Lane
Date: 29 June 2001, 11:40:56
Subject: Re: NOTIFY "string" from rule

Evaluating Subselects - Mailing list pgsql-sql

Previous

Next