Re: [SQL] uncorrelated subqueries - Mailing list pgsql-sql

From Tom Lane
Subject Re: [SQL] uncorrelated subqueries
Date
Msg-id 461.931462206@sss.pgh.pa.us
Whole thread Raw
In response to uncorrelated subqueries  (Bruce Lowery <bruce.lowery@edventions.com>)
Responses Re: [SQL] uncorrelated subqueries  (Herouth Maoz <herouth@oumail.openu.ac.il>)
List pgsql-sql
Bruce Lowery <bruce.lowery@edventions.com> writes:
>   In v6.4.2 do uncorrelated subqueries get run on each iteration of the
> outer loop?

It looks like the particular example you give is done that way, but the
system does know about uncorrelated subqueries.  For example, using 6.5:

explain SELECT a FROM table1 WHERE table1.b=33 AND
table1.c  = ( SELECT d  FROM  table2 WHERE table2.e=44);
NOTICE:  QUERY PLAN:
Seq Scan on table1  (cost=43.00 rows=1 width=4) InitPlan   ->  Seq Scan on table2  (cost=43.00 rows=1 width=4)

explain SELECT a FROM table1 WHERE table1.b=33 AND
table1.c  = ( SELECT d  FROM  table2 WHERE table2.e=table1.a);
NOTICE:  QUERY PLAN:
Seq Scan on table1  (cost=43.00 rows=1 width=4) SubPlan   ->  Seq Scan on table2  (cost=43.00 rows=1 width=4)

You can see that we get an "InitPlan" (ie, run once) for an uncorrelated
subquery but a "SubPlan" (repeat each time) for a correlated one.
Unfortunately, the case you care about is:

explain SELECT a FROM table1 WHERE table1.b=33 AND
table1.c  IN ( SELECT d  FROM  table2 WHERE table2.e=44);
NOTICE:  QUERY PLAN:
Seq Scan on table1  (cost=43.00 rows=1 width=4) SubPlan   ->  Seq Scan on table2  (cost=43.00 rows=1 width=4)

The main problem that would have to be solved to convert this to
an InitPlan is what to do if the subselect returns a huge number
of tuples ... with the current implementation, since we scan the
tuples one at a time, there's no problem, but if we try to store
all the tuples we could run out of memory.
        regards, tom lane


pgsql-sql by date:

Previous
From: Bruce Lowery
Date:
Subject: uncorrelated subqueries
Next
From: "Gilson Costa"
Date:
Subject: Referential Integrity