Thread: ARRAY(subquery) volatility

ARRAY(subquery) volatility

From

Michael Fuhr

Date:

16 August 2005, 03:04:23

Why does the first query below return the same value for each row
while the second query returns random values?  Planner optimization?

test=> SELECT ARRAY(SELECT random()) FROM generate_series(1, 5);     ?column?       
---------------------{0.269273371561092}{0.269273371561092}{0.269273371561092}{0.269273371561092}{0.269273371561092}
(5 rows)

test=> SELECT ARRAY(SELECT random() + x * 0) FROM generate_series(1, 5) AS g(x);     ?column?       
---------------------{0.826863945846848}{0.42534113182935}{0.36419924318986}{0.258920902972538}{0.843205466327819}
(5 rows)

-- 
Michael Fuhr

Re: ARRAY(subquery) volatility

From

Joe Conway

Date:

16 August 2005, 04:39:20

Michael Fuhr wrote:
> Why does the first query below return the same value for each row
> while the second query returns random values?  Planner optimization?

I assume it is due to some kind of flattening in the planner, but it is 
totally unrelated to ARRAY(subquery):

regression=# SELECT (SELECT random()) as f FROM generate_series(1, 5);         f
------------------- 0.752416231088534 0.752416231088534 0.752416231088534 0.752416231088534 0.752416231088534
(5 rows)

regression=# SELECT (SELECT random() + 0 * f) as f FROM 
generate_series(1, 5) as t(f);         f
------------------- 0.176055165555354 0.608546747178094  0.55303416240636 0.127355110425202  0.21671894063089
(5 rows)

Here's another example:

regression=# create table t1(f text);
CREATE TABLE
regression=# insert into t1 values('0');
INSERT 17366 1
regression=# insert into t1 values('0');
INSERT 17367 1
regression=# insert into t1 values('0');
INSERT 17368 1
regression=# SELECT (SELECT timeofday()::timestamp + f::interval) as f 
FROM t1;             f
---------------------------- 2005-08-16 16:38:12.738215 2005-08-16 16:38:12.738292 2005-08-16 16:38:12.738315
(3 rows)

regression=# SELECT (SELECT timeofday()::timestamp) as f FROM t1;             f
---------------------------- 2005-08-16 16:35:33.100791 2005-08-16 16:35:33.100791 2005-08-16 16:35:33.100791
(3 rows)

This shows that it isn't related to use of an SRF either.

Joe

Re: ARRAY(subquery) volatility

From

Tom Lane

Date:

16 August 2005, 10:56:08

Michael Fuhr <mike@fuhr.org> writes:
> Why does the first query below return the same value for each row
> while the second query returns random values?  Planner optimization?

> test=> SELECT ARRAY(SELECT random()) FROM generate_series(1, 5);
> test=> SELECT ARRAY(SELECT random() + x * 0) FROM generate_series(1, 5) AS g(x);

The sub-SELECT in the first one is considered an uncorrelated subquery,
so you get a plan that evaluates the subquery just once:
Function Scan on generate_series  (cost=0.01..12.51 rows=1000 width=0)  InitPlan    ->  Result  (cost=0.00..0.01 rows=1
width=0)

In the second case, x is an outer parameter to the subquery, so it has
to be re-evaluated for each row of the outer query:
Function Scan on generate_series g  (cost=0.00..32.50 rows=1000 width=4)  SubPlan    ->  Result  (cost=0.00..0.02
rows=1width=0)
 

Note the "InitPlan" vs "SubPlan" labels --- they look similar, but the
evaluation rules are totally different.

The fact that there's a volatile function in the subquery isn't
considered while making this decision.  I'm not sure if it should be.
        regards, tom lane