Hi List,
Note beforehand: this question is a result of a stack-exchange that can be seen here:
I'm often using the WHERE clause random() > 0.5
to pick a random subset of my data. Now I noticed that when using a set-returning function in a sub-query, I either get the whole set or none (meaning that the WHERE random() > 0.5 clause is interpreted before the set is being generated). e.g.:
SELECT num
FROM ( SELECT unnest(Array[1,2,3,4,5,6,7,8,9,10]) num
) AS foo
WHERE random() > 0.5;
This seems inconsistent because the following query does take the whole set into account:
SELECT num
FROM ( SELECT * FROM unnest(Array[1,2,3,4,5,6,7,8,9,10]) num
) AS foo
WHERE random() > 0.5;
So does this one:
WITH foo AS ( SELECT unnest(Array[1,2,3,4,5,6,7,8,9,10]) num
)
SELECT num
FROM foo
WHERE random() > 0.5;
Could anyone reflect on the seeming inconsistency here? I do understand that the planner sees the queries quite different (as can be seen from an EXPLAIN) but I don't understand the rationale behind it.
Notes:
couldn't find another function to test apart from random(), but likely there is some
I tested with generate_series and as well
My real use case works with postgis and pgpointcloud where a range of set-returning functions is used in this manner
Thanks,
Tom