Undocumented Order By vs Target List Volatile Function Behavior - Mailing list pgsql-hackers

From David G. Johnston
Subject Undocumented Order By vs Target List Volatile Function Behavior
Date
Msg-id CAKFQuwZ3-XGfcS+CLTAYvPx3ARYjUxv+=YL8sOicV0nda=T5cA@mail.gmail.com
Whole thread Raw
List pgsql-hackers
Hey,

This came up today on twitter as a claimed POLA violation:

postgres=# select random(), random() order by random();
       random        |       random
---------------------+---------------------
 0.08176638503720679 | 0.08176638503720679
(1 row)

Which was explained long ago by Tom as:


The parser makes it behave equivalent to:

SELECT random() AS foo ORDER BY foo;

Which apparently extends to any column, even aliased ones, that use the same expression:

postgres=# select random() as foo, random() as foo2 order by foo;
        foo         |        foo2
--------------------+--------------------
 0.7334292196943459 | 0.7334292196943459
(1 row)

The documentation does say:

"A query using a volatile function will re-evaluate the function at every row where its value is needed."


That sentence is insufficient to explain why, without the order by, the system chooses to evaluate random() twice, while with order by it does so only once.

I propose extending the existing ORDER BY paragraph in the SELECT Command Reference as follows:

"A limitation of this feature is that an ORDER BY clause applying to the result of a UNION, INTERSECT, or EXCEPT clause can only specify an output column name or number, not an expression."

Add:

A side-effect of this feature is that ORDER BY expressions containing volatile functions will execute the volatile function only once for the entire row; thus any column expressions using the same function will reuse the same function result.  By way of example, note the output differences for the following two queries:

postgres=# select random() as foo, random()*1 as foo2 from generate_series(1,2) order by foo;
        foo         |        foo2
--------------------+--------------------
 0.2631492904302788 | 0.2631492904302788
 0.9019166692448664 | 0.9019166692448664
(2 rows)

postgres=# select random() as foo, random() as foo2 from generate_series(1,2);
        foo         |        foo2
--------------------+--------------------
 0.7763978178239725 | 0.3569212477832773
 0.7360531822096732 | 0.7028952103643864
(2 rows)

David J.

pgsql-hackers by date:

Previous
From: "Jonathan S. Katz"
Date:
Subject: Re: Handle infinite recursion in logical replication setup
Next
From: Robert Haas
Date:
Subject: Re: System column support for partitioned tables using heap