Re: BUG #15485: Order by of inlineable stable function incorrect - Mailing list pgsql-bugs

From Andrew Gierth
Subject Re: BUG #15485: Order by of inlineable stable function incorrect
Date
Msg-id 87tvkx2j3w.fsf@news-spur.riddles.org.uk
Whole thread Raw
In response to BUG #15485: Order by of inlineable stable function incorrect  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes:

This isn't a bug; please don't abuse the bug report system.

 PG> I am encountering a problem where the resultset is not correctly
 PG> ordered (and that results in incorrect behaviour of my
 PG> application). The ordering is done by a table function that is
 PG> referenced in the "from clause" (function "fA"). The query also
 PG> left joins various other table functions. These table functions do
 PG> not perform any ordering.

Nothing has ever guaranteed that the order of rows returned by a table
function is preserved in the query result (in practice, it is preserved
ONLY if there are no joins, no grouping, no aggregates, no window
functions, no set operations, no DISTINCT, and (obviously) no other
ORDER BY clause, but even that should not normally be relied on since it
only happens by accident).

More generally: the ONLY thing that guarantees the order of rows in a
query result is an _explicit_ ORDER BY clause _at the topmost query
level_. No ORDER BY clause anywhere else in the query counts; the
planner is completely free to reorder rows as it sees fit.

To get the result you seem to want, you need to do:

SELECT
    ...
FROM fA(1) WITH ORDINALITY AS fA
     LEFT OUTER JOIN fB(fA.c2) AS fB ON true
     LEFT OUTER JOIN fC(fB.c3) AS fC ON true
     LEFT OUTER JOIN fD(fB.c2) AS fD ON true
     LEFT OUTER JOIN fE(fD.c2) AS fE ON true
     LEFT OUTER JOIN fF(fD.c3) AS fF ON true
     LEFT OUTER JOIN fG(fA.c3) AS fG ON true
     LEFT OUTER JOIN fH(fA.c5) AS fH ON true
ORDER BY fA.ordinality;

The planner knows that the result of the fA() call is already sorted by
its ordinality column, so it will only do an explicit additional sort if
it estimates that the query will be cheaper that way.

There is one slight complication with this approach, which is that the
use of WITH ORDINALITY will block inlining of the function (see
https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions for details).
If you need to preserve inlinability of the function fA(), then instead
of using WITH ORDINALITY you will instead need to duplicate the actual
ORDER BY condition used inside the function in an ORDER BY at the top
query level (the planner will know that the function result is already
ordered this way IF the function was in fact inlined).

You can check if a FROM-clause function was inlined by looking at the
EXPLAIN output: a FunctionScan node for the function is present only if
the function was _not_ inlined.

 PG> I expect column "fA_c6" to be ordered ascending, but instead it has
 PG> an undefined order, or even a descending order.

 PG> What I have tried/discovered so far:
 PG> - The order is consistent over multiple runs

This is pure coincidence.

 PG> - The order is not ignored (as in, the order is different than a
 PG> query without any order specification would return)

This is also pure coincidence.

 PG> - If I mark the functions as "volatile", the resultset is ordered
 PG> correctly

This is also pure coincidence, caused by the fact that marking the
functions volatile prevents them from being inlined, forcing the planner
to use explicit FunctionScans and blocking various performance
optimizations that would incidentally affect the order of rows. You
still CANNOT assume that this will be true in the general case - the
planner is free to reorder the result as it chooses.

 PG> - If I remove certain columns from the "select clause", the
 PG> resultset is ordered correctly. I have not found any pattern, it is
 PG> not like one specific column causes the behaviour.

This is also pure coincidence.

 PG> - If I only execute "fA", the resultset is ordered correctly

If you do a simple "select * from fA(1)" then the planner simply has
nothing else to add to the query plan after constructing the function
call (whether as a FunctionScan or an inlined query), so the result
order is coincidentally unaffected. 

 PG> The way I understand the function volatility categories, my
 PG> functions are allowed to be "stable". From the documentation I
 PG> understand that using table functions in left joins like this is
 PG> allowed.

It is allowed. A table function (any function used as if it were a table
in a FROM-clause, whether declared RETURNS SETOF/TABLE or not) can be
used in any way that a base table can be.

Why did you think that (a) function volatility had anything to do with
this, and (b) that the result order of your function would be preserved?

 PG> I am aware of the existence of "row from", but am unable to pass
 PG> arguments to the functions using that.

ROWS FROM serves a different purpose: to call multiple functions (which
can have parameters, but not parameters referring to results of other
functions in the same construct) with their results returned as if
joined by a FULL JOIN USING (ordinality) - that is, the first result row
is the first row of each function result, the second result row is the
second row of each result, and so on.

-- 
Andrew (irc:RhodiumToad)


pgsql-bugs by date:

Previous
From: Gavin Flower
Date:
Subject: Re: BUG #15485: Order by of inlineable stable function incorrect
Next
From: Jeff Janes
Date:
Subject: Re: Wrong aggregate result when sorting by a NULL value