Re: BUG #15485: Order by of inlineable stable function incorrect - Mailing list pgsql-bugs
From | Andrew Gierth |
---|---|
Subject | Re: BUG #15485: Order by of inlineable stable function incorrect |
Date | |
Msg-id | 87tvkx2j3w.fsf@news-spur.riddles.org.uk Whole thread Raw |
In response to | BUG #15485: Order by of inlineable stable function incorrect (PG Bug reporting form <noreply@postgresql.org>) |
List | pgsql-bugs |
>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes: This isn't a bug; please don't abuse the bug report system. PG> I am encountering a problem where the resultset is not correctly PG> ordered (and that results in incorrect behaviour of my PG> application). The ordering is done by a table function that is PG> referenced in the "from clause" (function "fA"). The query also PG> left joins various other table functions. These table functions do PG> not perform any ordering. Nothing has ever guaranteed that the order of rows returned by a table function is preserved in the query result (in practice, it is preserved ONLY if there are no joins, no grouping, no aggregates, no window functions, no set operations, no DISTINCT, and (obviously) no other ORDER BY clause, but even that should not normally be relied on since it only happens by accident). More generally: the ONLY thing that guarantees the order of rows in a query result is an _explicit_ ORDER BY clause _at the topmost query level_. No ORDER BY clause anywhere else in the query counts; the planner is completely free to reorder rows as it sees fit. To get the result you seem to want, you need to do: SELECT ... FROM fA(1) WITH ORDINALITY AS fA LEFT OUTER JOIN fB(fA.c2) AS fB ON true LEFT OUTER JOIN fC(fB.c3) AS fC ON true LEFT OUTER JOIN fD(fB.c2) AS fD ON true LEFT OUTER JOIN fE(fD.c2) AS fE ON true LEFT OUTER JOIN fF(fD.c3) AS fF ON true LEFT OUTER JOIN fG(fA.c3) AS fG ON true LEFT OUTER JOIN fH(fA.c5) AS fH ON true ORDER BY fA.ordinality; The planner knows that the result of the fA() call is already sorted by its ordinality column, so it will only do an explicit additional sort if it estimates that the query will be cheaper that way. There is one slight complication with this approach, which is that the use of WITH ORDINALITY will block inlining of the function (see https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions for details). If you need to preserve inlinability of the function fA(), then instead of using WITH ORDINALITY you will instead need to duplicate the actual ORDER BY condition used inside the function in an ORDER BY at the top query level (the planner will know that the function result is already ordered this way IF the function was in fact inlined). You can check if a FROM-clause function was inlined by looking at the EXPLAIN output: a FunctionScan node for the function is present only if the function was _not_ inlined. PG> I expect column "fA_c6" to be ordered ascending, but instead it has PG> an undefined order, or even a descending order. PG> What I have tried/discovered so far: PG> - The order is consistent over multiple runs This is pure coincidence. PG> - The order is not ignored (as in, the order is different than a PG> query without any order specification would return) This is also pure coincidence. PG> - If I mark the functions as "volatile", the resultset is ordered PG> correctly This is also pure coincidence, caused by the fact that marking the functions volatile prevents them from being inlined, forcing the planner to use explicit FunctionScans and blocking various performance optimizations that would incidentally affect the order of rows. You still CANNOT assume that this will be true in the general case - the planner is free to reorder the result as it chooses. PG> - If I remove certain columns from the "select clause", the PG> resultset is ordered correctly. I have not found any pattern, it is PG> not like one specific column causes the behaviour. This is also pure coincidence. PG> - If I only execute "fA", the resultset is ordered correctly If you do a simple "select * from fA(1)" then the planner simply has nothing else to add to the query plan after constructing the function call (whether as a FunctionScan or an inlined query), so the result order is coincidentally unaffected. PG> The way I understand the function volatility categories, my PG> functions are allowed to be "stable". From the documentation I PG> understand that using table functions in left joins like this is PG> allowed. It is allowed. A table function (any function used as if it were a table in a FROM-clause, whether declared RETURNS SETOF/TABLE or not) can be used in any way that a base table can be. Why did you think that (a) function volatility had anything to do with this, and (b) that the result order of your function would be preserved? PG> I am aware of the existence of "row from", but am unable to pass PG> arguments to the functions using that. ROWS FROM serves a different purpose: to call multiple functions (which can have parameters, but not parameters referring to results of other functions in the same construct) with their results returned as if joined by a FULL JOIN USING (ordinality) - that is, the first result row is the first row of each function result, the second result row is the second row of each result, and so on. -- Andrew (irc:RhodiumToad)
pgsql-bugs by date: