Hi,
I'm confused about what we should allow in a recursive query. For example, in the following query:
WITH RECURSIVE t1 AS ( SELECT 1 UNION SELECT generate_series(2,3) FROM t1 ORDER BY 1 DESC) SELECT * FROM t1 ;
The parser attaches the "order by" clause to the "union" operator, and then we error out with the following message: "ORDER BY in a recursive query is not implemented"
The comment in the code (parser_cte.c:900) says "Disallow ORDER BY and similar decoration atop the UNION". Then, if we wrap the recursive clause around parentheses:
WITH RECURSIVE t1 AS ( SELECT 1 UNION (SELECT generate_series(2,3) FROM t1 ORDER BY 1 DESC)) SELECT * FROM t1 ;
It works as expected. So, do we support the ORDER BY in a recursive query or not? If the answer is yes, I suggest one of the following modifications:
1. Change the error message to something like "ORDER BY at the top level of a recursive query is not implemented. HINT: wrap the respective statement around ()"
2. (preferred) Modify the parser or simply remove these checks to allow the first query.
If the answer is no, then there is a minor bug that allows us to bypass the check. Even though the ORDER BY happens inside the working table, I think it can be a useful feature combined with LIMIT and OFFSET.
There is a similar restriction regarding GROUP BY. But in this case, the error message is very clear and it is consistent with the comment in the code. I suggest removing this restriction as well in order to improve PostgreSQL's capabilities to process graph data. For example, counting the number of paths in a DAG can be computed more efficiently using an aggregation in each step.
I don't know what the standard says about this, but it certainly does not allow DISTINCT ON in the recursive query, while PostgreSQL does support it. So, we could eventually skip the specs in this case to be more consistent since a DISTINCT ON has many similarities with a GROUP BY.
I did some tests, and it is enough to remove the check regarding the GROUP BY. The machinery to perform the GROUP BY in a recursive clause is already there.
Of course, if it is the case, I would be glad to send a patch.
Best regards,
Renan Fonseca