Thread: BUG #18536: Using WITH inside WITH RECURSIVE triggers a "shouldn't happen" error

The following bug has been logged on the website:

Bug reference:      18536
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 17beta2
Operating system:   Ubuntu 22.04
Description:

The following query:
WITH RECURSIVE t(n) AS (
    WITH t1 AS (SELECT 1 FROM t) SELECT 1
    UNION
    SELECT 1 FROM t1)
SELECT * FROM t;

triggers an error:
ERROR:  XX000: missing recursive reference
LOCATION:  checkWellFormedRecursion, parse_cte.c:896

which is seemingly not expected:
        if (cstate->selfrefcount != 1)  /* shouldn't happen */
            elog(ERROR, "missing recursive reference");


PG Bug reporting form <noreply@postgresql.org> writes:
> The following query:
> WITH RECURSIVE t(n) AS (
>     WITH t1 AS (SELECT 1 FROM t) SELECT 1
>     UNION
>     SELECT 1 FROM t1)
> SELECT * FROM t;

That should throw an error, certainly: it's not a valid recursive
structure.  (Since the inner WITH clause spans the whole
"SELECT 1 UNION SELECT 1 FROM t1" structure, we don't have a top-
level UNION anymore.)  But it shouldn't throw this error:

> ERROR:  XX000: missing recursive reference
> LOCATION:  checkWellFormedRecursion, parse_cte.c:896

We do get the right behaviors for WITHs that are down inside one
side or the other of the UNION:

WITH RECURSIVE t(n) AS (
    (WITH t1 AS (SELECT 1 FROM t) SELECT 1 FROM t1)
    UNION
    SELECT 1)
SELECT * FROM t;
ERROR:  recursive reference to query "t" must not appear within its non-recursive term
LINE 2:     (WITH t1 AS (SELECT 1 FROM t) SELECT 1 FROM t1)
                                       ^

WITH RECURSIVE t(n) AS (
  SELECT 1
  UNION
  (WITH t1 AS (SELECT 1 FROM t) SELECT 1 FROM t1))
SELECT * FROM t;
 n
---
 1
(1 row)

I think the case you show should be throwing

ERROR:  recursive query "t" does not have the form non-recursive-term UNION [ALL] recursive-term

Will look closer later.  Thanks for the report.

            regards, tom lane



I wrote:
> I think the case you show should be throwing
> ERROR:  recursive query "t" does not have the form non-recursive-term UNION [ALL] recursive-term

Hmm, that is probably too strong: it will break some queries we've
historically accepted.  What we need is just to forbid self-references
within the WITH clause.  The code actually does that already, it's
just doing it too late; so we can fix this with a simple re-ordering
of the error checks, as attached.

            regards, tom lane

diff --git a/src/backend/parser/parse_cte.c b/src/backend/parser/parse_cte.c
index 6826d4f36a..17432e9df6 100644
--- a/src/backend/parser/parse_cte.c
+++ b/src/backend/parser/parse_cte.c
@@ -877,6 +877,25 @@ checkWellFormedRecursion(CteState *cstate)
                             cte->ctename),
                      parser_errposition(cstate->pstate, cte->location)));

+        /*
+         * Really, we should insist that there not be a top-level WITH, since
+         * syntactically that would enclose the UNION.  However, we've not
+         * done so in the past and it's probably too late to change.  Settle
+         * for insisting that WITH not contain a self-reference.  Test this
+         * before examining the UNION arms, to avoid issuing confusing errors
+         * in such cases.
+         */
+        if (stmt->withClause)
+        {
+            cstate->curitem = i;
+            cstate->innerwiths = NIL;
+            cstate->selfrefcount = 0;
+            cstate->context = RECURSION_SUBLINK;
+            checkWellFormedRecursionWalker((Node *) stmt->withClause->ctes,
+                                           cstate);
+            Assert(cstate->innerwiths == NIL);
+        }
+
         /* The left-hand operand mustn't contain self-reference at all */
         cstate->curitem = i;
         cstate->innerwiths = NIL;
@@ -895,18 +914,6 @@ checkWellFormedRecursion(CteState *cstate)
         if (cstate->selfrefcount != 1)    /* shouldn't happen */
             elog(ERROR, "missing recursive reference");

-        /* WITH mustn't contain self-reference, either */
-        if (stmt->withClause)
-        {
-            cstate->curitem = i;
-            cstate->innerwiths = NIL;
-            cstate->selfrefcount = 0;
-            cstate->context = RECURSION_SUBLINK;
-            checkWellFormedRecursionWalker((Node *) stmt->withClause->ctes,
-                                           cstate);
-            Assert(cstate->innerwiths == NIL);
-        }
-
         /*
          * Disallow ORDER BY and similar decoration atop the UNION. These
          * don't make sense because it's impossible to figure out what they
diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out
index b4f3121751..fff2d4949e 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -2029,6 +2029,38 @@ WITH RECURSIVE x(n) AS (SELECT n FROM x UNION ALL SELECT 1)
 ERROR:  recursive reference to query "x" must not appear within its non-recursive term
 LINE 1: WITH RECURSIVE x(n) AS (SELECT n FROM x UNION ALL SELECT 1)
                                               ^
+-- allow this, because we historically have
+WITH RECURSIVE x(n) AS (
+  WITH x1 AS (SELECT 1 AS n)
+    SELECT 0
+    UNION
+    SELECT * FROM x1)
+    SELECT * FROM x;
+ n
+---
+ 0
+ 1
+(2 rows)
+
+-- but this should be rejected
+WITH RECURSIVE x(n) AS (
+  WITH x1 AS (SELECT 1 FROM x)
+    SELECT 0
+    UNION
+    SELECT * FROM x1)
+    SELECT * FROM x;
+ERROR:  recursive reference to query "x" must not appear within a subquery
+LINE 2:   WITH x1 AS (SELECT 1 FROM x)
+                                    ^
+-- and this too
+WITH RECURSIVE x(n) AS (
+  (WITH x1 AS (SELECT 1 FROM x) SELECT * FROM x1)
+  UNION
+  SELECT 0)
+    SELECT * FROM x;
+ERROR:  recursive reference to query "x" must not appear within its non-recursive term
+LINE 2:   (WITH x1 AS (SELECT 1 FROM x) SELECT * FROM x1)
+                                     ^
 CREATE TEMPORARY TABLE y (a INTEGER);
 INSERT INTO y SELECT generate_series(1, 10);
 -- LEFT JOIN
diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql
index 3fb0b33fce..06b2d4857f 100644
--- a/src/test/regress/sql/with.sql
+++ b/src/test/regress/sql/with.sql
@@ -908,6 +908,29 @@ WITH RECURSIVE x(n) AS (SELECT n FROM x)
 WITH RECURSIVE x(n) AS (SELECT n FROM x UNION ALL SELECT 1)
     SELECT * FROM x;

+-- allow this, because we historically have
+WITH RECURSIVE x(n) AS (
+  WITH x1 AS (SELECT 1 AS n)
+    SELECT 0
+    UNION
+    SELECT * FROM x1)
+    SELECT * FROM x;
+
+-- but this should be rejected
+WITH RECURSIVE x(n) AS (
+  WITH x1 AS (SELECT 1 FROM x)
+    SELECT 0
+    UNION
+    SELECT * FROM x1)
+    SELECT * FROM x;
+
+-- and this too
+WITH RECURSIVE x(n) AS (
+  (WITH x1 AS (SELECT 1 FROM x) SELECT * FROM x1)
+  UNION
+  SELECT 0)
+    SELECT * FROM x;
+
 CREATE TEMPORARY TABLE y (a INTEGER);
 INSERT INTO y SELECT generate_series(1, 10);


I wrote:
> Hmm, that is probably too strong: it will break some queries we've
> historically accepted.  What we need is just to forbid self-references
> within the WITH clause.  The code actually does that already, it's
> just doing it too late; so we can fix this with a simple re-ordering
> of the error checks, as attached.

Oh ...

regression=# WITH RECURSIVE x(n) AS (
select 0 union select 1 order by (select n from x)) select * from x;
ERROR:  missing recursive reference

We have to move *all* of those subsidiary-clause checks to before
the tests of the UNION proper.

            regards, tom lane

diff --git a/src/backend/parser/parse_cte.c b/src/backend/parser/parse_cte.c
index 6826d4f36a..de9ae9b483 100644
--- a/src/backend/parser/parse_cte.c
+++ b/src/backend/parser/parse_cte.c
@@ -877,25 +877,14 @@ checkWellFormedRecursion(CteState *cstate)
                             cte->ctename),
                      parser_errposition(cstate->pstate, cte->location)));

-        /* The left-hand operand mustn't contain self-reference at all */
-        cstate->curitem = i;
-        cstate->innerwiths = NIL;
-        cstate->selfrefcount = 0;
-        cstate->context = RECURSION_NONRECURSIVETERM;
-        checkWellFormedRecursionWalker((Node *) stmt->larg, cstate);
-        Assert(cstate->innerwiths == NIL);
-
-        /* Right-hand operand should contain one reference in a valid place */
-        cstate->curitem = i;
-        cstate->innerwiths = NIL;
-        cstate->selfrefcount = 0;
-        cstate->context = RECURSION_OK;
-        checkWellFormedRecursionWalker((Node *) stmt->rarg, cstate);
-        Assert(cstate->innerwiths == NIL);
-        if (cstate->selfrefcount != 1)    /* shouldn't happen */
-            elog(ERROR, "missing recursive reference");
-
-        /* WITH mustn't contain self-reference, either */
+        /*
+         * Really, we should insist that there not be a top-level WITH, since
+         * syntactically that would enclose the UNION.  However, we've not
+         * done so in the past and it's probably too late to change.  Settle
+         * for insisting that WITH not contain a self-reference.  Test this
+         * before examining the UNION arms, to avoid issuing confusing errors
+         * in such cases.
+         */
         if (stmt->withClause)
         {
             cstate->curitem = i;
@@ -912,7 +901,9 @@ checkWellFormedRecursion(CteState *cstate)
          * don't make sense because it's impossible to figure out what they
          * mean when we have only part of the recursive query's results. (If
          * we did allow them, we'd have to check for recursive references
-         * inside these subtrees.)
+         * inside these subtrees.  As for WITH, we have to do this before
+         * examining the UNION arms, to avoid issuing confusing errors if
+         * there is a recursive reference here.)
          */
         if (stmt->sortClause)
             ereport(ERROR,
@@ -938,6 +929,28 @@ checkWellFormedRecursion(CteState *cstate)
                      errmsg("FOR UPDATE/SHARE in a recursive query is not implemented"),
                      parser_errposition(cstate->pstate,
                                         exprLocation((Node *) stmt->lockingClause))));
+
+        /*
+         * Now we can get on with checking the UNION operands themselves.
+         *
+         * The left-hand operand mustn't contain a self-reference at all.
+         */
+        cstate->curitem = i;
+        cstate->innerwiths = NIL;
+        cstate->selfrefcount = 0;
+        cstate->context = RECURSION_NONRECURSIVETERM;
+        checkWellFormedRecursionWalker((Node *) stmt->larg, cstate);
+        Assert(cstate->innerwiths == NIL);
+
+        /* Right-hand operand should contain one reference in a valid place */
+        cstate->curitem = i;
+        cstate->innerwiths = NIL;
+        cstate->selfrefcount = 0;
+        cstate->context = RECURSION_OK;
+        checkWellFormedRecursionWalker((Node *) stmt->rarg, cstate);
+        Assert(cstate->innerwiths == NIL);
+        if (cstate->selfrefcount != 1)    /* shouldn't happen */
+            elog(ERROR, "missing recursive reference");
     }
 }

diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out
index b4f3121751..08cfa5463f 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -2029,6 +2029,46 @@ WITH RECURSIVE x(n) AS (SELECT n FROM x UNION ALL SELECT 1)
 ERROR:  recursive reference to query "x" must not appear within its non-recursive term
 LINE 1: WITH RECURSIVE x(n) AS (SELECT n FROM x UNION ALL SELECT 1)
                                               ^
+-- allow this, because we historically have
+WITH RECURSIVE x(n) AS (
+  WITH x1 AS (SELECT 1 AS n)
+    SELECT 0
+    UNION
+    SELECT * FROM x1)
+    SELECT * FROM x;
+ n
+---
+ 0
+ 1
+(2 rows)
+
+-- but this should be rejected
+WITH RECURSIVE x(n) AS (
+  WITH x1 AS (SELECT 1 FROM x)
+    SELECT 0
+    UNION
+    SELECT * FROM x1)
+    SELECT * FROM x;
+ERROR:  recursive reference to query "x" must not appear within a subquery
+LINE 2:   WITH x1 AS (SELECT 1 FROM x)
+                                    ^
+-- and this too
+WITH RECURSIVE x(n) AS (
+  (WITH x1 AS (SELECT 1 FROM x) SELECT * FROM x1)
+  UNION
+  SELECT 0)
+    SELECT * FROM x;
+ERROR:  recursive reference to query "x" must not appear within its non-recursive term
+LINE 2:   (WITH x1 AS (SELECT 1 FROM x) SELECT * FROM x1)
+                                     ^
+-- and this
+WITH RECURSIVE x(n) AS (
+  SELECT 0 UNION SELECT 1
+  ORDER BY (SELECT n FROM x))
+    SELECT * FROM x;
+ERROR:  ORDER BY in a recursive query is not implemented
+LINE 3:   ORDER BY (SELECT n FROM x))
+                   ^
 CREATE TEMPORARY TABLE y (a INTEGER);
 INSERT INTO y SELECT generate_series(1, 10);
 -- LEFT JOIN
diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql
index 3fb0b33fce..8f6e6c0b40 100644
--- a/src/test/regress/sql/with.sql
+++ b/src/test/regress/sql/with.sql
@@ -908,6 +908,35 @@ WITH RECURSIVE x(n) AS (SELECT n FROM x)
 WITH RECURSIVE x(n) AS (SELECT n FROM x UNION ALL SELECT 1)
     SELECT * FROM x;

+-- allow this, because we historically have
+WITH RECURSIVE x(n) AS (
+  WITH x1 AS (SELECT 1 AS n)
+    SELECT 0
+    UNION
+    SELECT * FROM x1)
+    SELECT * FROM x;
+
+-- but this should be rejected
+WITH RECURSIVE x(n) AS (
+  WITH x1 AS (SELECT 1 FROM x)
+    SELECT 0
+    UNION
+    SELECT * FROM x1)
+    SELECT * FROM x;
+
+-- and this too
+WITH RECURSIVE x(n) AS (
+  (WITH x1 AS (SELECT 1 FROM x) SELECT * FROM x1)
+  UNION
+  SELECT 0)
+    SELECT * FROM x;
+
+-- and this
+WITH RECURSIVE x(n) AS (
+  SELECT 0 UNION SELECT 1
+  ORDER BY (SELECT n FROM x))
+    SELECT * FROM x;
+
 CREATE TEMPORARY TABLE y (a INTEGER);
 INSERT INTO y SELECT generate_series(1, 10);


Hi,

> triggers an error:
> ERROR:  XX000: missing recursive reference
> LOCATION:  checkWellFormedRecursion, parse_cte.c:896

FWIW I couldn't reproduce the reported error on REL_17_STABLE
(b8bf76cbde39). The error I got seems reasonable:

```
46087 (master) =# WITH RECURSIVE t(n) AS (
    WITH t1 AS (SELECT 1 FROM t) SELECT 1
    UNION
    SELECT 1 FROM t1)
SELECT * FROM t;
ERROR:  recursive reference to query "t" must not appear within a subquery
LINE 2:     WITH t1 AS (SELECT 1 FROM t) SELECT 1
                                      ^
```

We should add regression tests though, as v2 does.

-- 
Best regards,
Aleksander Alekseev



Hi,

> > triggers an error:
> > ERROR:  XX000: missing recursive reference
> > LOCATION:  checkWellFormedRecursion, parse_cte.c:896
>
> FWIW I couldn't reproduce the reported error on REL_17_STABLE
> (b8bf76cbde39). The error I got seems reasonable:
>
> ```
> 46087 (master) =# WITH RECURSIVE t(n) AS (
>     WITH t1 AS (SELECT 1 FROM t) SELECT 1
>     UNION
>     SELECT 1 FROM t1)
> SELECT * FROM t;
> ERROR:  recursive reference to query "t" must not appear within a subquery
> LINE 2:     WITH t1 AS (SELECT 1 FROM t) SELECT 1
>                                       ^
> ```
>
> We should add regression tests though, as v2 does.

Oops. That's because Tom pushed this already (cf588e10f664). Sorry for
the noise.

-- 
Best regards,
Aleksander Alekseev