Thread: Fix erroneous parallel execution when modifying CTE is present in rewritten query

Fix erroneous parallel execution when modifying CTE is present in rewritten query

From
Greg Nancarrow
Date:
Hi Hackers,

There is a known bug in the query rewriter: if a query that has a modifying CTE is re-written, the hasModifyingCTE flag is not getting set in the re-written query. This bug can result in the query being allowed to execute in parallel-mode, which results in an error.

For more details from a previous discussion about this, and a test case that illustrates the issue, refer to:
https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com 

As a proposal to fix this problem, I've attached a patch which:

1) Copies the associated hasModifyingCTE and hasRecursive flags when the rewriter combines CTE lists (using Tom Lane's initial patch code seen in [1]). This flag copying is missing from the current Postgres code.
2) Adds an error case to specifically disallow the case of applying an INSERT...SELECT rule action to a command with a data-modifying CTE. This is because in this case, the rewritten query will actually end up having a data-modifying CTE that is not at the top level (as it is associated with the SELECT subquery part), which is not actually allowed by Postgres if that query is entered normally (as it's the parser that contains the error-check to ensure that the modifying CTE is at the top level, so this case avoids detection in the rewriter).
3) Modifies the existing test case in with.sql that tests the merging of an outer CTE with a CTE in a rule action (as currently that rule action is using INSERT...SELECT).


For the record, a workaround for this issue (at least addressing how hasModifyingCTE is meant to exclude the query from parallel execution) has been suggested in the past, but was not well received. It is the following addition to the max_parallel_hazard_walker() function:

+             /*
+             * ModifyingCTE expressions are treated as parallel-unsafe.
+             *
+             * XXX Normally, if the Query has a modifying CTE, the hasModifyingCTE
+             * flag is set in the Query tree, and the query will be regarded as
+             * parallel-usafe. However, in some cases, a re-written query with a
+             * modifying CTE does not have that flag set, due to a bug in the query
+             * rewriter. The following else-if is a workaround for this bug, to detect
+             * a modifying CTE in the query and regard it as parallel-unsafe. This
+             * comment, and the else-if block immediately below, may be removed once
+             * the bug in the query rewriter is fixed.
+             */
+             else if (IsA(node, CommonTableExpr))
+             {
+                            CommonTableExpr *cte = (CommonTableExpr *) node;
+                            Query       *ctequery = castNode(Query, cte->ctequery);
+
+                            if (ctequery->commandType != CMD_SELECT)
+                            {
+                                           context->max_hazard = PROPARALLEL_UNSAFE;
+                                           return true;
+                            }
+             }
+


Regards,
Greg Nancarrow
Fujitsu Australia

Attachment
Greg Nancarrow <gregn4422@gmail.com> writes:
> [ v1-0001-Propagate-CTE-property-flags-when-copying-a-CTE-list.patch ]

Pushed with a couple of adjustments:

* I rewrote the comment, mostly so as to include an explanation of how
the error could be removed, in case anyone ever wants to go to the
trouble.

* The existing test case can be fixed up without fundamentally changing
what it's testing, by replacing INSERT...SELECT with INSERT...VALUES.
(That should likely also be our first suggestion to any complainers.)

Thanks for the patch!

            regards, tom lane