Fix erroneous parallel execution when modifying CTE is present in rewritten query - Mailing list pgsql-hackers

From Greg Nancarrow
Subject Fix erroneous parallel execution when modifying CTE is present in rewritten query
Date
Msg-id CAJcOf-f68DT=26YAMz_i0+Au3TcLO5oiHY5=fL6Sfuits6r+_w@mail.gmail.com
Responses Re: Fix erroneous parallel execution when modifying CTE is present in rewritten query  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi Hackers,

There is a known bug in the query rewriter: if a query that has a modifying CTE is rewritten, the hasModifyingCTE flag is not set in the rewritten query. As a result, the query can be allowed to execute in parallel mode, which then fails with an error.

For more details from a previous discussion about this, and a test case that illustrates the issue, refer to:
https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com 

As a proposal to fix this problem, I've attached a patch which:

1) Copies the associated hasModifyingCTE and hasRecursive flags when the rewriter combines CTE lists (using Tom Lane's initial patch code seen in [1]). This flag copying is missing from the current Postgres code.
2) Adds an error case that specifically disallows applying an INSERT...SELECT rule action to a command with a data-modifying CTE. In this case, the rewritten query would end up with a data-modifying CTE that is not at the top level, because it is attached to the SELECT subquery part. Postgres does not allow that when such a query is entered normally, but the error-check that ensures a modifying CTE is at the top level lives in the parser, so a query produced by the rewriter avoids detection.
3) Modifies the existing test case in with.sql that tests the merging of an outer CTE with a CTE in a rule action, since that rule action currently uses INSERT...SELECT.
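
To make points 1) and 2) concrete, here is a minimal, self-contained sketch of the intended behaviour. It is a toy model, not the attached patch and not PostgreSQL source: ToyQuery, its isInsertSelect field and toy_merge_ctes() are hypothetical names invented for illustration, and the real cteList is reduced to a simple count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a Query node; only the fields discussed above. */
typedef struct ToyQuery
{
    int  ncte;             /* number of attached CTEs (stands in for cteList) */
    bool hasRecursive;     /* some CTE is WITH RECURSIVE */
    bool hasModifyingCTE;  /* some CTE is INSERT/UPDATE/DELETE */
    bool isInsertSelect;   /* hypothetical: rule action is INSERT...SELECT */
} ToyQuery;

/*
 * Merge the original statement's CTEs into a rule action.
 *
 * Returns false (the toy equivalent of raising an error) when the action
 * is INSERT...SELECT and the original statement has a data-modifying CTE,
 * since the CTE would then no longer be at the top level.  Otherwise the
 * CTEs are combined and both flags are carried over.
 */
static bool
toy_merge_ctes(ToyQuery *action, const ToyQuery *orig)
{
    assert(action != NULL && orig != NULL);

    if (orig->ncte == 0)
        return true;            /* nothing to merge */

    if (action->isInsertSelect && orig->hasModifyingCTE)
        return false;           /* point 2: reject this combination */

    action->ncte += orig->ncte;

    /* Point 1: propagate the flags along with the CTEs themselves. */
    action->hasRecursive |= orig->hasRecursive;
    action->hasModifyingCTE |= orig->hasModifyingCTE;
    return true;
}
```

Without the two flag assignments, the merged action would carry a modifying CTE while still reporting hasModifyingCTE as false, which is the inconsistency described at the top of this mail.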


For the record, a workaround for this issue (at least addressing how hasModifyingCTE is meant to exclude the query from parallel execution) has been suggested in the past, but was not well received. It is the following addition to the max_parallel_hazard_walker() function:

+    /*
+     * ModifyingCTE expressions are treated as parallel-unsafe.
+     *
+     * XXX Normally, if the Query has a modifying CTE, the hasModifyingCTE
+     * flag is set in the Query tree, and the query will be regarded as
+     * parallel-unsafe. However, in some cases, a re-written query with a
+     * modifying CTE does not have that flag set, due to a bug in the query
+     * rewriter. The following else-if is a workaround for this bug, to detect
+     * a modifying CTE in the query and regard it as parallel-unsafe. This
+     * comment, and the else-if block immediately below, may be removed once
+     * the bug in the query rewriter is fixed.
+     */
+    else if (IsA(node, CommonTableExpr))
+    {
+        CommonTableExpr *cte = (CommonTableExpr *) node;
+        Query      *ctequery = castNode(Query, cte->ctequery);
+
+        if (ctequery->commandType != CMD_SELECT)
+        {
+            context->max_hazard = PROPARALLEL_UNSAFE;
+            return true;
+        }
+    }
+
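
The difference between trusting the flag and inspecting the CTEs directly, as the workaround does, can be shown with a small standalone sketch. Again this is only a toy model under stated assumptions: ToyStmt, ToyCte and both toy_unsafe_* functions are hypothetical and do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TOY_SELECT, TOY_INSERT, TOY_UPDATE, TOY_DELETE } ToyCommand;

/* Toy stand-in for a CTE: only the command type of its query. */
typedef struct ToyCte
{
    ToyCommand commandType;
} ToyCte;

/* Toy stand-in for a (possibly rewritten) statement. */
typedef struct ToyStmt
{
    bool          hasModifyingCTE;  /* may be stale after rewriting */
    const ToyCte *ctes;
    size_t        ncte;
} ToyStmt;

/* Flag-based check: only as reliable as the flag itself. */
static bool
toy_unsafe_by_flag(const ToyStmt *stmt)
{
    assert(stmt != NULL);
    return stmt->hasModifyingCTE;
}

/* Walker-style check: looks at each CTE, so a stale flag does not matter. */
static bool
toy_unsafe_by_walk(const ToyStmt *stmt)
{
    assert(stmt != NULL);
    for (size_t i = 0; i < stmt->ncte; i++)
    {
        if (stmt->ctes[i].commandType != TOY_SELECT)
            return true;
    }
    return false;
}
```

For a statement whose flag was lost during rewriting, the flag-based check reports it as safe while the walker-style check still catches the modifying CTE. Fixing the rewriter makes the flag trustworthy again, so the extra walk is no longer needed.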


Regards,
Greg Nancarrow
Fujitsu Australia
