Thread: Allowing parallel-safe initplans

Allowing parallel-safe initplans

From

Tom Lane

Date:

12 April 2023, 16:43:52

Pursuant to the discussion at [1], here's a patch that removes our
old restriction that a plan node having initPlans can't be marked
parallel-safe (dating to commit ab77a5a45).  That was really a special
case of the fact that we couldn't transmit subplans to parallel
workers at all.  We fixed that in commit 5e6d8d2bb and follow-ons,
but this case never got addressed.

Along the way, this also takes care of some sloppiness about updating
path costs to match when we move initplans from one place to another
during createplan.c and setrefs.c.  Since all the planning decisions are
already made by that point, this is just cosmetic; but it seems good
to keep EXPLAIN output consistent with where the initplans are.

The diff in query_planner() might be worth remarking on.  I found
that one because after fixing things to allow parallel-safe initplans,
one partition_prune test case changed plans (as shown in the patch)
--- but only when debug_parallel_query was active.  The reason
proved to be that we only bothered to mark Result nodes as potentially
parallel-safe when debug_parallel_query is on.  This neglects the
fact that parallel-safety may be of interest for a sub-query even
though the Result itself doesn't parallelize.

There's only one existing test case that visibly changes plan with
these changes.  The new plan is clearly saner-looking than before,
and testing with some data loaded into the table confirms that it
is faster.  I'm not sure if it's worth devising more test cases.

I'll park this in the July commitfest.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/ZDVt6MaNWkRDO1LQ%40telsasoft.com

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 910ffbf1e1..2ec8a537ae 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -6488,6 +6488,8 @@ materialize_finished_plan(Plan *subplan)
 {
     Plan       *matplan;
     Path        matpath;        /* dummy for result of cost_material */
+    Cost        initplan_cost;
+    bool        unsafe_initplans;

     matplan = (Plan *) make_material(subplan);

@@ -6495,20 +6497,25 @@ materialize_finished_plan(Plan *subplan)
      * XXX horrid kluge: if there are any initPlans attached to the subplan,
      * move them up to the Material node, which is now effectively the top
      * plan node in its query level.  This prevents failure in
-     * SS_finalize_plan(), which see for comments.  We don't bother adjusting
-     * the subplan's cost estimate for this.
+     * SS_finalize_plan(), which see for comments.
      */
     matplan->initPlan = subplan->initPlan;
     subplan->initPlan = NIL;

+    /* Move the initplans' cost delta, as well */
+    SS_compute_initplan_cost(matplan->initPlan,
+                             &initplan_cost, &unsafe_initplans);
+    subplan->startup_cost -= initplan_cost;
+    subplan->total_cost -= initplan_cost;
+
     /* Set cost data */
     cost_material(&matpath,
                   subplan->startup_cost,
                   subplan->total_cost,
                   subplan->plan_rows,
                   subplan->plan_width);
-    matplan->startup_cost = matpath.startup_cost;
-    matplan->total_cost = matpath.total_cost;
+    matplan->startup_cost = matpath.startup_cost + initplan_cost;
+    matplan->total_cost = matpath.total_cost + initplan_cost;
     matplan->plan_rows = subplan->plan_rows;
     matplan->plan_width = subplan->plan_width;
     matplan->parallel_aware = false;
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 7afd434c60..fcc0eacd25 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -112,14 +112,17 @@ query_planner(PlannerInfo *root,
                  * quals are parallel-restricted.  (We need not check
                  * final_rel->reltarget because it's empty at this point.
                  * Anything parallel-restricted in the query tlist will be
-                 * dealt with later.)  This is normally pretty silly, because
-                 * a Result-only plan would never be interesting to
-                 * parallelize.  However, if debug_parallel_query is on, then
-                 * we want to execute the Result in a parallel worker if
-                 * possible, so we must do this.
+                 * dealt with later.)  We should always do this in a subquery,
+                 * since it might be useful to use the subquery in parallel
+                 * paths in the parent level.  At top level this is normally
+                 * not worth the cycles, because a Result-only plan would
+                 * never be interesting to parallelize.  However, if
+                 * debug_parallel_query is on, then we want to execute the
+                 * Result in a parallel worker if possible, so we must check.
                  */
                 if (root->glob->parallelModeOK &&
-                    debug_parallel_query != DEBUG_PARALLEL_OFF)
+                    (root->query_level > 1 ||
+                     debug_parallel_query != DEBUG_PARALLEL_OFF))
                     final_rel->consider_parallel =
                         is_parallel_safe(root, parse->jointree->quals);

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 502ccbcea2..e196f06c07 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -430,17 +430,19 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
     /*
      * Optionally add a Gather node for testing purposes, provided this is
      * actually a safe thing to do.
+     *
+     * We could potentially do this even when top_plan has parallel-safe
+     * initPlans, but we'd have to move the initPlans to the Gather node
+     * because of SS_finalize_plan's limitations.  That causes cosmetic
+     * breakage of regression tests, because initPlans that would normally
+     * appear on the top_plan move to the Gather, changing EXPLAIN results.
+     * That doesn't seem worth working hard on, so skip it for now.
      */
-    if (debug_parallel_query != DEBUG_PARALLEL_OFF && top_plan->parallel_safe)
+    if (debug_parallel_query != DEBUG_PARALLEL_OFF &&
+        top_plan->parallel_safe && top_plan->initPlan == NIL)
     {
         Gather       *gather = makeNode(Gather);

-        /*
-         * Top plan must not have any initPlans, else it shouldn't have been
-         * marked parallel-safe.
-         */
-        Assert(top_plan->initPlan == NIL);
-
         gather->plan.targetlist = top_plan->targetlist;
         gather->plan.qual = NIL;
         gather->plan.lefttree = top_plan;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1812db7f2f..04e4fd1d2f 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -23,6 +23,7 @@
 #include "optimizer/pathnode.h"
 #include "optimizer/planmain.h"
 #include "optimizer/planner.h"
+#include "optimizer/subselect.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_relation.h"
 #include "tcop/utility.h"
@@ -1544,19 +1545,30 @@ clean_up_removed_plan_level(Plan *parent, Plan *child)
 {
     /*
      * We have to be sure we don't lose any initplans, so move any that were
-     * attached to the parent plan to the child.  If we do move any, the child
-     * is no longer parallel-safe.
+     * attached to the parent plan to the child.  If any are parallel-unsafe,
+     * the child is no longer parallel-safe.  As a cosmetic matter, also add
+     * the initplans' run costs to the child's costs.
      */
     if (parent->initPlan)
-        child->parallel_safe = false;
+    {
+        Cost        initplan_cost;
+        bool        unsafe_initplans;

-    /*
-     * Attach plans this way so that parent's initplans are processed before
-     * any pre-existing initplans of the child.  Probably doesn't matter, but
-     * let's preserve the ordering just in case.
-     */
-    child->initPlan = list_concat(parent->initPlan,
-                                  child->initPlan);
+        SS_compute_initplan_cost(parent->initPlan,
+                                 &initplan_cost, &unsafe_initplans);
+        child->startup_cost += initplan_cost;
+        child->total_cost += initplan_cost;
+        if (unsafe_initplans)
+            child->parallel_safe = false;
+
+        /*
+         * Attach plans this way so that parent's initplans are processed
+         * before any pre-existing initplans of the child.  Probably doesn't
+         * matter, but let's preserve the ordering just in case.
+         */
+        child->initPlan = list_concat(parent->initPlan,
+                                      child->initPlan);
+    }

     /*
      * We also have to transfer the parent's column labeling info into the
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 5f12b2ef9b..c9285ad625 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -1016,8 +1016,7 @@ SS_process_ctes(PlannerInfo *root)

         /*
          * CTE scans are not considered for parallelism (cf
-         * set_rel_consider_parallel), and even if they were, initPlans aren't
-         * parallel-safe.
+         * set_rel_consider_parallel).
          */
         splan->parallel_safe = false;
         splan->setParam = NIL;
@@ -2120,8 +2119,8 @@ SS_identify_outer_params(PlannerInfo *root)
  * If any initPlans have been created in the current query level, they will
  * get attached to the Plan tree created from whichever Path we select from
  * the given rel.  Increment all that rel's Paths' costs to account for them,
- * and make sure the paths get marked as parallel-unsafe, since we can't
- * currently transmit initPlans to parallel workers.
+ * and if any of the initPlans are parallel-unsafe, mark all the rel's Paths
+ * parallel-unsafe as well.
  *
  * This is separate from SS_attach_initplans because we might conditionally
  * create more initPlans during create_plan(), depending on which Path we
@@ -2132,6 +2131,7 @@ void
 SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)
 {
     Cost        initplan_cost;
+    bool        unsafe_initplans;
     ListCell   *lc;

     /* Nothing to do if no initPlans */
@@ -2140,17 +2140,10 @@ SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)

     /*
      * Compute the cost increment just once, since it will be the same for all
-     * Paths.  We assume each initPlan gets run once during top plan startup.
-     * This is a conservative overestimate, since in fact an initPlan might be
-     * executed later than plan startup, or even not at all.
+     * Paths.  Also check for parallel-unsafe initPlans.
      */
-    initplan_cost = 0;
-    foreach(lc, root->init_plans)
-    {
-        SubPlan    *initsubplan = (SubPlan *) lfirst(lc);
-
-        initplan_cost += initsubplan->startup_cost + initsubplan->per_call_cost;
-    }
+    SS_compute_initplan_cost(root->init_plans,
+                             &initplan_cost, &unsafe_initplans);

     /*
      * Now adjust the costs and parallel_safe flags.
@@ -2161,19 +2154,71 @@ SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)

         path->startup_cost += initplan_cost;
         path->total_cost += initplan_cost;
-        path->parallel_safe = false;
+        if (unsafe_initplans)
+            path->parallel_safe = false;
     }

     /*
-     * Forget about any partial paths and clear consider_parallel, too;
-     * they're not usable if we attached an initPlan.
+     * Adjust partial paths' costs too, or forget them entirely if we must
+     * consider the rel parallel-unsafe.
      */
-    final_rel->partial_pathlist = NIL;
-    final_rel->consider_parallel = false;
+    if (unsafe_initplans)
+    {
+        final_rel->partial_pathlist = NIL;
+        final_rel->consider_parallel = false;
+    }
+    else
+    {
+        foreach(lc, final_rel->partial_pathlist)
+        {
+            Path       *path = (Path *) lfirst(lc);
+
+            path->startup_cost += initplan_cost;
+            path->total_cost += initplan_cost;
+        }
+    }

     /* We needn't do set_cheapest() here, caller will do it */
 }

+/*
+ * SS_compute_initplan_cost - count up the cost delta for some initplans
+ *
+ * The total cost returned in *initplan_cost_p should be added to both the
+ * startup and total costs of the plan node the initplans get attached to.
+ * We also report whether any of the initplans are not parallel-safe.
+ *
+ * The primary user of this is SS_charge_for_initplans, but it's also
+ * used in adjusting costs when we move initplans to another plan node.
+ */
+void
+SS_compute_initplan_cost(List *init_plans,
+                         Cost *initplan_cost_p,
+                         bool *unsafe_initplans_p)
+{
+    Cost        initplan_cost;
+    bool        unsafe_initplans;
+    ListCell   *lc;
+
+    /*
+     * We assume each initPlan gets run once during top plan startup.  This is
+     * a conservative overestimate, since in fact an initPlan might be
+     * executed later than plan startup, or even not at all.
+     */
+    initplan_cost = 0;
+    unsafe_initplans = false;
+    foreach(lc, init_plans)
+    {
+        SubPlan    *initsubplan = lfirst_node(SubPlan, lc);
+
+        initplan_cost += initsubplan->startup_cost + initsubplan->per_call_cost;
+        if (!initsubplan->parallel_safe)
+            unsafe_initplans = true;
+    }
+    *initplan_cost_p = initplan_cost;
+    *unsafe_initplans_p = unsafe_initplans;
+}
+
 /*
  * SS_attach_initplans - attach initplans to topmost plan node
  *
@@ -2967,6 +3012,7 @@ SS_make_initplan_from_plan(PlannerInfo *root,
                                node->plan_id, prm->paramid);
     get_first_col_type(plan, &node->firstColType, &node->firstColTypmod,
                        &node->firstColCollation);
+    node->parallel_safe = plan->parallel_safe;
     node->setParam = list_make1_int(prm->paramid);

     root->init_plans = lappend(root->init_plans, node);
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 5f5596841c..f123fcb41e 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -3348,8 +3348,7 @@ create_minmaxagg_path(PlannerInfo *root,
     /* For now, assume we are above any joins, so no parameterization */
     pathnode->path.param_info = NULL;
     pathnode->path.parallel_aware = false;
-    /* A MinMaxAggPath implies use of initplans, so cannot be parallel-safe */
-    pathnode->path.parallel_safe = false;
+    pathnode->path.parallel_safe = true;    /* might change below */
     pathnode->path.parallel_workers = 0;
     /* Result is one unordered row */
     pathnode->path.rows = 1;
@@ -3358,13 +3357,15 @@ create_minmaxagg_path(PlannerInfo *root,
     pathnode->mmaggregates = mmaggregates;
     pathnode->quals = quals;

-    /* Calculate cost of all the initplans ... */
+    /* Calculate cost of all the initplans, and check parallel safety */
     initplan_cost = 0;
     foreach(lc, mmaggregates)
     {
         MinMaxAggInfo *mminfo = (MinMaxAggInfo *) lfirst(lc);

         initplan_cost += mminfo->pathcost;
+        if (!mminfo->path->parallel_safe)
+            pathnode->path.parallel_safe = false;
     }

     /* add tlist eval cost for each output row, plus cpu_tuple_cost */
@@ -3385,6 +3386,17 @@ create_minmaxagg_path(PlannerInfo *root,
         pathnode->path.total_cost += qual_cost.startup + qual_cost.per_tuple;
     }

+    /*
+     * If the initplans were all parallel-safe, also check safety of the
+     * target and quals.  (The Result node itself isn't parallelizable, but if
+     * we are in a subquery then it can be useful for the outer query to know
+     * that this one is parallel-safe.)
+     */
+    if (pathnode->path.parallel_safe)
+        pathnode->path.parallel_safe =
+            is_parallel_safe(root, (Node *) target->exprs) &&
+            is_parallel_safe(root, (Node *) quals);
+
     return pathnode;
 }

diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index c03ffc56bf..44bc0bda7e 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -28,6 +28,9 @@ extern Node *SS_replace_correlation_vars(PlannerInfo *root, Node *expr);
 extern Node *SS_process_sublinks(PlannerInfo *root, Node *expr, bool isQual);
 extern void SS_identify_outer_params(PlannerInfo *root);
 extern void SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel);
+extern void SS_compute_initplan_cost(List *init_plans,
+                                     Cost *initplan_cost_p,
+                                     bool *unsafe_initplans_p);
 extern void SS_attach_initplans(PlannerInfo *root, Plan *plan);
 extern void SS_finalize_plan(PlannerInfo *root, Plan *plan);
 extern Param *SS_make_initplan_output_param(PlannerInfo *root,
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index d700c00629..f61fbb2809 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3698,30 +3698,25 @@ select explain_parallel_append(
 select * from listp where a = (select 2);');
                               explain_parallel_append
 -----------------------------------------------------------------------------------
- Append (actual rows=N loops=N)
-   ->  Gather (actual rows=N loops=N)
-         Workers Planned: 2
-         Params Evaluated: $0
-         Workers Launched: N
-         InitPlan 1 (returns $0)
-           ->  Result (actual rows=N loops=N)
-         ->  Parallel Append (actual rows=N loops=N)
-               ->  Seq Scan on listp_12_1 listp_1 (actual rows=N loops=N)
-                     Filter: (a = $0)
-               ->  Parallel Seq Scan on listp_12_2 listp_2 (never executed)
-                     Filter: (a = $0)
-   ->  Gather (actual rows=N loops=N)
-         Workers Planned: 2
-         Params Evaluated: $1
-         Workers Launched: N
-         InitPlan 2 (returns $1)
-           ->  Result (actual rows=N loops=N)
+ Gather (actual rows=N loops=N)
+   Workers Planned: 2
+   Workers Launched: N
+   ->  Parallel Append (actual rows=N loops=N)
          ->  Parallel Append (actual rows=N loops=N)
-               ->  Seq Scan on listp_12_1 listp_4 (never executed)
+               InitPlan 2 (returns $1)
+                 ->  Result (actual rows=N loops=N)
+               ->  Seq Scan on listp_12_1 listp_1 (never executed)
                      Filter: (a = $1)
-               ->  Parallel Seq Scan on listp_12_2 listp_5 (actual rows=N loops=N)
+               ->  Parallel Seq Scan on listp_12_2 listp_2 (actual rows=N loops=N)
                      Filter: (a = $1)
-(23 rows)
+         ->  Parallel Append (actual rows=N loops=N)
+               InitPlan 1 (returns $0)
+                 ->  Result (actual rows=N loops=N)
+               ->  Seq Scan on listp_12_1 listp_4 (actual rows=N loops=N)
+                     Filter: (a = $0)
+               ->  Parallel Seq Scan on listp_12_2 listp_5 (never executed)
+                     Filter: (a = $0)
+(18 rows)

 drop table listp;
 reset parallel_tuple_cost;

Re: Allowing parallel-safe initplans

From

Robert Haas

Date:

12 April 2023, 18:06:05

On Wed, Apr 12, 2023 at 12:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Pursuant to the discussion at [1], here's a patch that removes our
> old restriction that a plan node having initPlans can't be marked
> parallel-safe (dating to commit ab77a5a45).  That was really a special
> case of the fact that we couldn't transmit subplans to parallel
> workers at all.  We fixed that in commit 5e6d8d2bb and follow-ons,
> but this case never got addressed.

Nice.

> Along the way, this also takes care of some sloppiness about updating
> path costs to match when we move initplans from one place to another
> during createplan.c and setrefs.c.  Since all the planning decisions are
> already made by that point, this is just cosmetic; but it seems good
> to keep EXPLAIN output consistent with where the initplans are.

OK. It would be nicer if we had a more principled approach here, but
that's a job for another day.

> There's only one existing test case that visibly changes plan with
> these changes.  The new plan is clearly saner-looking than before,
> and testing with some data loaded into the table confirms that it
> is faster.  I'm not sure if it's worth devising more test cases.

It seems like it would be nice to see one or two additional scenarios
where these changes bring a benefit, with different kinds of plan
shapes.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Allowing parallel-safe initplans

From

Richard Guo

Date:

13 April 2023, 08:23:06

On Thu, Apr 13, 2023 at 12:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pursuant to the discussion at [1], here's a patch that removes our
old restriction that a plan node having initPlans can't be marked
parallel-safe (dating to commit ab77a5a45). That was really a special
case of the fact that we couldn't transmit subplans to parallel
workers at all. We fixed that in commit 5e6d8d2bb and follow-ons,
but this case never got addressed.

The patch looks good to me. Some comments from me:

* For the diff in standard_planner, I was wondering why not move the
initPlans up to the Gather node, just as we did before. So I tried that
way but did not notice the breakage of regression tests as stated in the
comments. Would you please confirm that?

* Not related to this patch. In SS_make_initplan_from_plan, the comment
says that the node's parParam and args lists remain empty. I wonder if
we need to explicitly set node->parParam and node->args to NIL before
that comment, or can we depend on makeNode to initialize them to NIL?

There's only one existing test case that visibly changes plan with
these changes. The new plan is clearly saner-looking than before,
and testing with some data loaded into the table confirms that it
is faster. I'm not sure if it's worth devising more test cases.

I also think it's better to have more test cases covering this change.

Thanks
Richard

Re: Allowing parallel-safe initplans

From

Tom Lane

Date:

13 April 2023, 14:00:51

Richard Guo <guofenglinux@gmail.com> writes:
> * For the diff in standard_planner, I was wondering why not move the
> initPlans up to the Gather node, just as we did before.  So I tried that
> way but did not notice the breakage of regression tests as stated in the
> comments.  Would you please confirm that?

Try it with debug_parallel_query = regress.

> * Not related to this patch.  In SS_make_initplan_from_plan, the comment
> says that the node's parParam and args lists remain empty.  I wonder if
> we need to explicitly set node->parParam and node->args to NIL before
> that comment, or can we depend on makeNode to initialize them to NIL?

I'm generally a fan of explicitly initializing fields, but the basic
argument for that is greppability.  That comment serves the purpose,
so I don't feel a big need to change it.

            regards, tom lane

Re: Allowing parallel-safe initplans

From

Richard Guo

Date:

17 April 2023, 02:57:01

On Thu, Apr 13, 2023 at 10:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:
> * For the diff in standard_planner, I was wondering why not move the
> initPlans up to the Gather node, just as we did before. So I tried that
> way but did not notice the breakage of regression tests as stated in the
> comments. Would you please confirm that?

Try it with debug_parallel_query = regress.

Ah, I see. With DEBUG_PARALLEL_REGRESS the initPlans that move to the
Gather would become invisible along with the Gather node.

As I tried this, I found that the breakage caused by moving the
initPlans to the Gather node might be more than just being cosmetic.
Sometimes it may cause wrong results. As an example, consider

create table a (i int, j int);
insert into a values (1, 1);
create index on a(i, j);

set enable_seqscan to off;
set debug_parallel_query to on;

# select min(i) from a;
min
-----
0
(1 row)

As we can see, the result is not correct. And the plan looks like

# explain (verbose, costs off) select min(i) from a;
QUERY PLAN
-----------------------------------------------------------
Gather
Output: ($0)
Workers Planned: 1
Single Copy: true
InitPlan 1 (returns $0)
-> Limit
Output: a.i
-> Index Only Scan using a_i_j_idx on public.a
Output: a.i
Index Cond: (a.i IS NOT NULL)
-> Result
Output: $0
(12 rows)

The initPlan has been moved from the Result node to the Gather node. As
a result, when doing tuple projection for the Result node, we'd get a
ParamExecData entry with NULL execPlan. So the initPlan does not get
chance to be executed. And we'd get the output as the default value
from the ParamExecData entry, which is zero as shown.

So now I begin to wonder if this wrong result issue is possible to exist
in other places where we move initPlans. But I haven't tried hard to
verify that.

Thanks
Richard

Re: Allowing parallel-safe initplans

From

Richard Guo

Date:

17 April 2023, 07:26:46

On Mon, Apr 17, 2023 at 10:57 AM Richard Guo <guofenglinux@gmail.com> wrote:

The initPlan has been moved from the Result node to the Gather node. As
a result, when doing tuple projection for the Result node, we'd get a
ParamExecData entry with NULL execPlan. So the initPlan does not get
chance to be executed. And we'd get the output as the default value
from the ParamExecData entry, which is zero as shown.

So now I begin to wonder if this wrong result issue is possible to exist
in other places where we move initPlans. But I haven't tried hard to
verify that.

I looked further into this issue and I believe other places are good.
The problem with this query is that the es/ecxt_param_exec_vals used to
store info about the initplan is not the same one as in the Result
node's expression context for projection, because we've forked a new
process for the parallel worker and then created and initialized a new
EState node, and allocated a new es_param_exec_vals array for the new
EState. When doing projection for the Result node, the current code
just goes ahead and accesses the new es_param_exec_vals, thus fails to
retrieve the info about the initplan. Hmm, I doubt this is sensible.

So now it seems that the breakage of regression tests is more severe
than being cosmetic. I wonder if we need to update the comments to
indicate the potential wrong results issue if we move the initPlans to
the Gather node.

Thanks
Richard

Re: Allowing parallel-safe initplans

From

Tom Lane

Date:

17 April 2023, 15:04:33

Richard Guo <guofenglinux@gmail.com> writes:
> So now it seems that the breakage of regression tests is more severe
> than being cosmetic.  I wonder if we need to update the comments to
> indicate the potential wrong results issue if we move the initPlans to
> the Gather node.

I wondered about that too, but how come neither of us saw non-cosmetic
failures (ie, actual query output changes not just EXPLAIN changes)
when we tried this?  Maybe the case is somehow not exercised, but if
so I'm more worried about adding regression tests than comments.

I think actually that it does work beyond the EXPLAIN weirdness,
because since e89a71fb4 the Gather machinery knows how to transmit
the values of Params listed in Gather.initParam to workers, and that
is filled in setrefs.c in a way that looks like it'd work regardless
of whether the Gather appeared organically or was stuck on by the
debug_parallel_query hackery.  I've not tried to verify that
directly though.

            regards, tom lane

Re: Allowing parallel-safe initplans

From

Richard Guo

Date:

18 April 2023, 07:14:01

On Mon, Apr 17, 2023 at 11:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:
> So now it seems that the breakage of regression tests is more severe
> than being cosmetic. I wonder if we need to update the comments to
> indicate the potential wrong results issue if we move the initPlans to
> the Gather node.

I wondered about that too, but how come neither of us saw non-cosmetic
failures (ie, actual query output changes not just EXPLAIN changes)
when we tried this? Maybe the case is somehow not exercised, but if
so I'm more worried about adding regression tests than comments.

Sorry I forgot to mention that I did see query output changes after
moving the initPlans to the Gather node. First of all let me make sure
I was doing it in the right way. On the base of the patch, I was using
the diff as below

if (debug_parallel_query != DEBUG_PARALLEL_OFF &&
- top_plan->parallel_safe && top_plan->initPlan == NIL)
+ top_plan->parallel_safe)
{
Gather *gather = makeNode(Gather);

+ gather->plan.initPlan = top_plan->initPlan;
+ top_plan->initPlan = NIL;
+
gather->plan.targetlist = top_plan->targetlist;

And then I changed the default value of debug_parallel_query to
DEBUG_PARALLEL_REGRESS. And then I just ran 'make installcheck' and saw
the query output changes.

I think actually that it does work beyond the EXPLAIN weirdness,
because since e89a71fb4 the Gather machinery knows how to transmit
the values of Params listed in Gather.initParam to workers, and that
is filled in setrefs.c in a way that looks like it'd work regardless
of whether the Gather appeared organically or was stuck on by the
debug_parallel_query hackery. I've not tried to verify that
directly though.

It seems that in this case the top_plan does not have any extParam, so
the Gather node that is added atop the top_plan does not have a chance
to get its initParam filled in set_param_references().

Thanks
Richard

Re: Allowing parallel-safe initplans

From

Tom Lane

Date:

18 April 2023, 13:33:57

Richard Guo <guofenglinux@gmail.com> writes:
> On Mon, Apr 17, 2023 at 11:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I wondered about that too, but how come neither of us saw non-cosmetic
>> failures (ie, actual query output changes not just EXPLAIN changes)
>> when we tried this?

> Sorry I forgot to mention that I did see query output changes after
> moving the initPlans to the Gather node.

Hmm, my memory was just of seeing the EXPLAIN output changes, but
maybe those got my attention to the extent of missing the others.

> It seems that in this case the top_plan does not have any extParam, so
> the Gather node that is added atop the top_plan does not have a chance
> to get its initParam filled in set_param_references().

Oh, so maybe we'd need to copy up extParam as well?  But it's largely
moot, since I don't see a good way to avoid breaking the EXPLAIN
output.

            regards, tom lane

Re: Allowing parallel-safe initplans

From

Richard Guo

Date:

19 April 2023, 02:42:08

On Tue, Apr 18, 2023 at 9:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:
> It seems that in this case the top_plan does not have any extParam, so
> the Gather node that is added atop the top_plan does not have a chance
> to get its initParam filled in set_param_references().

Oh, so maybe we'd need to copy up extParam as well? But it's largely
moot, since I don't see a good way to avoid breaking the EXPLAIN
output.

Yeah, seems breaking the EXPLAIN output is inevitable if we move the
initPlans to the Gather node. So maybe we need to keep the logic as in
v1 patch, i.e. avoid adding a Gather node when top_plan has initPlans.
If we do so, I wonder if we need to explain the potential wrong results
issue in the comments.

Thanks
Richard

Re: Allowing parallel-safe initplans

From

Tom Lane

Date:

10 July 2023, 21:36:43

I wrote:
> Richard Guo <guofenglinux@gmail.com> writes:
>> On Mon, Apr 17, 2023 at 11:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I wondered about that too, but how come neither of us saw non-cosmetic
>>> failures (ie, actual query output changes not just EXPLAIN changes)
>>> when we tried this?

>> Sorry I forgot to mention that I did see query output changes after
>> moving the initPlans to the Gather node.

> Hmm, my memory was just of seeing the EXPLAIN output changes, but
> maybe those got my attention to the extent of missing the others.

I got around to trying this, and you are right, there are some wrong
query answers as well as EXPLAIN output changes.  This mystified me
for awhile, because it sure looks like e89a71fb4 should have made it
work.

>> It seems that in this case the top_plan does not have any extParam, so
>> the Gather node that is added atop the top_plan does not have a chance
>> to get its initParam filled in set_param_references().

Eventually I noticed that all the failing cases were instances of
optimizing MIN()/MAX() aggregates into indexscans, and then I figured
out what the problem is: we substitute Params for the optimized-away
Aggref nodes in setrefs.c, *after* SS_finalize_plan has been run.
That means we fail to account for those Params in extParam/allParam
sets.  We've gotten away with that up to now because such Params
could only appear where Aggrefs can appear, which is only in top-level
(above scans and joins) nodes, which generally don't have any of the
sorts of rescan optimizations that extParam/allParam bits control.
But this patch results in needing to have a correct extParam set for
the node just below Gather, and we don't.  I am not sure whether there
are any reachable bugs without this patch; but there might be, or some
future optimization might introduce one.

It seems like the cleanest fix for this is to replace such optimized
Aggrefs in a separate tree scan before running SS_finalize_plan.
That's fairly annoying from a planner-runtime standpoint, although
we could skip the extra pass in the typical case where no minmax aggs
have been optimized.

I also thought about swapping the order of operations so that we
run SS_finalize_plan after setrefs.c.  That falls down because of
set_param_references itself, which requires those bits to be
calculated already.  But maybe we could integrate that computation
into SS_finalize_plan instead?  There's certainly nothing very
pretty about the way it's done now.

A band-aid fix that seemed to work is to have set_param_references
consult the Gather's own allParam set instead of the extParam set
of its child.  That feels like a kluge though, and it would not
help matters for any future bug involving another usage of those
bitmapsets.

BTW, there is another way in which setrefs.c can inject PARAM_EXEC
Params: it can translate PARAM_MULTIEXPR Params into those.  So
those won't be accounted for either.  I think this is probably
not a problem, especially not after 87f3667ec got us out of the
business of treating those like initPlan outputs.  But it does
seem like "you can't inject PARAM_EXEC Params during setrefs.c"
would not be a workable coding rule; it's too tempting to do
exactly that.

So at this point my inclination is to try to move SS_finalize_plan
to run after setrefs.c, but I've not written any code yet.  I'm
not sure if we'd need to back-patch that, but it at least seems
like important future-proofing.

None of this would lead me to want to move initPlans to
Gather nodes injected by debug_parallel_query, though.
We'd have to kluge something to keep the EXPLAIN output
looking the same, and that seems like a kluge too many.
What I am wondering is if the issue is reachable for
Gather nodes that are built organically by the regular
planner paths.  It seems like that might be the case,
either now or after applying this patch.

            regards, tom lane

Re: Allowing parallel-safe initplans

From

Tom Lane

Date:

13 July 2023, 21:44:19

I wrote:
> Eventually I noticed that all the failing cases were instances of
> optimizing MIN()/MAX() aggregates into indexscans, and then I figured
> out what the problem is: we substitute Params for the optimized-away
> Aggref nodes in setrefs.c, *after* SS_finalize_plan has been run.
> That means we fail to account for those Params in extParam/allParam
> sets.  We've gotten away with that up to now because such Params
> could only appear where Aggrefs can appear, which is only in top-level
> (above scans and joins) nodes, which generally don't have any of the
> sorts of rescan optimizations that extParam/allParam bits control.
> But this patch results in needing to have a correct extParam set for
> the node just below Gather, and we don't.  I am not sure whether there
> are any reachable bugs without this patch; but there might be, or some
> future optimization might introduce one.

> It seems like the cleanest fix for this is to replace such optimized
> Aggrefs in a separate tree scan before running SS_finalize_plan.
> That's fairly annoying from a planner-runtime standpoint, although
> we could skip the extra pass in the typical case where no minmax aggs
> have been optimized.
> I also thought about swapping the order of operations so that we
> run SS_finalize_plan after setrefs.c.  That falls down because of
> set_param_references itself, which requires those bits to be
> calculated already.  But maybe we could integrate that computation
> into SS_finalize_plan instead?  There's certainly nothing very
> pretty about the way it's done now.

I tried both of those and concluded they'd be too messy for a patch
that we might find ourselves having to back-patch.  So 0001 attached
fixes it by teaching SS_finalize_plan to treat optimized MIN()/MAX()
aggregates as if they were already Params.  It's slightly annoying
to have knowledge of that optimization metastasizing into another
place, but the alternatives are even less palatable.

Having done that, if you adjust 0002 to inject Gathers even when
debug_parallel_query = regress, the only diffs in the core regression
tests are that some initPlans disappear from EXPLAIN output.  The
outputs of the actual queries are still correct, demonstrating that
e89a71fb4 does indeed make it work as long as the param bitmapsets
are correct.

I'm still resistant to the idea of kluging EXPLAIN to the extent
of hiding the EXPLAIN output changes.  It wouldn't be that hard
to do really, but I worry that such a kluge might hide real problems
in future.  So what I did in 0002 was to allow initPlans for an
injected Gather only if debug_parallel_query = on, so that there
will be a place for EXPLAIN to show them.  Other than the changes
in that area, 0002 is the same as the previous patch.

            regards, tom lane

From c0ae4618308f89de57b4a7d744cf2f734555d4a2 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 13 Jul 2023 16:50:13 -0400
Subject: [PATCH v2 1/2] Account for optimized MinMax aggregates during
 SS_finalize_plan.

We are capable of optimizing MIN() and MAX() aggregates on indexed
columns into subqueries that exploit the index, rather than the normal
thing of scanning the whole table.  When we do this, we replace the
Aggref node(s) with Params referencing subquery outputs.  Such Params
really ought to be included in the per-plan-node extParam/allParam
sets computed by SS_finalize_plan.  However, we've never done so
up to now because of an ancient implementation choice to perform
that substitution during set_plan_references, which runs after
SS_finalize_plan, so that SS_finalize_plan never sees these Params.

This seems like clearly a bug, yet there have been no field reports
of problems that could trace to it.  This may be because the types
of Plan nodes that could contain Aggrefs do not have any of the
rescan optimizations that are controlled by extParam/allParam.
Nonetheless it seems certain to bite us someday, so let's fix it
in a self-contained patch that can be back-patched if we find a
case in which there's a live bug pre-v17.

The cleanest fix would be to perform a separate tree walk to do
these substitutions before SS_finalize_plan runs.  That seems
unattractive, first because a whole-tree mutation pass is expensive,
and second because we lack infrastructure for visiting expression
subtrees in a Plan tree, so that we'd need a new function knowing
as much as SS_finalize_plan knows about that.  I also considered
swapping the order of SS_finalize_plan and set_plan_references,
but that fell foul of various assumptions that seem tricky to fix.
So the approach adopted here is to teach SS_finalize_plan itself
to check for such Aggrefs.  I refactored things a bit in setrefs.c
to avoid having three copies of the code that does that.

Given the lack of any currently-known bug, no test case here.

Discussion: https://postgr.es/m/2391880.1689025003@sss.pgh.pa.us
---
 src/backend/optimizer/plan/setrefs.c   | 68 ++++++++++++++++----------
 src/backend/optimizer/plan/subselect.c | 25 ++++++++--
 src/include/optimizer/planmain.h       |  2 +
 3 files changed, 66 insertions(+), 29 deletions(-)

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index c63758cb2b..16e5537f7f 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2181,22 +2181,14 @@ fix_scan_expr_mutator(Node *node, fix_scan_expr_context *context)
     if (IsA(node, Aggref))
     {
         Aggref       *aggref = (Aggref *) node;
+        Param       *aggparam;

         /* See if the Aggref should be replaced by a Param */
-        if (context->root->minmax_aggs != NIL &&
-            list_length(aggref->args) == 1)
+        aggparam = find_minmax_agg_replacement_param(context->root, aggref);
+        if (aggparam != NULL)
         {
-            TargetEntry *curTarget = (TargetEntry *) linitial(aggref->args);
-            ListCell   *lc;
-
-            foreach(lc, context->root->minmax_aggs)
-            {
-                MinMaxAggInfo *mminfo = (MinMaxAggInfo *) lfirst(lc);
-
-                if (mminfo->aggfnoid == aggref->aggfnoid &&
-                    equal(mminfo->target, curTarget->expr))
-                    return (Node *) copyObject(mminfo->param);
-            }
+            /* Make a copy of the Param for paranoia's sake */
+            return (Node *) copyObject(aggparam);
         }
         /* If no match, just fall through to process it normally */
     }
@@ -3225,22 +3217,14 @@ fix_upper_expr_mutator(Node *node, fix_upper_expr_context *context)
     if (IsA(node, Aggref))
     {
         Aggref       *aggref = (Aggref *) node;
+        Param       *aggparam;

         /* See if the Aggref should be replaced by a Param */
-        if (context->root->minmax_aggs != NIL &&
-            list_length(aggref->args) == 1)
+        aggparam = find_minmax_agg_replacement_param(context->root, aggref);
+        if (aggparam != NULL)
         {
-            TargetEntry *curTarget = (TargetEntry *) linitial(aggref->args);
-            ListCell   *lc;
-
-            foreach(lc, context->root->minmax_aggs)
-            {
-                MinMaxAggInfo *mminfo = (MinMaxAggInfo *) lfirst(lc);
-
-                if (mminfo->aggfnoid == aggref->aggfnoid &&
-                    equal(mminfo->target, curTarget->expr))
-                    return (Node *) copyObject(mminfo->param);
-            }
+            /* Make a copy of the Param for paranoia's sake */
+            return (Node *) copyObject(aggparam);
         }
         /* If no match, just fall through to process it normally */
     }
@@ -3395,6 +3379,38 @@ set_windowagg_runcondition_references(PlannerInfo *root,
     return newlist;
 }

+/*
+ * find_minmax_agg_replacement_param
+ *        If the given Aggref is one that we are optimizing into a subquery
+ *        (cf. planagg.c), then return the Param that should replace it.
+ *        Else return NULL.
+ *
+ * This is exported so that SS_finalize_plan can use it before setrefs.c runs.
+ * Note that it will not find anything until we have built a Plan from a
+ * MinMaxAggPath, as root->minmax_aggs will never be filled otherwise.
+ */
+Param *
+find_minmax_agg_replacement_param(PlannerInfo *root, Aggref *aggref)
+{
+    if (root->minmax_aggs != NIL &&
+        list_length(aggref->args) == 1)
+    {
+        TargetEntry *curTarget = (TargetEntry *) linitial(aggref->args);
+        ListCell   *lc;
+
+        foreach(lc, root->minmax_aggs)
+        {
+            MinMaxAggInfo *mminfo = (MinMaxAggInfo *) lfirst(lc);
+
+            if (mminfo->aggfnoid == aggref->aggfnoid &&
+                equal(mminfo->target, curTarget->expr))
+                return mminfo->param;
+        }
+    }
+    return NULL;
+}
+
+
 /*****************************************************************************
  *                    QUERY DEPENDENCY MANAGEMENT
  *****************************************************************************/
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index da2d8abe01..250ba8edcb 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2835,8 +2835,8 @@ finalize_plan(PlannerInfo *root, Plan *plan,
 }

 /*
- * finalize_primnode: add IDs of all PARAM_EXEC params appearing in the given
- * expression tree to the result set.
+ * finalize_primnode: add IDs of all PARAM_EXEC params that appear (or will
+ * appear) in the given expression tree to the result set.
  */
 static bool
 finalize_primnode(Node *node, finalize_primnode_context *context)
@@ -2853,7 +2853,26 @@ finalize_primnode(Node *node, finalize_primnode_context *context)
         }
         return false;            /* no more to do here */
     }
-    if (IsA(node, SubPlan))
+    else if (IsA(node, Aggref))
+    {
+        /*
+         * Check to see if the aggregate will be replaced by a Param
+         * referencing a subquery output during setrefs.c.  If so, we must
+         * account for that Param here.  (For various reasons, it's not
+         * convenient to perform that substitution earlier than setrefs.c, nor
+         * to perform this processing after setrefs.c.  Thus we need a wart
+         * here.)
+         */
+        Aggref       *aggref = (Aggref *) node;
+        Param       *aggparam;
+
+        aggparam = find_minmax_agg_replacement_param(context->root, aggref);
+        if (aggparam != NULL)
+            context->paramids = bms_add_member(context->paramids,
+                                               aggparam->paramid);
+        /* Fall through to examine the agg's arguments */
+    }
+    else if (IsA(node, SubPlan))
     {
         SubPlan    *subplan = (SubPlan *) node;
         Plan       *plan = planner_subplan_get_plan(context->root, subplan);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 5fc900737d..31c188176b 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -110,6 +110,8 @@ extern bool innerrel_is_unique(PlannerInfo *root,
  */
 extern Plan *set_plan_references(PlannerInfo *root, Plan *plan);
 extern bool trivial_subqueryscan(SubqueryScan *plan);
+extern Param *find_minmax_agg_replacement_param(PlannerInfo *root,
+                                                Aggref *aggref);
 extern void record_plan_function_dependency(PlannerInfo *root, Oid funcid);
 extern void record_plan_type_dependency(PlannerInfo *root, Oid typid);
 extern bool extract_query_dependencies_walker(Node *node, PlannerInfo *context);
--
2.39.3

From fa74d0a664e56aee98b9737779d3768deb2cfb7f Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 13 Jul 2023 17:30:14 -0400
Subject: [PATCH v2 2/2] Allow plan nodes with initPlans to be considered
 parallel-safe.

If the plan itself is parallel-safe, and the initPlans are too,
there's no reason anymore to prevent the plan from being marked
parallel-safe.  That restriction (dating to commit ab77a5a45) was
really a special case of the fact that we couldn't transmit subplans
to parallel workers at all.  We fixed that in commit 5e6d8d2bb and
follow-ons, but this case never got addressed.

We still forbid attaching initPlans to a Gather node that's
inserted pursuant to debug_parallel_query = regress.  That's because,
when we hide the Gather from EXPLAIN output, we'd hide the initPlans
too, causing cosmetic regression diffs.  It seems inadvisable to
kluge EXPLAIN to the extent required to make the output look the
same, so just don't do it in that case.

Along the way, this also takes care of some sloppiness about updating
path costs to match when we move initplans from one place to another
during createplan.c and setrefs.c.  Since all the planning decisions
are already made by that point, this is just cosmetic; but it seems
good to keep EXPLAIN output consistent with where the initplans are.

The diff in query_planner() might be worth remarking on.  I found that
one because after fixing things to allow parallel-safe initplans, one
partition_prune test case changed plans (as shown in the patch) ---
but only when debug_parallel_query was active.  The reason proved to
be that we only bothered to mark Result nodes as potentially
parallel-safe when debug_parallel_query is on.  This neglects the fact
that parallel-safety may be of interest for a sub-query even though
the Result itself doesn't parallelize.

Discussion: https://postgr.es/m/1129530.1681317832@sss.pgh.pa.us
---
 src/backend/optimizer/plan/createplan.c       | 15 +++-
 src/backend/optimizer/plan/planmain.c         | 15 ++--
 src/backend/optimizer/plan/planner.c          | 36 ++++++--
 src/backend/optimizer/plan/setrefs.c          | 32 ++++---
 src/backend/optimizer/plan/subselect.c        | 84 ++++++++++++++-----
 src/backend/optimizer/util/pathnode.c         | 18 +++-
 src/include/optimizer/subselect.h             |  3 +
 src/test/regress/expected/partition_prune.out | 37 ++++----
 8 files changed, 169 insertions(+), 71 deletions(-)

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ec73789bc2..af48109058 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -6481,6 +6481,8 @@ materialize_finished_plan(Plan *subplan)
 {
     Plan       *matplan;
     Path        matpath;        /* dummy for result of cost_material */
+    Cost        initplan_cost;
+    bool        unsafe_initplans;

     matplan = (Plan *) make_material(subplan);

@@ -6488,20 +6490,25 @@ materialize_finished_plan(Plan *subplan)
      * XXX horrid kluge: if there are any initPlans attached to the subplan,
      * move them up to the Material node, which is now effectively the top
      * plan node in its query level.  This prevents failure in
-     * SS_finalize_plan(), which see for comments.  We don't bother adjusting
-     * the subplan's cost estimate for this.
+     * SS_finalize_plan(), which see for comments.
      */
     matplan->initPlan = subplan->initPlan;
     subplan->initPlan = NIL;

+    /* Move the initplans' cost delta, as well */
+    SS_compute_initplan_cost(matplan->initPlan,
+                             &initplan_cost, &unsafe_initplans);
+    subplan->startup_cost -= initplan_cost;
+    subplan->total_cost -= initplan_cost;
+
     /* Set cost data */
     cost_material(&matpath,
                   subplan->startup_cost,
                   subplan->total_cost,
                   subplan->plan_rows,
                   subplan->plan_width);
-    matplan->startup_cost = matpath.startup_cost;
-    matplan->total_cost = matpath.total_cost;
+    matplan->startup_cost = matpath.startup_cost + initplan_cost;
+    matplan->total_cost = matpath.total_cost + initplan_cost;
     matplan->plan_rows = subplan->plan_rows;
     matplan->plan_width = subplan->plan_width;
     matplan->parallel_aware = false;
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 7afd434c60..fcc0eacd25 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -112,14 +112,17 @@ query_planner(PlannerInfo *root,
                  * quals are parallel-restricted.  (We need not check
                  * final_rel->reltarget because it's empty at this point.
                  * Anything parallel-restricted in the query tlist will be
-                 * dealt with later.)  This is normally pretty silly, because
-                 * a Result-only plan would never be interesting to
-                 * parallelize.  However, if debug_parallel_query is on, then
-                 * we want to execute the Result in a parallel worker if
-                 * possible, so we must do this.
+                 * dealt with later.)  We should always do this in a subquery,
+                 * since it might be useful to use the subquery in parallel
+                 * paths in the parent level.  At top level this is normally
+                 * not worth the cycles, because a Result-only plan would
+                 * never be interesting to parallelize.  However, if
+                 * debug_parallel_query is on, then we want to execute the
+                 * Result in a parallel worker if possible, so we must check.
                  */
                 if (root->glob->parallelModeOK &&
-                    debug_parallel_query != DEBUG_PARALLEL_OFF)
+                    (root->query_level > 1 ||
+                     debug_parallel_query != DEBUG_PARALLEL_OFF))
                     final_rel->consider_parallel =
                         is_parallel_safe(root, parse->jointree->quals);

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 0e12fdeb60..44efb1f4eb 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -432,16 +432,23 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
     /*
      * Optionally add a Gather node for testing purposes, provided this is
      * actually a safe thing to do.
-     */
-    if (debug_parallel_query != DEBUG_PARALLEL_OFF && top_plan->parallel_safe)
+     *
+     * We can add Gather even when top_plan has parallel-safe initPlans, but
+     * then we have to move the initPlans to the Gather node because of
+     * SS_finalize_plan's limitations.  That would cause cosmetic breakage of
+     * regression tests when debug_parallel_query = regress, because initPlans
+     * that would normally appear on the top_plan move to the Gather, causing
+     * them to disappear from EXPLAIN output.  That doesn't seem worth kluging
+     * EXPLAIN to hide, so skip it when debug_parallel_query = regress.
+     */
+    if (debug_parallel_query != DEBUG_PARALLEL_OFF &&
+        top_plan->parallel_safe &&
+        (top_plan->initPlan == NIL ||
+         debug_parallel_query != DEBUG_PARALLEL_REGRESS))
     {
         Gather       *gather = makeNode(Gather);
-
-        /*
-         * Top plan must not have any initPlans, else it shouldn't have been
-         * marked parallel-safe.
-         */
-        Assert(top_plan->initPlan == NIL);
+        Cost        initplan_cost;
+        bool        unsafe_initplans;

         gather->plan.targetlist = top_plan->targetlist;
         gather->plan.qual = NIL;
@@ -451,6 +458,10 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
         gather->single_copy = true;
         gather->invisible = (debug_parallel_query == DEBUG_PARALLEL_REGRESS);

+        /* Transfer any initPlans to the new top node */
+        gather->plan.initPlan = top_plan->initPlan;
+        top_plan->initPlan = NIL;
+
         /*
          * Since this Gather has no parallel-aware descendants to signal to,
          * we don't need a rescan Param.
@@ -470,6 +481,15 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
         gather->plan.parallel_aware = false;
         gather->plan.parallel_safe = false;

+        /*
+         * Delete the initplans' cost from top_plan.  We needn't add it to the
+         * Gather node, since the above coding already included it there.
+         */
+        SS_compute_initplan_cost(gather->plan.initPlan,
+                                 &initplan_cost, &unsafe_initplans);
+        top_plan->startup_cost -= initplan_cost;
+        top_plan->total_cost -= initplan_cost;
+
         /* use parallel mode for parallel plans. */
         root->glob->parallelModeNeeded = true;

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 16e5537f7f..97fa561e4e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -23,6 +23,7 @@
 #include "optimizer/pathnode.h"
 #include "optimizer/planmain.h"
 #include "optimizer/planner.h"
+#include "optimizer/subselect.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_relation.h"
 #include "tcop/utility.h"
@@ -1519,19 +1520,30 @@ clean_up_removed_plan_level(Plan *parent, Plan *child)
 {
     /*
      * We have to be sure we don't lose any initplans, so move any that were
-     * attached to the parent plan to the child.  If we do move any, the child
-     * is no longer parallel-safe.
+     * attached to the parent plan to the child.  If any are parallel-unsafe,
+     * the child is no longer parallel-safe.  As a cosmetic matter, also add
+     * the initplans' run costs to the child's costs.
      */
     if (parent->initPlan)
-        child->parallel_safe = false;
+    {
+        Cost        initplan_cost;
+        bool        unsafe_initplans;

-    /*
-     * Attach plans this way so that parent's initplans are processed before
-     * any pre-existing initplans of the child.  Probably doesn't matter, but
-     * let's preserve the ordering just in case.
-     */
-    child->initPlan = list_concat(parent->initPlan,
-                                  child->initPlan);
+        SS_compute_initplan_cost(parent->initPlan,
+                                 &initplan_cost, &unsafe_initplans);
+        child->startup_cost += initplan_cost;
+        child->total_cost += initplan_cost;
+        if (unsafe_initplans)
+            child->parallel_safe = false;
+
+        /*
+         * Attach plans this way so that parent's initplans are processed
+         * before any pre-existing initplans of the child.  Probably doesn't
+         * matter, but let's preserve the ordering just in case.
+         */
+        child->initPlan = list_concat(parent->initPlan,
+                                      child->initPlan);
+    }

     /*
      * We also have to transfer the parent's column labeling info into the
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 250ba8edcb..7a9fe88fec 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -1016,8 +1016,7 @@ SS_process_ctes(PlannerInfo *root)

         /*
          * CTE scans are not considered for parallelism (cf
-         * set_rel_consider_parallel), and even if they were, initPlans aren't
-         * parallel-safe.
+         * set_rel_consider_parallel).
          */
         splan->parallel_safe = false;
         splan->setParam = NIL;
@@ -2120,8 +2119,8 @@ SS_identify_outer_params(PlannerInfo *root)
  * If any initPlans have been created in the current query level, they will
  * get attached to the Plan tree created from whichever Path we select from
  * the given rel.  Increment all that rel's Paths' costs to account for them,
- * and make sure the paths get marked as parallel-unsafe, since we can't
- * currently transmit initPlans to parallel workers.
+ * and if any of the initPlans are parallel-unsafe, mark all the rel's Paths
+ * parallel-unsafe as well.
  *
  * This is separate from SS_attach_initplans because we might conditionally
  * create more initPlans during create_plan(), depending on which Path we
@@ -2132,6 +2131,7 @@ void
 SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)
 {
     Cost        initplan_cost;
+    bool        unsafe_initplans;
     ListCell   *lc;

     /* Nothing to do if no initPlans */
@@ -2140,17 +2140,10 @@ SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)

     /*
      * Compute the cost increment just once, since it will be the same for all
-     * Paths.  We assume each initPlan gets run once during top plan startup.
-     * This is a conservative overestimate, since in fact an initPlan might be
-     * executed later than plan startup, or even not at all.
+     * Paths.  Also check for parallel-unsafe initPlans.
      */
-    initplan_cost = 0;
-    foreach(lc, root->init_plans)
-    {
-        SubPlan    *initsubplan = (SubPlan *) lfirst(lc);
-
-        initplan_cost += initsubplan->startup_cost + initsubplan->per_call_cost;
-    }
+    SS_compute_initplan_cost(root->init_plans,
+                             &initplan_cost, &unsafe_initplans);

     /*
      * Now adjust the costs and parallel_safe flags.
@@ -2161,19 +2154,71 @@ SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)

         path->startup_cost += initplan_cost;
         path->total_cost += initplan_cost;
-        path->parallel_safe = false;
+        if (unsafe_initplans)
+            path->parallel_safe = false;
     }

     /*
-     * Forget about any partial paths and clear consider_parallel, too;
-     * they're not usable if we attached an initPlan.
+     * Adjust partial paths' costs too, or forget them entirely if we must
+     * consider the rel parallel-unsafe.
      */
-    final_rel->partial_pathlist = NIL;
-    final_rel->consider_parallel = false;
+    if (unsafe_initplans)
+    {
+        final_rel->partial_pathlist = NIL;
+        final_rel->consider_parallel = false;
+    }
+    else
+    {
+        foreach(lc, final_rel->partial_pathlist)
+        {
+            Path       *path = (Path *) lfirst(lc);
+
+            path->startup_cost += initplan_cost;
+            path->total_cost += initplan_cost;
+        }
+    }

     /* We needn't do set_cheapest() here, caller will do it */
 }

+/*
+ * SS_compute_initplan_cost - count up the cost delta for some initplans
+ *
+ * The total cost returned in *initplan_cost_p should be added to both the
+ * startup and total costs of the plan node the initplans get attached to.
+ * We also report whether any of the initplans are not parallel-safe.
+ *
+ * The primary user of this is SS_charge_for_initplans, but it's also
+ * used in adjusting costs when we move initplans to another plan node.
+ */
+void
+SS_compute_initplan_cost(List *init_plans,
+                         Cost *initplan_cost_p,
+                         bool *unsafe_initplans_p)
+{
+    Cost        initplan_cost;
+    bool        unsafe_initplans;
+    ListCell   *lc;
+
+    /*
+     * We assume each initPlan gets run once during top plan startup.  This is
+     * a conservative overestimate, since in fact an initPlan might be
+     * executed later than plan startup, or even not at all.
+     */
+    initplan_cost = 0;
+    unsafe_initplans = false;
+    foreach(lc, init_plans)
+    {
+        SubPlan    *initsubplan = lfirst_node(SubPlan, lc);
+
+        initplan_cost += initsubplan->startup_cost + initsubplan->per_call_cost;
+        if (!initsubplan->parallel_safe)
+            unsafe_initplans = true;
+    }
+    *initplan_cost_p = initplan_cost;
+    *unsafe_initplans_p = unsafe_initplans;
+}
+
 /*
  * SS_attach_initplans - attach initplans to topmost plan node
  *
@@ -2990,6 +3035,7 @@ SS_make_initplan_from_plan(PlannerInfo *root,
                                node->plan_id, prm->paramid);
     get_first_col_type(plan, &node->firstColType, &node->firstColTypmod,
                        &node->firstColCollation);
+    node->parallel_safe = plan->parallel_safe;
     node->setParam = list_make1_int(prm->paramid);

     root->init_plans = lappend(root->init_plans, node);
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 5f5596841c..f123fcb41e 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -3348,8 +3348,7 @@ create_minmaxagg_path(PlannerInfo *root,
     /* For now, assume we are above any joins, so no parameterization */
     pathnode->path.param_info = NULL;
     pathnode->path.parallel_aware = false;
-    /* A MinMaxAggPath implies use of initplans, so cannot be parallel-safe */
-    pathnode->path.parallel_safe = false;
+    pathnode->path.parallel_safe = true;    /* might change below */
     pathnode->path.parallel_workers = 0;
     /* Result is one unordered row */
     pathnode->path.rows = 1;
@@ -3358,13 +3357,15 @@ create_minmaxagg_path(PlannerInfo *root,
     pathnode->mmaggregates = mmaggregates;
     pathnode->quals = quals;

-    /* Calculate cost of all the initplans ... */
+    /* Calculate cost of all the initplans, and check parallel safety */
     initplan_cost = 0;
     foreach(lc, mmaggregates)
     {
         MinMaxAggInfo *mminfo = (MinMaxAggInfo *) lfirst(lc);

         initplan_cost += mminfo->pathcost;
+        if (!mminfo->path->parallel_safe)
+            pathnode->path.parallel_safe = false;
     }

     /* add tlist eval cost for each output row, plus cpu_tuple_cost */
@@ -3385,6 +3386,17 @@ create_minmaxagg_path(PlannerInfo *root,
         pathnode->path.total_cost += qual_cost.startup + qual_cost.per_tuple;
     }

+    /*
+     * If the initplans were all parallel-safe, also check safety of the
+     * target and quals.  (The Result node itself isn't parallelizable, but if
+     * we are in a subquery then it can be useful for the outer query to know
+     * that this one is parallel-safe.)
+     */
+    if (pathnode->path.parallel_safe)
+        pathnode->path.parallel_safe =
+            is_parallel_safe(root, (Node *) target->exprs) &&
+            is_parallel_safe(root, (Node *) quals);
+
     return pathnode;
 }

diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index c03ffc56bf..44bc0bda7e 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -28,6 +28,9 @@ extern Node *SS_replace_correlation_vars(PlannerInfo *root, Node *expr);
 extern Node *SS_process_sublinks(PlannerInfo *root, Node *expr, bool isQual);
 extern void SS_identify_outer_params(PlannerInfo *root);
 extern void SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel);
+extern void SS_compute_initplan_cost(List *init_plans,
+                                     Cost *initplan_cost_p,
+                                     bool *unsafe_initplans_p);
 extern void SS_attach_initplans(PlannerInfo *root, Plan *plan);
 extern void SS_finalize_plan(PlannerInfo *root, Plan *plan);
 extern Param *SS_make_initplan_output_param(PlannerInfo *root,
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 2abf759385..1eb347503a 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3903,30 +3903,25 @@ select explain_parallel_append(
 select * from listp where a = (select 2);');
                               explain_parallel_append
 -----------------------------------------------------------------------------------
- Append (actual rows=N loops=N)
-   ->  Gather (actual rows=N loops=N)
-         Workers Planned: 2
-         Params Evaluated: $0
-         Workers Launched: N
-         InitPlan 1 (returns $0)
-           ->  Result (actual rows=N loops=N)
-         ->  Parallel Append (actual rows=N loops=N)
-               ->  Seq Scan on listp_12_1 listp_1 (actual rows=N loops=N)
-                     Filter: (a = $0)
-               ->  Parallel Seq Scan on listp_12_2 listp_2 (never executed)
-                     Filter: (a = $0)
-   ->  Gather (actual rows=N loops=N)
-         Workers Planned: 2
-         Params Evaluated: $1
-         Workers Launched: N
-         InitPlan 2 (returns $1)
-           ->  Result (actual rows=N loops=N)
+ Gather (actual rows=N loops=N)
+   Workers Planned: 2
+   Workers Launched: N
+   ->  Parallel Append (actual rows=N loops=N)
          ->  Parallel Append (actual rows=N loops=N)
-               ->  Seq Scan on listp_12_1 listp_4 (never executed)
+               InitPlan 2 (returns $1)
+                 ->  Result (actual rows=N loops=N)
+               ->  Seq Scan on listp_12_1 listp_1 (never executed)
                      Filter: (a = $1)
-               ->  Parallel Seq Scan on listp_12_2 listp_5 (actual rows=N loops=N)
+               ->  Parallel Seq Scan on listp_12_2 listp_2 (actual rows=N loops=N)
                      Filter: (a = $1)
-(23 rows)
+         ->  Parallel Append (actual rows=N loops=N)
+               InitPlan 1 (returns $0)
+                 ->  Result (actual rows=N loops=N)
+               ->  Seq Scan on listp_12_1 listp_4 (actual rows=N loops=N)
+                     Filter: (a = $0)
+               ->  Parallel Seq Scan on listp_12_2 listp_5 (never executed)
+                     Filter: (a = $0)
+(18 rows)

 drop table listp;
 reset parallel_tuple_cost;
--
2.39.3

Re: Allowing parallel-safe initplans

From

Richard Guo

Date:

14 July 2023, 07:35:25

On Fri, Jul 14, 2023 at 5:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I tried both of those and concluded they'd be too messy for a patch
that we might find ourselves having to back-patch. So 0001 attached
fixes it by teaching SS_finalize_plan to treat optimized MIN()/MAX()
aggregates as if they were already Params. It's slightly annoying
to have knowledge of that optimization metastasizing into another
place, but the alternatives are even less palatable.

I tried with 0001 patch and can confirm that the wrong result issue
shown in [1] is fixed.

explain (costs off, verbose) select min(i) from a;
QUERY PLAN
-----------------------------------------------------------
Gather
Output: ($0)
Workers Planned: 1
Params Evaluated: $0 <==== initplan params
Single Copy: true
InitPlan 1 (returns $0)
-> Limit
Output: a.i
-> Index Only Scan using a_i_j_idx on public.a
Output: a.i
Index Cond: (a.i IS NOT NULL)
-> Result
Output: $0
(13 rows)

Now the Gather.initParam is filled and e89a71fb4 does its work to
transmit the Params to workers.

So +1 to 0001 patch.

I'm still resistant to the idea of kluging EXPLAIN to the extent
of hiding the EXPLAIN output changes. It wouldn't be that hard
to do really, but I worry that such a kluge might hide real problems
in future. So what I did in 0002 was to allow initPlans for an
injected Gather only if debug_parallel_query = on, so that there
will be a place for EXPLAIN to show them. Other than the changes
in that area, 0002 is the same as the previous patch.

Also +1 to 0002 patch.

[1] https://www.postgresql.org/message-id/CAMbWs48p-WpnLdR9ZQ4QsHZP_a-P0rktAYo4Z3uOHUAkH3fjQg%40mail.gmail.com

Thanks
Richard

Re: Allowing parallel-safe initplans

From

Tom Lane

Date:

14 July 2023, 15:57:09

Richard Guo <guofenglinux@gmail.com> writes:
> So +1 to 0001 patch.
> Also +1 to 0002 patch.

Pushed, thanks for looking at it!

            regards, tom lane

Re: Allowing parallel-safe initplans

From

Frédéric Yhuel

Date:

02 October 2024, 14:58:45


Le 12/04/2023 à 20:06, Robert Haas a écrit :
>> There's only one existing test case that visibly changes plan with
>> these changes.  The new plan is clearly saner-looking than before,
>> and testing with some data loaded into the table confirms that it
>> is faster.  I'm not sure if it's worth devising more test cases.
> It seems like it would be nice to see one or two additional scenarios
> where these changes bring a benefit, with different kinds of plan
> shapes.

Hi,

Currently working on illustrating some points in the v17 release notes, 
I'm trying to come up with a sexier scenario than the test case, but it 
seems that with a non-trivial InitPlan (2nd explain below), we still 
have a non-parallel Append node at the top:

SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size = 10;

CREATE TABLE foo (a int) PARTITION by LIST(a);

CREATE TABLE foo_0 PARTITION OF foo FOR VALUES IN (0);

CREATE TABLE foo_1 PARTITION OF foo FOR VALUES IN (1);

EXPLAIN (COSTS OFF)
    SELECT * FROM foo WHERE a = (SELECT 2)
    UNION ALL
    SELECT * FROM foo WHERE a = 0;

                      QUERY PLAN
-----------------------------------------------------
  Gather
    Workers Planned: 2
    ->  Parallel Append
          ->  Parallel Append
                InitPlan 1
                  ->  Result
                ->  Parallel Seq Scan on foo_0 foo_1
                      Filter: (a = (InitPlan 1).col1)
                ->  Parallel Seq Scan on foo_1 foo_2
                      Filter: (a = (InitPlan 1).col1)
          ->  Parallel Seq Scan on foo_0 foo_3
                Filter: (a = 0)


EXPLAIN (COSTS OFF)
    SELECT * FROM foo WHERE a = (SELECT max(a) FROM foo)
    UNION ALL
    SELECT * FROM foo WHERE a = 0;

                                QUERY PLAN
------------------------------------------------------------------------
  Append
    ->  Gather
          Workers Planned: 2
          InitPlan 1
            ->  Finalize Aggregate
                  ->  Gather
                        Workers Planned: 2
                        ->  Partial Aggregate
                              ->  Parallel Append
                                    ->  Parallel Seq Scan on foo_0 foo_5
                                    ->  Parallel Seq Scan on foo_1 foo_6
          ->  Parallel Append
                ->  Parallel Seq Scan on foo_0 foo_1
                      Filter: (a = (InitPlan 1).col1)
                ->  Parallel Seq Scan on foo_1 foo_2
                      Filter: (a = (InitPlan 1).col1)
    ->  Gather
          Workers Planned: 1
          ->  Parallel Seq Scan on foo_0 foo_3
                Filter: (a = 0)

Did I miss something?

Best regards,
Frédéric