Re: v12: ERROR: subplan "InitPlan 2 (returns $4)" was not initialized - Mailing list pgsql-hackers

From Tom Lane
Subject Re: v12: ERROR: subplan "InitPlan 2 (returns $4)" was not initialized
Date
Msg-id 507872.1681243148@sss.pgh.pa.us
Whole thread Raw
In response to v12: ERROR: subplan "InitPlan 2 (returns $4)" was not initialized  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: v12: ERROR: subplan "InitPlan 2 (returns $4)" was not initialized
List pgsql-hackers
Justin Pryzby <pryzby@telsasoft.com> writes:
> postgres=# SET force_parallel_mode =1; CREATE TABLE x (i int) PARTITION BY RANGE (i); CREATE TABLE x1 PARTITION OF x
DEFAULT; 
>  select * from pg_class,
>   lateral (select pg_catalog.bit_and(1)
>       from pg_class as sample_1
>       where case when EXISTS (
>             select 1 from x
>               where EXISTS (
>                 select 1 from pg_catalog.pg_depend
>                   where (sample_1.reltuples is NULL)
>                   )) then 1 end
>            is NULL)x
> where false;

Interesting.  The proximate cause is that we end up with a subplan
that is marked parallel_safe, but it has an initplan that is not
parallel_safe.  The parallel worker receives, and tries to initialize,
the parallel_safe subplan, and falls over because of its reference
to the unsafe subplan -- which was not transmitted to the worker.

Actually, because of the policy installed by commit ab77a5a45, the
mere fact of having an initplan should be enough to disqualify
the first subplan from being marked parallel-safe.

I dug around and found the culprit: setrefs.c's
clean_up_removed_plan_level() moves initplans down from a parent
to a child plan node, but it forgot the possibility that the
child plan node had been marked parallel_safe before that and
must not be anymore.

The v1 patch attached is enough to fix the immediate issue,
but there's another thing not to like, which is that we're also
discarding the costs associated with the initplans.  That's
strictly cosmetic given that all the planning decisions are
already made, but it still seems potentially annoying if you're
trying to understand EXPLAIN output.  So I'm inclined to instead
do something like v2 attached, which deals with that as well.
(On the other hand, we aren't bothering to fix up costs when
we move initplans around in materialize_finished_plan or
standard_planner ... so maybe that should be left for a patch
that fixes those things too.)

Another thing worth wondering about is whether we can't loosen
commit ab77a5a45's policy that having an initplan is enough
to make you parallel-unsafe.  In the wake of later fixes,
notably 5e6d8d2bb, it seems like maybe we could allow that
as long as the initplans themselves are parallel-safe.  That
wouldn't be material for back-patching though, so I'll worry
about it later.

Not sure what if anything to do about a test case.  I'm not
excited about memorializing the specific case found by sqlsmith,
because it seems only very accidental that it exposes this
problem.  I found that there are existing regression tests
that exercise the situation where clean_up_removed_plan_level
generates an incorrectly-marked plan, but there is accidentally
no bad effect.  (The planner itself isn't going to be making
any further decisions with the bogus info; it's only
ExecSerializePlan that pays attention to the flag, and we'd only
notice in this specific cross-reference situation.)  Also, any
change we make along the lines speculated about in the previous
para would be highly likely to break a test case, in the sense
that it'd no longer exercise the previously-failing scenario.
So on the whole I'm inclined not to bother with a new test case.

            regards, tom lane

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b56666398e..042036b11b 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1546,6 +1546,10 @@ clean_up_removed_plan_level(Plan *parent, Plan *child)
     child->initPlan = list_concat(parent->initPlan,
                                   child->initPlan);

+    /* If we moved down any initplans, child is no longer parallel-safe */
+    if (child->initPlan)
+        child->parallel_safe = false;
+
     /*
      * We also have to transfer the parent's column labeling info into the
      * child, else columns sent to client will be improperly labeled if this
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b56666398e..9a7c8ea07b 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1542,7 +1542,31 @@ trivial_subqueryscan(SubqueryScan *plan)
 static Plan *
 clean_up_removed_plan_level(Plan *parent, Plan *child)
 {
-    /* We have to be sure we don't lose any initplans */
+    ListCell   *lc;
+
+    /*
+     * We have to be sure we don't lose any initplans, so move any that were
+     * attached to the parent plan to the child.  If we do move any, the child
+     * is no longer parallel-safe.  As a cosmetic matter, also add the
+     * initplans' run costs to the child's costs (compare
+     * SS_charge_for_initplans).
+     */
+    foreach(lc, parent->initPlan)
+    {
+        SubPlan    *initsubplan = (SubPlan *) lfirst(lc);
+        Cost        initplan_cost;
+
+        child->parallel_safe = false;
+        initplan_cost = initsubplan->startup_cost + initsubplan->per_call_cost;
+        child->startup_cost += initplan_cost;
+        child->total_cost += initplan_cost;
+    }
+
+    /*
+     * Attach plans this way so that parent's initplans are processed before
+     * any pre-existing initplans of the child.  Probably doesn't matter, but
+     * let's preserve the ordering just in case.
+     */
     child->initPlan = list_concat(parent->initPlan,
                                   child->initPlan);


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Assertion being hit during WAL replay
Next
From: Tom Lane
Date:
Subject: Re: [PATCH] Use role name "system_user" instead of "user" for unsafe_tests