Re: [sqlsmith] Failed assertion in parallel worker in ExecInitSubPlan - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [sqlsmith] Failed assertion in parallel worker in ExecInitSubPlan
Date
Msg-id 5416.1480109679@sss.pgh.pa.us
Whole thread Raw
In response to Re: [sqlsmith] Failed assertion in parallel worker in ExecInitSubPlan  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Fri, Nov 25, 2016 at 10:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The impression I got in poking at this for a few minutes, before
>> going off to stuff myself with turkey, was that we were allowing
>> a SubqueryScanPath to mark itself as parallel-safe even though the
>> resulting plan node would have an InitPlan attached.  So my thought
>> about fixing it was along the lines of if-subroot-contains-initplans-
>> then-dont-let-SubqueryScanPath-be-parallel-safe.

> I think this will work for the reported case, see the patch attached.

After further thought, it seems to me that the fundamental error here
is that we're not accounting for initPlans while marking paths as
parallel safe or not.  They should be correctly marked when they come
out of subquery_planner, rather than expecting callers to know that
they can't trust that marking.  That way, we don't need to do anything
special in create_subqueryscan_path, and we can also get rid of that
kluge in the force_parallel_mode logic.

> However, don't we need to worry about if there is a subplan
> (non-initplan) instead of initplan.

I don't think so.  References to subplans are already known to be parallel
restricted.  The issue that we're fixing here is that if a plan node has
initPlans attached, ExecutorStart will try to access those subplans,
whether or not they actually get used while running.  That's why the plan
node itself has to be marked parallel-unsafe so it won't get shipped to
parallel workers.

>> There's another issue here, which is that the InitPlan is derived from an
>> outer join qual whose outer join seems to have been deleted entirely by
>> analyzejoins.c.  Up to now it was possible to avert our eyes from the fact
>> that join removal is lazy about getting rid of unused InitPlans, but if
>> they're going to disable parallelization of the surrounding query, maybe
>> we need to work harder on that.

> Yeah, that makes sense, but not sure whether we should try it along
> with this patch.

I'm not certain that sublinks in deletable outer join quals is a case
important enough to sweat over.  But this is the first time I've realized
that failure to remove those initPlans is capable of making a plan worse.
So it's something to keep an eye on.  (I still wish that we could make
join removal happen during join tree preprocessing; doing it where it is
is nothing but a kluge, with unpleasant side effects like this one.)
        regards, tom lane



pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: UNDO and in-place update
Next
From: Fabien COELHO
Date:
Subject: Re: confusing checkpoint_flush_after / bgwriter_flush_after