Re: [sqlsmith] Failed assertion in BecomeLockGroupLeader - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [sqlsmith] Failed assertion in BecomeLockGroupLeader
Msg-id CAA4eK1+Lti8oDExyQpKLgh0=JhUHHd_P4VpEB4SZ2O+x8M8qLA@mail.gmail.com
In response to Re: [sqlsmith] Failed assertion in BecomeLockGroupLeader  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Apr 29, 2016 at 7:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > On Fri, Apr 29, 2016 at 12:01 PM, Andreas Seltenreich <seltenreich@gmx.de>
> > wrote:
> >> tonight's sqlsmith run yielded another core dump:
> >>
> >> TRAP: FailedAssertion("!(MyProc->lockGroupLeader == ((void *)0))", File:
> >> "proc.c", Line: 1787)
> >>
> >> I couldn't identify a query for it though: debug_query_string is empty.
> >> Additionally, the offending query was not reported in the error context
> >> as it typically is for non-parallel executor crashes.
>
> > From the call stack below, it is clear that the core dump occurs because
> > a Gather node is pushed below another Gather node, which makes a worker
> > execute the Gather node.  Currently workers have no support for launching
> > other workers, and ideally such a plan should not be generated.
>
> It might not be intentional.  The bug we identified from Andreas' prior
> report could be causing this: once a GatherPath's subpath has been freed,
> that palloc chunk could be recycled into another GatherPath, or something
> with a GatherPath in its substructure, leading to a plan of that shape.
>

Yes, that's one possibility.
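
For illustration, the plan shape discussed above (a Gather node somewhere below another Gather node) could be detected with a simple tree walk. This is only a minimal sketch: ToyPlan, ToyNodeTag and toy_has_nested_gather() are hypothetical names invented for this example and are not PostgreSQL's Plan structures or planner code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified plan node; NOT PostgreSQL's Plan struct. */
typedef enum { TOY_SEQSCAN, TOY_HASHJOIN, TOY_GATHER } ToyNodeTag;

typedef struct ToyPlan
{
    ToyNodeTag      tag;
    struct ToyPlan *left;   /* outer child, or NULL */
    struct ToyPlan *right;  /* inner child, or NULL */
} ToyPlan;

/*
 * Return true if any Gather node appears below another Gather node.
 * under_gather records whether some ancestor of "node" is a Gather.
 */
static bool
toy_has_nested_gather(const ToyPlan *node, bool under_gather)
{
    if (node == NULL)
        return false;

    if (node->tag == TOY_GATHER)
    {
        if (under_gather)
            return true;        /* a worker would have to run this Gather */
        under_gather = true;
    }

    return toy_has_nested_gather(node->left, under_gather) ||
           toy_has_nested_gather(node->right, under_gather);
}
```

Calling toy_has_nested_gather(root, false) on the root of such a toy tree reports whether any worker would end up executing a Gather of its own.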

> > It would
> > be helpful if you could find the offending query or the plan
> > corresponding to it.
>
> I presume the lack of debug_query_string data is because nothing is
> bothering to set debug_query_string in a worker process.  Should that be
> remedied?  At the very least set it to "worker process",
>

Currently, for the query descriptor in a worker process, we use "<parallel_query>" (see the function ExecParallelGetQueryDesc()), so that seems to be a better choice.

> but it might be
> worth copying over the full query from the parent side.
>

That would cost a couple of extra cycles, considering that we need to do it for each worker, but OTOH it could be useful debugging information in cases like the one reported in this thread.  Do you see any broader use for passing the query string to the workers?
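
To make the two options concrete, here is a rough sketch of a worker that reports the leader's query text when it was copied over and falls back to "<parallel_query>" otherwise. The names ToySharedQuery, toy_leader_publish_query() and toy_worker_debug_query(), and the fixed-size buffer, are assumptions made up for this example. They do not correspond to the actual shared-memory machinery used for parallel workers.

```c
#include <stddef.h>
#include <string.h>

#define TOY_QUERY_BUF_SIZE 64               /* hypothetical fixed-size slot */
#define TOY_FALLBACK_QUERY "<parallel_query>"

/* Hypothetical stand-in for a chunk of shared memory set up by the leader. */
typedef struct ToySharedQuery
{
    char text[TOY_QUERY_BUF_SIZE];          /* NUL-terminated, maybe truncated */
} ToySharedQuery;

/* Leader side: copy the query text, truncating it if it does not fit. */
static void
toy_leader_publish_query(ToySharedQuery *shared, const char *query)
{
    size_t len = 0;

    if (query != NULL)
    {
        len = strlen(query);
        if (len >= TOY_QUERY_BUF_SIZE)
            len = TOY_QUERY_BUF_SIZE - 1;
        memcpy(shared->text, query, len);
    }
    shared->text[len] = '\0';               /* empty string means "not set" */
}

/* Worker side: pick the string to report for debugging purposes. */
static const char *
toy_worker_debug_query(const ToySharedQuery *shared)
{
    if (shared == NULL || shared->text[0] == '\0')
        return TOY_FALLBACK_QUERY;
    return shared->text;
}
```

The per-worker cost in this sketch is one string copy on the leader side, which is the "couple of extra cycles" in question.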

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
