Re: Assertion failure with barriers in parallel hash join - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Assertion failure with barriers in parallel hash join
Date
Msg-id CAAKRu_a8nn9xZbC3Y5VPDubfgCepm0H0i94Xm6ymuMvzvwmvHg@mail.gmail.com
In response to Re: Assertion failure with barriers in parallel hash join  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Assertion failure with barriers in parallel hash join  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers

On Thu, Oct 1, 2020 at 8:08 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Sep 29, 2020 at 9:12 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > On Tue, Sep 29, 2020 at 7:11 PM Michael Paquier <michael@paquier.xyz> wrote:
> > > #2  0x00000000009027d2 in ExceptionalCondition
> > > (conditionName=conditionName@entry=0xa80846 "!barrier->static_party",
> >
> > > #4  0x0000000000682ebf in ExecParallelHashJoinNewBatch
> >
> > Thanks.  Ohhh.  I think I see how that condition was reached and what
> > to do about it, but I'll need to look more closely.  I'm away on
> > vacation right now, and will update in a couple of days when I'm back
> > at a real computer.
>
> Here's a throw-away patch to add some sleeps that trigger the problem,
> and a first draft fix.  I'll do some more testing of this next week
> and see if I can simplify it.

I was just taking a look at the patch and noticed the commit message
says:

> With unlucky timing and parallel_leader_participation off...

Is parallel_leader_participation being off required to reproduce the
issue?
