Re: [HACKERS] Instability in select_parallel regression test - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Instability in select_parallel regression test
Date
Msg-id CAA4eK1KAoMtX_SYsNEsyQ3Ld8on79Ci4c_OVc6CwSbGSXy=+Nw@mail.gmail.com
In response to Re: [HACKERS] Instability in select_parallel regression test  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Instability in select_parallel regression test  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Sun, Feb 19, 2017 at 8:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Feb 19, 2017 at 6:50 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> To close the remaining gap, don't you think we can check the slot->in_use
>> flag when the generation numbers for the handle and the slot are the same?
>
> That doesn't completely fix it either, because
> ForgetBackgroundWorker() also does
> BackgroundWorkerData->parallel_terminate_count++, which we might also
> fail to see, which would cause RegisterDynamicBackgroundWorker() to
> bail out early.  There are CPU ordering effects to think about here,
> not just the order in which the operations are actually performed.
>

Sure, I think we can attempt to fix that as well by adding a write
memory barrier in ForgetBackgroundWorker().  The main point is that if we
leave any loose end in this area, there is a chance that the
select_parallel regression test can still fail, if not now, then in the
future.  Another way could be to minimize the race condition here and
then adjust select_parallel as suggested above so that we don't see
this failure.
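
A minimal sketch of the barrier idea, written as standalone C11 rather
than PostgreSQL code.  The struct and function names (ModelWorkerData,
model_forget_worker, and so on) are hypothetical, and the C11 fences only
approximate what pg_write_barrier() and pg_read_barrier() would do in the
real code.  The writer bumps the terminate count, issues a release fence,
and only then marks the slot free; a reader that sees the slot free issues
an acquire fence before reading the count, so it should not observe a free
slot together with a stale count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the shared bgworker state. */
typedef struct ModelWorkerData
{
    atomic_uint parallel_register_count;
    atomic_uint parallel_terminate_count;
    atomic_bool slot_in_use;
} ModelWorkerData;

static void
model_init(ModelWorkerData *d)
{
    atomic_init(&d->parallel_register_count, 0);
    atomic_init(&d->parallel_terminate_count, 0);
    atomic_init(&d->slot_in_use, false);
}

/* Models registering one parallel worker in the single slot. */
static void
model_register_worker(ModelWorkerData *d)
{
    atomic_fetch_add_explicit(&d->parallel_register_count, 1,
                              memory_order_relaxed);
    atomic_store_explicit(&d->slot_in_use, true, memory_order_relaxed);
}

/* Writer side: models the proposed ordering in ForgetBackgroundWorker(). */
static void
model_forget_worker(ModelWorkerData *d)
{
    assert(atomic_load_explicit(&d->slot_in_use, memory_order_relaxed));

    atomic_fetch_add_explicit(&d->parallel_terminate_count, 1,
                              memory_order_relaxed);

    /* Write barrier: publish the count before the slot is marked free. */
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&d->slot_in_use, false, memory_order_relaxed);
}

/*
 * Reader side: returns true if the slot is free, and in that case stores
 * the number of still-active parallel workers in *active.
 */
static bool
model_slot_is_free(ModelWorkerData *d, unsigned *active)
{
    if (atomic_load_explicit(&d->slot_in_use, memory_order_relaxed))
        return false;

    /* Read barrier: pairs with the release fence in the writer. */
    atomic_thread_fence(memory_order_acquire);

    *active = atomic_load_explicit(&d->parallel_register_count,
                                   memory_order_relaxed)
            - atomic_load_explicit(&d->parallel_terminate_count,
                                   memory_order_relaxed);
    return true;
}
```

The barrier on the writer side only helps if the reader has a matching
barrier; that pairing is an assumption of this sketch, not something
settled in this thread.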


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


