On Wed, Jun 7, 2017 at 12:37 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jun 6, 2017 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> One thought is that the only places where shm_mq_set_sender() should
>>> be getting invoked during the main regression tests are
>>> ParallelWorkerMain() and ExecParallelGetReceiver, and both of those
>>> places using ParallelWorkerNumber to figure out what address to pass.
>>> So if ParallelWorkerNumber were getting set to the same value in two
>>> different parallel workers - e.g. because the postmaster went nuts and
>>> launched two processes instead of only one - or if
>>> ParallelWorkerNumber were not getting initialized at all or were
>>> getting initialized to some completely bogus value, it could cause
>>> this symptom.
>>
>> Hmm. With some generous assumptions it'd be possible to think that
>> aa1351f1eec4adae39be59ce9a21410f9dd42118 triggered this. That commit was
>> present in 20 successful lorikeet runs before the first of these failures,
>> which is a bit more than the MTBF after that, but not a huge amount more.
>>
>> That commit in itself looks innocent enough, but could it have exposed
>> some latent bug in bgworker launching?
>
> Hmm, that's a really interesting idea, but I can't quite put together
> a plausible theory around it. I mean, it seems like that commit could
> make launching bgworkers faster, which could conceivably tickle some
> heretofore-latent timing-related bug. But it wouldn't, IIUC, make the
> first worker start any faster than before - it would just make them
> more closely-spaced thereafter, and it's not very obvious how that
> would cause a problem.
>
> Another idea is that the commit in question is managing to corrupt
> BackgroundWorkerList somehow.
>
I don't think so because this problem has been reported previously as
well [1][2] even before the commit in question.
[1] - https://www.postgresql.org/message-id/1ce5a19f-3b1d-bb1c-4561-0158176f65f1%40dunslane.net
[2] - https://www.postgresql.org/message-id/25861.1472215822%40sss.pgh.pa.us
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com