Re: Race condition in crash-recovery tests - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Race condition in crash-recovery tests
Date
Msg-id 21941.1548557111@sss.pgh.pa.us
In response to Re: Race condition in crash-recovery tests  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2019-01-26 20:53:48 -0500, Tom Lane wrote:
>> I have no idea why we're seeing this in only one buildfarm member
>> and only for the past week or so, as it doesn't appear that any
>> related code has changed for months.  (Perhaps something changed
>> about curculio's host?)

> I have no idea why it's just curculio, but I think I know why it only
> started recently: curculio doesn't appear to have had TAP tests enabled
> before
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio&dt=2019-01-17%2021%3A30%3A02

Oh, right ... I knew that, actually, but forgot ...

So then we only have to assume that the race condition is encouraged
by something about the kernel scheduler's rules on that machine, which
isn't so much of a leap, especially since it's our only OpenBSD
critter.  The test case only exists in v11 and HEAD branches, and
curculio's only run this test a few times in v11, so the lack of
back-branch failures isn't so odd.

>> just change the test script to accept either message as a successful
>> result.  I think that 4247db625 made such races more likely, but I
>> don't believe it was impossible before.

> Sounds right to me - do you want to do the honors or shall I?

I'll do it in a bit.
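
For illustration only, the "accept either message" idea could look roughly like the sketch below. It is written in Python rather than in the language of the actual test script, and the two message strings and the function name are placeholders assumed for the example, not taken from the test itself.

```python
import re

# Placeholder messages: assumptions for this sketch, not quoted from the real test.
ACCEPTED_MESSAGES = [
    "terminating connection because of crash of another server process",
    "server closed the connection unexpectedly",
]

# One alternation pattern, so a match on any accepted message counts as success.
ACCEPTED_RE = re.compile("|".join(re.escape(m) for m in ACCEPTED_MESSAGES))


def died_as_expected(stderr_text: str) -> bool:
    """Return True if the client's stderr contains any of the accepted messages."""
    return ACCEPTED_RE.search(stderr_text) is not None
```

The point of the alternation is that the test no longer depends on which side of the race wins: whichever message the client happens to see first, the check passes.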

            regards, tom lane

