Re: failures in t/031_recovery_conflict.pl on CI - Mailing list pgsql-hackers

From Andres Freund
Subject Re: failures in t/031_recovery_conflict.pl on CI
Date
Msg-id 20220508221139.pxo4cp5d34ixtjah@alap3.anarazel.de
In response to Re: failures in t/031_recovery_conflict.pl on CI  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2022-05-08 13:59:09 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-05-08 11:28:34 -0400, Tom Lane wrote:
> >> Per lapwing's latest results [1], this wasn't enough.  I'm again thinking
> >> we should pull the whole test from the back branches.
> 
> > That failure is different from the earlier failures though. I don't think it's
> > a timing issue in the test like the deadlock check one. I rather suspect it's
> > indicative of further problems in this area.
> 
> Yeah, that was my guess too.
> 
> > Potentially the known problem
> > with RecoveryConflictInterrupt() running in the signal handler? I think Thomas
> > has a patch for that...
> 
> Maybe; or given that it's on v10, it could be telling us about some
> yet-other problem we perhaps solved since then without realizing
> it needed to be back-patched.
> 
> > One failure in ~20 runs, on one animal doesn't seem worth disabling the test
> > for.
> 
> No one is going to thank us for shipping a known-unstable test case.

IDK, hiding failures indicating bugs isn't really better, at least if it
doesn't look like a bug in the test. But you seem to have a stronger opinion
on this than me, so I'll skip the entire test for now :/

Greetings,

Andres Freund


