Re: Windows vs recovery tests - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Windows vs recovery tests
Date
Msg-id 20220113022526.b63vclpbqlrm7aj2@alap3.anarazel.de
In response to Re: Windows vs recovery tests  (Andres Freund <andres@anarazel.de>)
Responses Re: Windows vs recovery tests
List pgsql-hackers
Hi,

On 2022-01-12 15:58:26 -0800, Andres Freund wrote:
> On 2022-01-12 14:34:00 -0500, Andrew Dunstan wrote:
> > For some considerable time the recovery tests have been at best flaky on
> > Windows, and at worst disastrous (i.e. they can hang rather than just
> > fail). It's a problem I worked around on my buildfarm animals by
> > disabling the tests, hoping to find time to get back to analysing the
> > problem. But now we are seeing failures on the cfbot too (e.g.
> > https://cirrus-ci.com/task/5860692694663168 and
> > https://cirrus-ci.com/task/5316745152954368 ) so I think we need to
> > spend some effort on finding out what's going on here.
> 
> I'm somewhat certain that this is caused by assertions or aborts hanging with
> a GUI popup, e.g. due to a check in the CRT.

Oh, that was only about https://cirrus-ci.com/task/5860692694663168 not
https://cirrus-ci.com/task/5316745152954368
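
For anyone unfamiliar with the popup theory: a Windows process can ask the CRT
and the OS to report assertion failures and aborts on stderr instead of in a
dialog that nobody on a CI machine will ever click. A minimal sketch of that
follows. The function name suppress_error_popups() is made up for illustration,
and I am not claiming this is what postgres or the test harness does today. The
Win32/CRT calls themselves are real; the _CrtSet* ones only have an effect with
a debug CRT.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#include <crtdbg.h>
#include <stdlib.h>
#endif

/*
 * Hypothetical helper (name is an assumption, not from the PostgreSQL tree).
 * Returns 1 if popup suppression was configured, 0 on non-Windows platforms
 * where there is nothing to do.
 */
int
suppress_error_popups(void)
{
#ifdef _WIN32
	/* No "critical error" or general-protection-fault dialogs. */
	SetErrorMode(SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX);

	/* Route CRT assertion and error reports to stderr, not a message box. */
	_CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE);
	_CrtSetReportFile(_CRT_ASSERT, _CRTDBG_FILE_STDERR);
	_CrtSetReportMode(_CRT_ERROR, _CRTDBG_MODE_FILE);
	_CrtSetReportFile(_CRT_ERROR, _CRTDBG_FILE_STDERR);

	/* abort() should neither show a message box nor invoke error reporting. */
	_set_abort_behavior(0, _WRITE_ABORT_MSG | _CALL_REPORTFAULT);
	return 1;
#else
	return 0;
#endif
}
```

A process would call this once, early in main(), before anything can trip an
assertion.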

Looking through the recent recovery failures that occurred only on Windows, I
see three different "classes" of recovery test failures:

1) Tests sometimes never finish, resulting in CI timing out
2) Tests sometimes finish, but t/001_stream_rep.pl fails
3) Tests fail with patch specific issues (e.g. 36/2096, 36/3461, 36/3459)

From the cases I looked at, the failures in 1) always have a successful
t/001_stream_rep.pl. This makes me think that we're likely looking at two
separate types of problems?


One might think that
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/3464
conflicts with the above grouping. But all but the currently last failure were
due to a compiler warning in an earlier version of the patch.


There's one interesting patch that also times out just on Windows, albeit in
another test group:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/2096

This IMO looks likely to be a bug in psql introduced by that patch.

Greetings,

Andres Freund
