Hi,
On 2022-01-12 15:58:26 -0800, Andres Freund wrote:
> On 2022-01-12 14:34:00 -0500, Andrew Dunstan wrote:
> > For some considerable time the recovery tests have been at best flaky on
> > Windows, and at worst disastrous (i.e. they can hang rather than just
> > fail). It's a problem I worked around on my buildfarm animals by
> > disabling the tests, hoping to find time to get back to analysing the
> > problem. But now we are seeing failures on the cfbot too (e.g.
> > https://cirrus-ci.com/task/5860692694663168 and
> > https://cirrus-ci.com/task/5316745152954368 ) so I think we need to
> > spend some effort on finding out what's going on here.
>
> I'm somewhat certain that this is caused by assertions or aborts hanging with
> a GUI popup, e.g. due to a check in the CRT.
Oh, that was only about https://cirrus-ci.com/task/5860692694663168 not
https://cirrus-ci.com/task/5316745152954368
Looking through the recent recovery failures that were just on windows, I see
three different "classes" of recovery test failures:
1) Tests sometimes never finish, resulting in CI timing out
2) Tests sometimes finish, but t/001_stream_rep.pl fails
3) Tests fail with patch specific issues (e.g. 36/2096, 36/3461, 36/3459)
From the cases I looked the failures in 1) always have a successful
t/001_stream_rep.pl. This makes me think that we're likely at two separate
types of problems?
One might think that
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/3464
conflicts with the above grouping. But all but the currently last failure were
due a compiler warning in an earlier version of the patch.
There's one interesting patch that also times out just on windows, albeit in
another test group:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/2096
This IMO looks likely to be a bug in psql introduced by that patch.
Greetings,
Andres Freund