odd buildfarm failure - "pg_ctl: control file appears to be corrupt" - Mailing list pgsql-hackers

Hi,

My buildfarm animal grassquit just showed an odd failure [1] in REL_11_STABLE:

ok 10 - standby is in recovery
# Running: pg_ctl -D /mnt/resource/bf/build/grassquit/REL_11_STABLE/pgsql.build/src/bin/pg_ctl/tmp_check/t_003_promote_standby2_data/pgdata promote
waiting for server to promote....pg_ctl: control file appears to be corrupt
not ok 11 - pg_ctl promote of standby runs

#   Failed test 'pg_ctl promote of standby runs'
#   at /mnt/resource/bf/build/grassquit/REL_11_STABLE/pgsql.build/../pgsql/src/test/perl/TestLib.pm line 474.


I didn't find any other reports of this kind of failure, and the error has
not recurred on grassquit.


I don't immediately see a way for this message to be hit that doesn't indicate
a bug somewhere. We should be updating the control file atomically and
reading it atomically.
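
For illustration only, here is a minimal Python sketch of how a non-atomic
(torn) read can fail a checksum test even though every version that was ever
written is valid. The record layout, the helper names, and the use of
zlib.crc32 are assumptions made for the example; they are not pg_control's
actual format or PostgreSQL's code.

```python
import struct
import zlib

# Hypothetical layout: an 8-byte state value repeated to fill the payload,
# followed by a 4-byte checksum. This is NOT the real pg_control layout.
PAYLOAD_LEN = 64


def make_record(state: int) -> bytes:
    """Build a payload and append its checksum."""
    payload = struct.pack("<Q", state) * (PAYLOAD_LEN // 8)
    crc = zlib.crc32(payload)  # plain CRC-32, used here as a stand-in
    return payload + struct.pack("<I", crc)


def crc_ok(record: bytes) -> bool:
    """Recompute the checksum over the payload and compare with the stored one."""
    payload, stored = record[:PAYLOAD_LEN], record[PAYLOAD_LEN:]
    return zlib.crc32(payload) == struct.unpack("<I", stored)[0]


def torn_read(old: bytes, new: bytes, split: int) -> bytes:
    """Simulate a reader that sees the first `split` bytes of the new
    record and the remaining bytes of the old one."""
    return new[:split] + old[split:]


old = make_record(1)
new = make_record(2)
torn = torn_read(old, new, 32)

print(crc_ok(old), crc_ok(new), crc_ok(torn))
```

Both complete records verify; only the mixed one does not, which is the
kind of result a reader would report as corruption if reads and writes were
not atomic with respect to each other.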


The failure has to be happening in wait_for_postmaster_promote(), because
standby2 is in fact promoted successfully.
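
To make the shape of that scenario concrete, here is a rough Python sketch of
a polling loop that repeatedly reads a checksummed record while waiting for a
state change, and that tolerates a bounded number of bad reads before giving
up. Everything in it (encode, decode, wait_for_state, the record layout, the
retry limit) is hypothetical and only meant to show the idea; it is not how
pg_ctl is implemented, and it is not a proposed patch.

```python
import struct
import zlib
from typing import Callable, Optional

PAYLOAD_LEN = 8  # hypothetical: the payload is a single 8-byte state value


def encode(state: int) -> bytes:
    """Serialize a state value followed by its checksum."""
    payload = struct.pack("<Q", state)
    return payload + struct.pack("<I", zlib.crc32(payload))


def decode(record: bytes) -> Optional[int]:
    """Return the state, or None if the checksum does not match."""
    payload, stored = record[:PAYLOAD_LEN], record[PAYLOAD_LEN:]
    if zlib.crc32(payload) != struct.unpack("<I", stored)[0]:
        return None
    return struct.unpack("<Q", payload)[0]


def wait_for_state(read: Callable[[], bytes], wanted: int,
                   max_polls: int, max_bad_reads: int = 3) -> bool:
    """Poll until the wanted state is seen. A checksum failure is retried,
    and only treated as fatal after max_bad_reads consecutive failures."""
    bad = 0
    for _ in range(max_polls):
        state = decode(read())
        if state is None:
            bad += 1
            if bad > max_bad_reads:
                raise RuntimeError("record appears to be corrupt")
            continue
        bad = 0
        if state == wanted:
            return True
    return False


# Simulated sequence of reads: old state, one torn read, then the new state.
old, new = encode(1), encode(2)
reads = iter([old, new[:4] + old[4:], new])
print(wait_for_state(lambda: next(reads), wanted=2, max_polls=3))
```

A loop that treated the first checksum failure as fatal would exit on the
second read here, even though the state change itself completes normally.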

Greetings,

Andres Freund

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grassquit&dt=2022-11-22%2016%3A33%3A57


