On Thu, May 16, 2024 at 12:19:22PM -0400, Melanie Plageman wrote:
> Today, after committing a3e6c6f, I saw recovery/018_wal_optimize.pl
> fail and see this message in the replica log [2].
> 
> 2024-05-16 15:12:22.821 GMT [5440][not initialized] FATAL:  incorrect
> checksum in control file
> 
> I'm pretty sure it's not related to my commit. So, I was looking for
> existing reports of this error message.

Yeah, I don't see how it could be related.

> It's a long shot, since 0001 and 0002 were already pushed, but this is
> the only recent report I could find of "FATAL:  incorrect checksum in
> control file" in pgsql-hackers or bugs archives.
> 
> I do see this thread from 2016 [3] which might be relevant because the
> reported bug was also on Windows.

I suspect it will be difficult to investigate this one too much further
unless we can track down a copy of the control file with the bad checksum.
Other than searching for any new code that isn't doing the appropriate
locking, maybe we could search the buildfarm for any other occurrences.  I
also see some threads concerning whether the way we are reading/writing
the control file is atomic.
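
For the buildfarm search, here is a rough sketch of what I have in mind.
It assumes the relevant logs have already been downloaded into a local
directory; the function name and directory layout are made up for
illustration and are not part of any buildfarm tooling.  It just walks the
directory and reports every line containing the error text.

```python
import os

# Error text taken from the replica log quoted above.
NEEDLE = "incorrect checksum in control file"


def find_occurrences(log_root, needle=NEEDLE):
    """Return (path, line_number, line) for each log line containing needle.

    log_root is a hypothetical local directory of downloaded log files.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(log_root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Logs may contain stray bytes; replace them rather than fail.
            with open(path, "r", encoding="utf-8", errors="replace") as f:
                for lineno, line in enumerate(f, start=1):
                    if needle in line:
                        hits.append((path, lineno, line.rstrip("\n")))
    return hits
```

Matching on the message text alone (without the "FATAL:" prefix) avoids
depending on the exact spacing or log_line_prefix in each animal's logs.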
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com