Re: "could not open file "pg_wal/…": No such file or directory" potential crashing bug due to race condition between restartpoint and recovery - Mailing list pgsql-bugs
From: Thomas Crayford
Subject: Re: "could not open file "pg_wal/…": No such file or directory" potential crashing bug due to race condition between restartpoint and recovery
OK, thanks for the pointer. It seems like the race condition I described still applies. Does that seem right?
Thanks
Tom
On Mon, Sep 24, 2018 at 4:37 PM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Sep 24, 2018 at 12:58:59PM +0100, Thomas Crayford wrote:
> May 20 09:56:14 redacted[9]: [2468859-1] sql_error_code = 00000 LOG:
> restored log file "00000002000072B50000003A" from archive
> May 20 09:56:14 ip-10-0-92-26 redacted[141]: [191806-1] sql_error_code =
> 58P01 ERROR: could not open file "pg_wal/00000002000072B50000003A": No such
> file or directory
What kind of restore_command is used here?
> Looking at the code, I think that the two racing functions are
> RestoreArchivedFile, and CreateRestartPoint.
>
> The former calls unlink on the wal segment, CreateRestartPoint does attempt
> to do recycling on segments.
Don't you mean KeepFileRestoredFromArchive()? RestoreArchivedFile would call unlink() on pg_wal/RECOVERYXLOG so that does not match.
--
Michael
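
As a rough illustration of the symptom discussed in this thread (a segment logged as restored, then missing when it is opened), here is a minimal Python sketch. It is not PostgreSQL code. `restore_segment`, `recycle_segment`, and `read_segment` are hypothetical stand-ins, and the interleaving is forced rather than actually raced. It only shows why an unlink that lands between the restore and the open produces "No such file or directory" (ENOENT). It does not establish which PostgreSQL functions are involved, which is the open question above.

```python
import errno
import os
import tempfile


def restore_segment(wal_dir, name):
    """Hypothetical stand-in for the restore step: create a segment file."""
    path = os.path.join(wal_dir, name)
    with open(path, "wb") as f:
        f.write(b"\0" * 16)  # placeholder contents, not a real WAL segment
    return path


def recycle_segment(path):
    """Hypothetical stand-in for concurrent cleanup: remove the segment."""
    os.unlink(path)


def read_segment(path):
    """Hypothetical stand-in for recovery opening the segment.

    Returns 0 on success, or the errno of the failure.
    """
    try:
        with open(path, "rb") as f:
            f.read()
        return 0
    except OSError as e:
        return e.errno


def simulate(interleave_cleanup):
    """Run restore -> (optional cleanup) -> open in a scratch directory."""
    with tempfile.TemporaryDirectory() as wal_dir:
        path = restore_segment(wal_dir, "00000002000072B50000003A")
        if interleave_cleanup:
            # Assumed interleaving: cleanup runs after restore, before open.
            recycle_segment(path)
        return read_segment(path)


if __name__ == "__main__":
    print("no cleanup in between:", simulate(False))
    print("cleanup in between:", errno.errorcode.get(simulate(True)))
```

The segment name is the one from the quoted log. Everything else, including the scratch directory standing in for pg_wal, is an assumption made for the sketch.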