Re: "could not open file "pg_wal/…": No such file or directory" potential crashing bug due to race condition between restartpoint and recovery - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: "could not open file "pg_wal/…": No such file or directory" potential crashing bug due to race condition between restartpoint and recovery
Msg-id 20180928225917.GB1823@paquier.xyz
In response to Re: "could not open file "pg_wal/…": No such file or directory" potential crashing bug due to race condition between restartpoint and recovery  (Thomas Crayford <tcrayford@salesforce.com>)
Responses Re: "could not open file "pg_wal/…": No such file or directory" potential crashing bug due to race condition between restartpoint and recovery
List pgsql-bugs
On Fri, Sep 28, 2018 at 01:02:42PM +0100, Thomas Crayford wrote:
> Ok, thanks for the pointer. It seems like the race condition I talked about
> is still accurate, does that seem right?

KeepFileRestoredFromArchive() looks like a good candidate here, as it
removes a WAL segment before replacing it with another of the same
name.  I have a hard time understanding why the checkpointer would try
to recycle a segment that was just recovered, though, as the startup
process would immediately try to use it.  I have not spent more than
one hour looking at potential spots, which is not much for this kind
of race condition.

It is also why I am curious about what kind of restore_command you are
using.
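
To illustrate the kind of window a remove-then-replace sequence leaves
open, here is a minimal sketch.  It is not PostgreSQL code: the function
names, the file names and the observe() callback (a stand-in for a
concurrent process looking at the same path) are all hypothetical, and
it only assumes the ordering described above for
KeepFileRestoredFromArchive().

```python
import os
import tempfile


def replace_with_gap(src, dst, observe):
    """Remove dst, then rename src onto it.

    observe(dst) is called between the two steps and stands in for
    another process that looks at the same path at that moment.
    """
    if os.path.exists(dst):
        os.unlink(dst)
    seen = observe(dst)      # the name does not exist at this point
    os.rename(src, dst)
    return seen


def demo():
    with tempfile.TemporaryDirectory() as d:
        dst = os.path.join(d, "segment")            # hypothetical name
        src = os.path.join(d, "segment.restored")   # hypothetical name
        with open(dst, "w") as f:
            f.write("old")
        with open(src, "w") as f:
            f.write("new")
        visible_in_gap = replace_with_gap(src, dst, os.path.exists)
        with open(dst) as f:
            final = f.read()
        return visible_in_gap, final
```

Calling demo() reports whether the target name was visible inside the
gap and what the file contains afterwards.  Whether anything in the
server actually hits such a gap is exactly the open question here.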
--
Michael
