Thread: Can there ever be out of sequence WAL files?
Hi,

Can the postgres server ever have/generate out-of-sequence WAL files? For instance, 000000010000020C000000A2, 000000010000020C000000A3, 000000010000020C000000A5 and so on, missing 000000010000020C000000A4. Manual/accidental deletion of the WAL files can happen, but are there any other extreme situations (like recycling, removing old WAL files, etc.) caused by the postgres server that lead to missing WAL files?

What happens when the postgres server finds a missing WAL file during crash/standby recovery?

Thoughts?

Regards,
Bharath Rupireddy.
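For readers who want to check a directory listing for the kind of gap described above, here is a minimal sketch. It assumes the usual 24-hex-digit segment name layout (8 digits of timeline, 8 of "log" ID, 8 of segment within that log) and the default 16MB wal_segment_size, which gives 256 segments per log ID. The helper names (parse_wal_name, format_wal_name, find_missing_segments) are made up for illustration and are not part of PostgreSQL. Names that do not look like plain segment files (.history, .partial, backup labels) are skipped.

```python
import re

# Plain WAL segment file names are 24 uppercase hex digits.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")

# Assumption: default 16MB segments; pass another size if initdb used one.
DEFAULT_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_wal_name(name, wal_segment_size=DEFAULT_SEGMENT_SIZE):
    """Return (timeline, absolute segment number) for a segment file name."""
    if not WAL_NAME.match(name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    segments_per_log = 0x100000000 // wal_segment_size
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * segments_per_log + seg


def format_wal_name(timeline, segno, wal_segment_size=DEFAULT_SEGMENT_SIZE):
    """Inverse of parse_wal_name."""
    segments_per_log = 0x100000000 // wal_segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_log,
        segno % segments_per_log,
    )


def find_missing_segments(names, wal_segment_size=DEFAULT_SEGMENT_SIZE):
    """List segment names absent between the lowest and highest segment
    seen on each timeline. Non-segment names are ignored."""
    by_timeline = {}
    for name in names:
        if not WAL_NAME.match(name):
            continue  # e.g. *.history, *.partial, *.backup
        timeline, segno = parse_wal_name(name, wal_segment_size)
        by_timeline.setdefault(timeline, set()).add(segno)

    missing = []
    for timeline in sorted(by_timeline):
        present = by_timeline[timeline]
        for segno in range(min(present), max(present) + 1):
            if segno not in present:
                missing.append(
                    format_wal_name(timeline, segno, wal_segment_size)
                )
    return missing


if __name__ == "__main__":
    listing = [
        "000000010000020C000000A2",
        "000000010000020C000000A3",
        "000000010000020C000000A5",
    ]
    print(find_missing_segments(listing))
```

This only inspects names. It says nothing about whether a file's contents are valid, and it treats each timeline separately, so it does not account for a timeline switch in the middle of a range.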
On Tue, Dec 28, 2021 at 7:45 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> Hi,
>
> Can the postgres server ever have/generate out of sequence WAL files?
> For instance, 000000010000020C000000A2, 000000010000020C000000A3,
> 000000010000020C000000A5 and so on, missing 000000010000020C000000A4.
> Manual/Accidental deletion of the WAL files can happen, but are there
> any other extreme situations (like recycling, removing old WAL files
> etc.) caused by the postgres server leading to missing WAL files?
>
> What happens when postgres server finds missing WAL file during
> crash/standby recovery?
>
> Thoughts?

Hi Hackers, a gentle ping for the above question. I think I sent it earlier during the holiday season.

Regards,
Bharath Rupireddy.
On Wed, Jan 12, 2022 at 07:19:48AM +0530, Bharath Rupireddy wrote:
> > Can the postgres server ever have/generate out of sequence WAL files?
> > For instance, 000000010000020C000000A2, 000000010000020C000000A3,
> > 000000010000020C000000A5 and so on, missing 000000010000020C000000A4.
> > Manual/Accidental deletion of the WAL files can happen, but are there
> > any other extreme situations (like recycling, removing old WAL files
> > etc.) caused by the postgres server leading to missing WAL files?

By definition there shouldn't be such a situation, as it would otherwise be a (critical) bug.

> > What happens when postgres server finds missing WAL file during
> > crash/standby recovery?

The recovery should fail.
On Wed, Jan 12, 2022 at 10:18:11AM +0800, Julien Rouhaud wrote:
> On Wed, Jan 12, 2022 at 07:19:48AM +0530, Bharath Rupireddy wrote:
>>> Can the postgres server ever have/generate out of sequence WAL files?
>>> For instance, 000000010000020C000000A2, 000000010000020C000000A3,
>>> 000000010000020C000000A5 and so on, missing 000000010000020C000000A4.
>>> Manual/Accidental deletion of the WAL files can happen, but are there
>>> any other extreme situations (like recycling, removing old WAL files
>>> etc.) caused by the postgres server leading to missing WAL files?
>
> By definition there shouldn't be such situation, as it would otherwise be a
> (critical) bug.

I have seen that in the past, in cases where a system was abruptly unplugged and then plugged back in, and the flush of a segment file went missing. But that was just a flaky system; Postgres was simply relying on something that was broken. So the answer is that this should not happen.

>>> What happens when postgres server finds missing WAL file during
>>> crash/standby recovery?
>
> The recovery should fail.

xlog.c can be a good read to check the assumptions WAL replay relies on, with things like CheckRecoveryConsistency() or reachedConsistency.
--
Michael
On Wed, Jan 12, 2022 at 01:10:25PM +0900, Michael Paquier wrote:
>
> xlog.c can be a good read to check the assumptions WAL replay relies
> on, with things like CheckRecoveryConsistency() or
> reachedConsistency.

That should only stand for a WAL file that is expected to be missing, right? For something unexpected, it should fail in XLogReadRecord() when trying to fetch a missing block?
On Wed, Jan 12, 2022 at 12:23:00PM +0800, Julien Rouhaud wrote:
> On Wed, Jan 12, 2022 at 01:10:25PM +0900, Michael Paquier wrote:
>> xlog.c can be a good read to check the assumptions WAL replay relies
>> on, with things like CheckRecoveryConsistency() or
>> reachedConsistency.
>
> That should only stand for a WAL expected to be missing right? For something
> unexpected it should fail in XLogReadRecord() when trying to fetch a missing
> block?

Sure. There are also sanity checks related to invalid page references when it comes to the consistency checks.
--
Michael