Re: Why standby restores some WALs many times from archive? - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Why standby restores some WALs many times from archive?
Date
Msg-id CAMkU=1wkV-Kp2XeWSWH5Kn=eUJt2di0vHAp=MGuOLbkGSyMK3A@mail.gmail.com
In response to Re: Why standby restores some WALs many times from archive?  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Sat, Dec 30, 2017 at 4:20 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Sat, Dec 30, 2017 at 04:30:07AM +0300, Sergey Burladyan wrote:
> We use these scripts:
> https://github.com/avito-tech/dba-utils/tree/master/pg_archive
>
> But I can reproduce the problem with a simple cp & mv:
> archive_command:
>   test ! -f /var/lib/postgresql/wals/%f && \
>   test ! -f /var/lib/postgresql/wals/%f.tmp && \
>   cp %p /var/lib/postgresql/wals/%f.tmp && \
>   mv /var/lib/postgresql/wals/%f.tmp /var/lib/postgresql/wals/%f

This is unsafe. PostgreSQL expects the WAL segment archived to be
flushed to disk once the archive command has returned its result to the
backend. Don't be surprised if you get corrupted instances or that you
are not able to recover up to a consistent point if you need to roll in
a backup. Note that the documentation of PostgreSQL provides a simple
example of archive command, which is itself bad enough not to use.

True, but that doesn't explain the current situation: the problem reproduces without an OS-level crash, so a missing sync would not be relevant. (On some systems, mv'ing a file will force it to be synced under some conditions, so it might be safe anyway.)
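
For reference, a durable variant of the quoted cp & mv command could look roughly like the sketch below. It is only an illustration of the "flush before returning" requirement, not a recommended or tested archiver: the function name archive_wal and its parameters are hypothetical, it assumes a POSIX system where a directory can be opened and fsync'ed, and Python is used only because plain shell has no portable per-file fsync.

```python
import os
import shutil


def archive_wal(src_path, archive_dir, file_name):
    """Copy one WAL segment into archive_dir and flush it to disk.

    Returns True on success, False if the target (or its .tmp file)
    already exists. Hypothetical helper, assuming a POSIX filesystem.
    """
    final_path = os.path.join(archive_dir, file_name)
    tmp_path = final_path + ".tmp"

    # Mirror the two "test ! -f" checks: never overwrite an existing file.
    if os.path.exists(final_path) or os.path.exists(tmp_path):
        return False

    # Copy to a temporary name, then force the file contents to disk.
    with open(src_path, "rb") as src, open(tmp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
        dst.flush()
        os.fsync(dst.fileno())

    # Rename within the same directory, so the final name appears atomically.
    os.rename(tmp_path, final_path)

    # Flush the directory entry too, so the rename itself survives a crash.
    dir_fd = os.open(archive_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return True
```

A wrapper script would pass %p and %f to this function and exit with a non-zero status when it returns False, so that the server retries the segment later. None of this changes the point above: the repeated restores show up without any crash.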

I thought I'd seen something recently on the mailing lists or in the commit log about an off-by-one error that causes the standby to re-fetch the previous file rather than the current one if the previous file ends with just the right type of record and amount of padding. But now I can't find it.

Cheers,

Jeff
