Re: [BUG] Archive recovery failure on 9.3+. - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [BUG] Archive recovery failure on 9.3+.
Date
Msg-id 52FD04C2.4060701@vmware.com
In response to Re: [BUG] Archive recovery failure on 9.3+.  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: [BUG] Archive recovery failure on 9.3+.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On 02/13/2014 06:47 PM, Heikki Linnakangas wrote:
> On 02/13/2014 02:42 PM, Heikki Linnakangas wrote:
>> The behavior where we prefer a segment from archive with lower TLI over
>> a file with higher TLI in pg_xlog actually changed in commit
>> a068c391ab0. Arguably changing it wasn't a good idea, but the problem
>> your test script demonstrates can be fixed by not archiving the partial
>> segment, with no change to the preference of archive/pg_xlog. As
>> discussed, archiving a partial segment seems like a bad idea anyway, so
>> let's just stop doing that.
>
> After some further thought, while not archiving the partial segment
> fixes your test script, it's not enough to fix all variants of the
> problem. Even if archive recovery doesn't archive the last, partial,
> segment, if the original master server is still running, it's entirely
> possible that it fills the segment and archives it. In that case,
> archive recovery will again prefer the archived segment with lower TLI
> over the segment with newer TLI in pg_xlog.
>
> So I agree we should commit the patch you posted (or something to that
> effect). The change to not archive the last segment still seems like a
> good idea, but perhaps we should only do that in master.

To bring this to a conclusion: barring any further insights, I'm
going to commit the attached patch to master and REL9_3_STABLE. Please
have a look at the patch to see if I'm missing something. I modified
the state machine to skip over the XLOG_FROM_XLOG state if reading in
XLOG_FROM_ARCHIVE failed; otherwise you first scan the archive and
pg_xlog together, and then pg_xlog alone, which is pointless.
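
For readers without the patch at hand, here is a minimal sketch of the idea,
not the patch itself. The names WalSource, SegmentCopy, pick_copy and
next_source_after_failure are hypothetical and do not appear in xlog.c. The
sketch assumes that the combined scan picks the copy with the highest TLI
wherever it is found, that streaming is the state tried after the combined
scan fails, and that a failed stream attempt goes back to the archive.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the WAL source states; not the real xlog.c names. */
typedef enum { SRC_ARCHIVE, SRC_PG_XLOG, SRC_STREAM } WalSource;

typedef struct {
    WalSource found_in;   /* where this copy of the segment lives */
    unsigned  tli;        /* timeline ID of this copy */
} SegmentCopy;

/*
 * Combined scan (assumption): consider every available copy of a segment,
 * from the archive and pg_xlog alike, and pick the one on the highest
 * timeline. Returns NULL if no copy is available.
 */
const SegmentCopy *pick_copy(const SegmentCopy *copies, size_t n)
{
    const SegmentCopy *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (best == NULL || copies[i].tli > best->tli)
            best = &copies[i];
    }
    return best;
}

/*
 * State to try after a read in state `failed` found nothing.
 * The archive state has already looked in pg_xlog as well, so a
 * pg_xlog-only pass would just repeat that work; it is skipped.
 */
WalSource next_source_after_failure(WalSource failed)
{
    switch (failed) {
        case SRC_ARCHIVE:   /* combined scan failed: skip SRC_PG_XLOG */
        case SRC_PG_XLOG:
            return SRC_STREAM;      /* assumption: streaming comes next */
        case SRC_STREAM:
            return SRC_ARCHIVE;     /* assumption: start over at the archive */
    }
    return SRC_ARCHIVE;             /* not reached for valid input */
}
```

With a TLI 1 copy in the archive and a TLI 2 copy in pg_xlog, pick_copy
returns the pg_xlog copy, which is the case the lower-TLI-from-archive
preference got wrong.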

In master, I'm also going to remove the "archive last segment on old
timeline" code.

- Heikki
