Re: [BUG] Archive recovery failure on 9.3+. - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [BUG] Archive recovery failure on 9.3+.
Msg-id 52FC7E2A.9060703@vmware.com
In response to Re: [BUG] Archive recovery failure on 9.3+.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: [BUG] Archive recovery failure on 9.3+.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On 02/13/2014 08:44 AM, Kyotaro HORIGUCHI wrote:
>>>>> Wouldn't it be better to not archive the old segment, and instead
>>>>> switch to a new segment after writing the end-of-recovery checkpoint,
>>>>> so that the segment on the new timeline is archived sooner?
>>>>
>>>> It would be better to zero-fill and switch segments, yes. We should
>>>> NEVER be in a position of archiving two different versions of the same
>>>> segment.
>>>
>>> Ok, I think we're in agreement that that's the way to go for master.
>
> I was almost inclined to that, but on further thought, considering
> recovery up to a target timeline, the old segment turns out to be
> necessary for that case. Without the old segment, we would be
> obliged to seek to the first segment of the *next* timeline (is
> there any (simple) means to predict where it is?) to complete the
> task.

How did the server that created the new timeline get the old, partial, 
segment? Was it already archived? Or did the DBA copy it into pg_xlog 
manually? Or was it streamed by streaming replication? Whatever the 
mechanism, the same mechanism ought to make sure the old segment is 
available for PITR, too.

Hmm. If you have set up streaming replication and a WAL archive, and 
your master dies and you fail over to a standby, what you describe does 
happen. The partial old segment is not in the archive, so you cannot 
PITR to a point in the old timeline that falls within the partial 
segment (i.e., just before the failover). However, it's not guaranteed 
that all the preceding WAL segments on the old timeline were already 
archived, anyway, so even if the partial segment is archived, it's not 
guaranteed to work.

The old master is responsible for archiving the WAL on the old timeline, 
and the new master is responsible for archiving all the WAL on the new 
timeline. That's a straightforward, easy-to-understand rule. It might be 
useful to have a mode where the standby also archives all the received 
WAL, but that would need to be a separate option.
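
To make the rule concrete, here is a minimal sketch in Python. It relies 
on the fact that a WAL segment file name is 24 hex digits whose first 8 
digits are the timeline ID. The function names timeline_of and 
should_archive are hypothetical, made up for illustration; this is not 
how the server's archiver is implemented.

```python
def timeline_of(segment_name: str) -> int:
    """Return the timeline ID encoded in a WAL segment file name.

    A segment name is 24 hex digits: 8 for the timeline ID, followed by
    16 that identify the segment's position in the WAL stream.
    """
    if len(segment_name) != 24:
        raise ValueError(f"not a WAL segment file name: {segment_name!r}")
    # int() raises ValueError if the prefix is not valid hex.
    return int(segment_name[:8], 16)


def should_archive(segment_name: str, own_timeline: int) -> bool:
    """Hypothetical helper: a server archives only segments that belong
    to the timeline it is itself writing."""
    return timeline_of(segment_name) == own_timeline


# The old master (timeline 1) archives the old-timeline segment ...
print(should_archive("000000010000000000000005", own_timeline=1))
# ... while the promoted standby (timeline 2) leaves it alone and
# archives only its own segments.
print(should_archive("000000010000000000000005", own_timeline=2))
print(should_archive("000000020000000000000005", own_timeline=2))
```

Under this rule the two servers never archive different versions of the 
same file name, since the timeline ID is part of the name.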

> Is it right that we kick the older one out of the archive?

If it's already in the archive, it's not going to be removed from the 
archive.

- Heikki


