Thread: warm standby and reciprocating failover

warm standby and reciprocating failover

From

james bardin

Date:

24 August 2009, 13:45:59

I wasn't sure which list is better suited, so this is cross-posted
from pgsql-admin.
-Thanks

On Fri, Aug 21, 2009 at 10:46 AM, james bardin<jbardin@bu.edu> wrote:
> I have a working warm standby system, running 8.4 (thanks for urging
> me to upgrade from the Red Hat-provided release).
> One of the new requirements is going to be for a (non-DBA) admin to
> easily swap services between the two servers for maintenance.
>
> The first move runs easily, as expected: postgres ships the last
> partial WAL immediately on shutdown, we trigger the standby, and we're up.
> I'm now running into issues bringing the first server back up in
> standby mode. After the second server finishes recovery, the major
> number of the WAL files is incremented (say from 00000001 to
> 00000002), and the 00000002.history file is shipped back to the first
> server. The first server however is still looking for 00000001x files.
>
> Is there a way to ship back the missing information from the recovery
> process, without doing another base backup of data/ ?
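
The "major number" described above is the timeline ID: the first 8 hex digits of a 24-digit WAL segment file name, and the whole stem of an NNNNNNNN.history file. The following is a minimal sketch for reading it out of a file name. The helper name timeline_of is hypothetical and not part of PostgreSQL, and other archive files such as .backup labels are deliberately not handled.

```python
import re

# A WAL segment file name is 24 hex digits: 8 for the timeline ID,
# 8 for the log file number, 8 for the segment number.
WAL_NAME = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$")
# A timeline history file is named <8-hex-digit timeline ID>.history.
HISTORY_NAME = re.compile(r"^([0-9A-F]{8})\.history$")

def timeline_of(filename):
    """Return the timeline ID encoded in a WAL or history file name, or None.

    Hypothetical helper for illustration only.
    """
    m = WAL_NAME.match(filename) or HISTORY_NAME.match(filename)
    return int(m.group(1), 16) if m else None
```

Running this over the archive directory on each server would show which timeline each shipped file belongs to, and so whether the old primary is still requesting timeline 1 segments while the new primary writes timeline 2.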


On Mon, Aug 24, 2009 at 11:34 AM, james bardin<jbardin@bu.edu> wrote:
> So I've been experimenting with this timeline problem without any success.
> Is it possible that there are changes made during recovery that aren't logged?
>
>
> I tried recovery_target_timeline='X' on the standby, where X is the
> new timeline created after recovery on the new master. This fails,
> with some "unexpected timeline ID" lines and a
> PANIC:  could not locate a valid checkpoint record
>
> I also tried using recovery_target_timeline='latest'. This fell back
> gracefully to an earlier state, but changes were lost. Also, it never
> waited on pg_standby, and finished recovering immediately.
>
> Although it doesn't solve this problem, can pg_standby be used with
> recovery_target_timeline='latest', or should I file a bug?
>
> Thanks
> -jim
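
As a rough illustration of what recovery_target_timeline='latest' is generally understood to do, the sketch below probes an archive directory for successive NNNNNNNN.history files above the current timeline. This is a simplified assumption about the server's behaviour, not something taken from this thread; newest_timeline and archive_dir are hypothetical names.

```python
import os

def newest_timeline(archive_dir, start_tli):
    """Return the highest consecutive timeline above start_tli that has a
    <TLI>.history file in archive_dir, or start_tli if there is none.

    Simplified illustration; the real server fetches history files through
    restore_command rather than reading a directory directly.
    """
    newest = start_tli
    probe = start_tli + 1
    # Keep probing until the next history file is missing.
    while os.path.exists(os.path.join(archive_dir, "%08X.history" % probe)):
        newest = probe
        probe += 1
    return newest
```

Under that assumption, the target timeline depends on which history files the restore_command can return at the moment recovery starts.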

Re: warm standby and reciprocating failover

From

james bardin

Date:

25 August 2009, 13:59:00

On Mon, Aug 24, 2009 at 12:45 PM, james bardin<jbardin@bu.edu> wrote:
>>
>> I tried recovery_target_timeline='X' on the standby, where X is the
>> new timeline created after recovery on the new master. This fails,
>> with some "unexpected timeline ID" lines and a
>> PANIC:  could not locate a valid checkpoint record
>>
>> I also tried using recovery_target_timeline='latest'. This fell back
>> gracefully to an earlier state, but changes were lost. Also, it never
>> waited on pg_standby, and finished recovering immediately.


It seems that this is related to the issue in this bug report:
http://archives.postgresql.org/pgsql-bugs/2009-05/msg00060.php

The follow-up discussion is very long, and I couldn't work out a
workaround for the issue from it.