Hot Standby Failover Scenario - Mailing list pgsql-hackers

From Lucky Haryadi
Subject Hot Standby Failover Scenario
Date
Msg-id CABGr5caenC-PZwZ=OtaTpXONKyazqUX_LAGX3A8xWd27X3nSFA@mail.gmail.com
Responses Re: Hot Standby Failover Scenario  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
Hi everybody.

I have a question about hot standby. First, let me describe my Postgres master-slave scenario.

1. Master A and Slave B are in different locations; assume different regions of the country.
2. Master A and Slave B are configured for hot standby as described in the documentation.
3. When Master A fails, the database is failed over to Slave B by creating the trigger file.
4. As soon as Slave B becomes a standalone pg server, run pg_start_backup() on it, so that all transactions will be recorded only in the WAL files.
5. Applications are switched to standalone B until the recovery of Server A is done.
6. When Server A has been recovered (but is still offline), run pg_stop_backup() on B and copy all WAL files from B to A.
7. Once the WAL files have been copied to A, set A's configuration back to master and B's back to slave (for B, rename recovery.done to recovery.conf and remove the trigger file).
8. Bring up A, restart B, and switch all applications back to A.
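
For illustration, the file-level part of steps 3 and 7 could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function names and paths are hypothetical, the trigger file location is whatever trigger_file is set to in recovery.conf, and the sketch covers only the file manipulation, not the restart of either server or the copying of WAL files.

```python
from pathlib import Path


def trigger_failover(trigger_file: Path) -> None:
    """Step 3: create the trigger file that the standby is watching for."""
    trigger_file.touch()


def rearm_standby(data_dir: Path, trigger_file: Path) -> None:
    """Step 7 (on B): rename recovery.done back to recovery.conf
    and remove the trigger file so B does not promote itself again."""
    done = data_dir / "recovery.done"
    conf = data_dir / "recovery.conf"
    if done.exists():
        done.rename(conf)
    # missing_ok requires Python 3.8 or newer
    trigger_file.unlink(missing_ok=True)
```

Here data_dir stands for B's data directory; the server still has to be restarted afterwards for recovery.conf to take effect.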

I've tried this procedure with no luck. Before A failed, A had 10 million records and B had 10 million records too. Then I failed over to B manually, simulating a failure of A. I ran pg_start_backup() and inserted a bunch of data, so that, let's say, A still had 10 million records and B had 20 million. I then copied the WAL files from B to A, hoping that when A came up again it would catch up with B (A 20 million, B 20 million) and that hot-standby streaming would resume. But my experiments failed to do so.
I've checked the log and found that the timeline is invalid. Slave B's log reports that the timeline of the primary server (Master A) does not match the target timeline of the standby server. Can anyone suggest how to handle this case? Any suggestions will be greatly appreciated. Thank you.
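
The timeline difference is visible in the WAL file names themselves: the first eight hexadecimal digits of a 24-character WAL segment file name are the timeline ID, and a standby that has been promoted starts writing on a new timeline. A minimal sketch for reading that ID, assuming the standard segment naming; the function name and the sample segment names are made up for illustration:

```python
def wal_timeline(segment_name: str) -> int:
    """Return the timeline ID encoded in a WAL segment file name.

    A segment name is 24 hex digits; the first 8 are the timeline ID.
    """
    if len(segment_name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    return int(segment_name[:8], 16)


# Hypothetical segment names: one written by A before the failover,
# one written by B after it was promoted.
before_failover = "000000010000000000000003"
after_failover = "000000020000000000000004"

print(wal_timeline(before_failover), wal_timeline(after_failover))
```

Comparing the values for segments from A and from B shows whether the two servers are still on the same timeline.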
