On Wed, Oct 13, 2010 at 5:22 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Oct 13, 2010 at 3:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> There's another problem here we should think about, too. Suppose you
>> have a master and two standbys. The master dies. You promote one of
>> the standbys, which turns out to be behind the other. You then
>> repoint the other standby at the one you promoted. Congratulations,
>> your database is now very possibly corrupt, and you may very well get
>> no warning of that fact. It seems to me that we would be well-advised
>> to install some kind of bullet-proof safeguard against this kind of
>> problem, so that you will KNOW that the standby needs to be re-synced.
>
> Yep. This is why I said it's not easy to implement that.
>
> To start the standby without taking a base backup from the new master
> after failover, the user basically has to promote the standby which is
> ahead of the other standbys (e.g., by comparing
> pg_last_xlog_replay_location on each standby).
>
> As the safeguard, we seem to need to compare the location at the switch
> of the timeline on the master with the last replay location on the standby.
> If the latter location is ahead AND the timeline ID of the standby is not
> the same as that of the master, we should emit a warning and terminate
> the replication connection.

That doesn't seem very bullet-proof. You can accidentally corrupt a
standby even when only one timeline is involved. AFAIK, stopping a
standby, removing recovery.conf, and starting it up again does not
change the timeline ID. You can even shut down the standby, bring it
up as a master, generate a little WAL, shut it back down, and bring it
back up as a standby pointing to the same master, and a check based on
timeline IDs alone will not catch that. It would be nice to embed in
each checkpoint record an identifier that changes randomly on each
transition to normal running, so that if you do something like this we
can notice and complain loudly.
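
To make the idea concrete, here is a toy model in Python. It is only a
sketch, not PostgreSQL code: Node, run_ids, new_run_id, and
standby_may_follow are made-up names, and it assumes the master can
look up every identifier it has ever written into its own checkpoint
records.

```python
import secrets
from dataclasses import dataclass, field


def new_run_id() -> int:
    """Hypothetical helper: pick a fresh random 64-bit identifier."""
    return secrets.randbits(64)


@dataclass
class Node:
    """Toy stand-in for a server; not a real PostgreSQL structure."""
    timeline_id: int = 1
    # Identifiers found in the checkpoint records of this node's WAL,
    # oldest first.
    run_ids: list = field(default_factory=list)

    def start_normal_running(self) -> None:
        # Each transition to normal running gets a new identifier,
        # even when the timeline ID stays the same.
        self.run_ids.append(new_run_id())

    def clone_as_standby(self) -> "Node":
        # A standby replaying this node's WAL sees the same identifiers.
        return Node(self.timeline_id, list(self.run_ids))

    def latest_run_id(self):
        return self.run_ids[-1] if self.run_ids else None


def standby_may_follow(master: Node, standby: Node) -> bool:
    # Refuse the connection if the standby's latest checkpoint carries
    # an identifier that the master never generated.
    return standby.latest_run_id() in master.run_ids


master = Node()
master.start_normal_running()
standby = master.clone_as_standby()
print(standby_may_follow(master, standby))   # standby is a faithful copy

# The standby is briefly run as a master, then pointed back at the
# original master. The timeline ID is unchanged, but the identifier is not.
standby.start_normal_running()
print(standby.timeline_id == master.timeline_id)
print(standby_may_follow(master, standby))   # divergence is detected
```

In this model the timeline IDs still match after the standby has been
run as a master, which is exactly the case a timeline comparison
misses, while the identifier check refuses the connection.
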
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company