Thread: Promoting sync slave to master without incrementing timeline counter?
Hi all,
Given a cluster of three database servers running 9.1.3 (master, sync slave, async slave), it seems that there are two ways to promote the sync slave to become master:
1. pg_ctl promote the sync slave (increments timeline counter)
2. remove recovery.conf on the sync slave and pg_ctl restart (does not increment timeline counter)
The sync slave becomes master more quickly using `pg_ctl promote`, but now every server in the cluster has to take a new base backup due to the incremented timeline.
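To make the "incremented timeline" concrete: the first 8 hex digits of a 24-character WAL segment file name encode the timeline ID, so segments written after `pg_ctl promote` carry a different prefix from those written before it. Below is a minimal sketch, assuming standard WAL segment naming; the helper names and sample file names are hypothetical and only for illustration.

```python
def wal_timeline(segment_name: str) -> int:
    """Return the timeline ID encoded in a WAL segment file name.

    Assumes the usual 24-hex-digit name: 8 digits of timeline ID
    followed by 16 digits identifying the segment.
    """
    if len(segment_name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    # int(..., 16) raises ValueError if the prefix is not valid hex.
    return int(segment_name[:8], 16)


def same_timeline(segment_a: str, segment_b: str) -> bool:
    """True if both WAL segments belong to the same timeline."""
    return wal_timeline(segment_a) == wal_timeline(segment_b)


# Hypothetical segment names: one written before promotion, one after.
before_promote = "000000010000000000000003"
after_promote = "000000020000000000000004"

print(wal_timeline(before_promote), wal_timeline(after_promote))
print(same_timeline(before_promote, after_promote))
```

A standby still replaying segments from the older timeline cannot simply continue onto the new one in 9.1, which is why option 1 forces a new base backup.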
2ndQuadrant's repmgr uses the second option so that the async slave can "follow" the new master, saving you from having to do a new base backup. Additionally, the old master is able to start streaming replication from the new master without a new base backup. (Repmgr does not actually support the latter behavior out of the box, but it seemed to work.)
So, given a hard failure (e.g. power loss) of the master, `pg_ctl promote` provides availability more quickly, but `pg_ctl restart` provides data redundancy more quickly. Is this an accurate assessment of the tradeoffs between the two approaches? I've found talk on the mailing lists about future support for slaves following timelines after a new master completes recovery, but I have been unable to find anything discussing the approach used by repmgr. Are there risks associated with the `pg_ctl restart` approach, or is it safe to use?
Cheers,
Dave
On Thu, Jun 21, 2012 at 10:10 AM, David Pirotte <dpirotte@gmail.com> wrote:
>
> 2ndQuadrant's repmgr uses the second option so that the async slave can
> "follow" the new master, saving you from having to do a new base backup.
> Additionally, the old master is able to start streaming replication from the
> new master without a new base backup. (Repmgr does not actually support the
> latter behavior out of the box, but it seemed to work.)
>

It is not safe to make the old master start SR from the new master without any additional action. If the old master crashed or was disconnected before some info was sent to the slave, then the old master has info that is not in the slave, so when the slave becomes the new master that piece of info is lost. If the old master now tries to connect to the new master, it will expect that info to exist.

> So, given a hard failure (i.e. power loss) of the master, `pg_ctl promote`
> provides availability more quickly, but `pg_ctl restart` provides data
> redundancy more quickly. Is this an accurate assessment of the tradeoffs
> between the two approaches?

Yes, I think that's pretty much the difference.

> Are there risks associated with the `pg_ctl
> restart` approach, or is it safe to use?
>

It's safe as long as you let repmgr do it ;)

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: 24x7 support and training
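The divergence risk described in this reply (the old master holding WAL that never reached the standby before the crash) can be sketched as a comparison of two WAL locations. This is only an illustration under stated assumptions: locations are taken to be in the textual `X/Y` hex form that PostgreSQL reports, the function names are hypothetical, and how the two values are actually obtained (for example from `pg_last_xlog_receive_location()` on the standby before promotion) is left out.

```python
def parse_location(text: str) -> tuple:
    """Split an 'X/Y' WAL location into a comparable (high, low) tuple."""
    high, low = text.split("/")
    return (int(high, 16), int(low, 16))


def old_master_has_extra_wal(old_master_end: str, standby_received_end: str) -> bool:
    """True if the old master wrote WAL past what the standby received.

    When this holds, the old master contains changes the promoted standby
    never saw, so attaching it to the new master as-is is unsafe.
    """
    return parse_location(old_master_end) > parse_location(standby_received_end)


# Hypothetical values: the old master got slightly ahead before losing power.
print(old_master_has_extra_wal("0/3000060", "0/3000000"))
```

Comparing the two halves as a tuple avoids any assumption about how the halves combine into a single byte offset.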
On 21 June 2012 16:10, David Pirotte <dpirotte@gmail.com> wrote:

> So, given a hard failure (i.e. power loss) of the master, `pg_ctl promote`
> provides availability more quickly, but `pg_ctl restart` provides data
> redundancy more quickly.

Not sure where this idea of "more quickly" comes from. Can you explain?

> Are there risks associated with the `pg_ctl
> restart` approach, or is it safe to use?

PostgreSQL supports both, why do you mention just one of them as a potential risk?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services