Thread: Switchover of Master and Slave roles

From
A J
Date:
Hello,
I am trying to switch the master and slave roles in a test I am doing with streaming replication in 9.1 beta.

To start with, I have one master (N1 node) and one slave (N2 node).

I stop N1 and promote N2 as primary (by touching the trigger file).
Now I want N1 to come back up as a slave of N2. How should I do that?

I tried creating a recovery.conf in N1's directory, along with cleaning up its pg_xlog and pg_log.
But when I start N1 and try to connect to it, I get the message:
psql: FATAL:  the database system is starting up

And the log file of N1 says:
FATAL:  timeline 2 of the primary does not match recovery target timeline 1

Any ideas on what I could be missing?

Thanks,
AJ

Re: Switchover of Master and Slave roles

From
Ray Stell
Date:
On Tue, Jun 07, 2011 at 09:32:31AM -0700, A J wrote:
> Hello,
> I am trying to switch the master and slave roles in a test I am doing with
> streaming replication in 9.1 beta.
>

http://www.postgresql.org/docs/9.1/static/warm-standby-failover.html

Once failover to the standby occurs, there is only a single server in
operation. This is known as a degenerate state. The former standby is
now the primary, but the former primary is down and might stay down. To
return to normal operation, a standby server must be recreated, either
on the former primary system when it comes up, or on a third, possibly
new, system. Once complete, the primary and standby can be considered
to have switched roles.

Re: Switchover of Master and Slave roles

From
A J
Date:
Ray Stell <stellr@cns.vt.edu> wrote (Tue, June 7, 2011 2:12:44 PM):
> To return to normal operation, a standby server must be recreated,
> either on the former primary system when it comes up

What exactly does it mean to 'recreate a standby server'? Can I not use the datafiles on the former primary and just let it sync and get the incremental changes from the new primary? Or do I have to remove all the datafiles from the former primary and fetch all of them again through rsync (or a similar tool) from the new primary?

Re: Switchover of Master and Slave roles

From
Guillaume Lelarge
Date:
On Tue, 2011-06-07 at 11:33 -0700, A J wrote:
> >>To return to normal operation, a standby server must be recreated,
> either on the former primary system when it comes up<<
>
>
> What does it exactly mean to 'recreate a standby server' ? Can I not
> use the datafiles on the former primary and just let it sync and get
> the incremental from the new primary ? Do I have to remove all the
> data files from the former primary and get all the datafiles through
> rsync (or other similar manner) from the new primary ?
>
>

Don't remove them. This way, rsync will only send the changed files.



--
Guillaume
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com


Re: Switchover of Master and Slave roles

From
"Kevin Grittner"
Date:
A J <s5aly@yahoo.com> wrote:

> What does it exactly mean to 'recreate a standby server' ? Can I
> not use the datafiles on the former primary and just let it sync
> and get the incremental from the new primary ? Do I have to remove
> all the data files from the former primary and get all the
> datafiles through rsync (or other similar manner) from the new
> primary ?

The old server will probably be close enough that judicious use of
rsync with the old copy as the target will allow a base backup to be
taken very quickly.
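
A minimal sketch of what such an rsync-based base backup could look like, written as a small Python helper that only prints the commands instead of running them. The host name, data directory, and backup label are hypothetical, and the exclude list follows the usual base-backup advice rather than anything stated in this thread. It assumes the old primary is shut down and that the commands are run from it, pulling from the new primary.

```python
import shlex


def resync_plan(new_primary, data_dir, label="resync"):
    """Return ordered (where, argv) steps for an rsync-based base backup.

    new_primary and data_dir are placeholders; nothing is executed here.
    """
    rsync = [
        "rsync", "-a", "--delete",
        "--exclude=postmaster.pid",    # belongs to the running new primary
        "--exclude=postmaster.opts",
        "--exclude=pg_xlog/*",         # WAL is replayed from archive/stream instead
        f"{new_primary}:{data_dir}/",  # source: the new primary's data directory
        f"{data_dir}/",                # target: the old copy, so only changes travel
    ]
    return [
        ("new primary", ["psql", "-h", new_primary, "-c",
                         f"SELECT pg_start_backup('{label}')"]),
        ("old primary", rsync),
        ("new primary", ["psql", "-h", new_primary, "-c",
                         "SELECT pg_stop_backup()"]),
    ]


for where, argv in resync_plan("n2.example.com", "/var/lib/pgsql/data"):
    print(f"[{where}] {' '.join(shlex.quote(a) for a in argv)}")
```

Because pg_xlog is excluded, stale WAL segments left on the old primary are not touched by this rsync call and would have to be cleaned up separately, as the original poster did by hand.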

-Kevin

Re: Switchover of Master and Slave roles

From
A J
Date:
Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote (Tue, June 7, 2011 2:44:41 PM):
> The old server will probably be close enough that judicious use of
> rsync with the old copy as the target will allow a base backup to be
> taken very quickly.

Ok. So if I understand it correctly, as far as Postgres is concerned, the 'mirror is broken'. It is a one-time cutover.
We then rely on filesystem tools (or other third-party tools) to get the original master in sync with the new master efficiently, and then make it join as a slave.
Right?

Re: Switchover of Master and Slave roles

From
"Kevin Grittner"
Date:
A J <s5aly@yahoo.com> wrote:

> Ok. So if I understand it correctly, as far as Postgres is
> concerned the 'mirror is broken'. It is a one-time cutover.
> We then rely on filesystem tools (or other third party tools) to
> get the original master in sync with the new master efficiently
> and then make it join as slave. Right ?

Right.  There's been talk of building in something, but it would
take a lot of work to get something which would be as fast as rsync.
So far nobody has stepped up offering to re-invent that wheel.

-Kevin

Re: Switchover of Master and Slave roles

From
Ray Stell
Date:
On Tue, Jun 07, 2011 at 11:59:07AM -0700, A J wrote:
> We then rely on filesystem tools (or other third party tools) to get the
> original master in sync with the new master efficiently and then make it join as
> slave.

The doc would not be corrupted by a few additional words stating as much:

"To return to normal operation, a standby server must be recreated using
external tools such as rsync, either on the former primary system when
it comes up, or on a third, possibly new, system."

Re: Switchover of Master and Slave roles

From
Simon Riggs
Date:
On Tue, Jun 7, 2011 at 8:41 PM, Ray Stell <stellr@cns.vt.edu> wrote:
> On Tue, Jun 07, 2011 at 11:59:07AM -0700, A J wrote:
>> We then rely on filesystem tools (or other third party tools) to get the
>> original master in sync with the new master efficiently and then make it join as
>> slave.
>
> the doc would not be corrupted by an additional few words that stated as such:
>
> "To return to normal operation, a standby server must be recreated using
> external tools such as rsync, either on the former primary system when
> it comes up, or on a third, possibly new, system."

You need repmgr: http://www.repmgr.org/

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Switchover of Master and Slave roles

From
Jerry Sievers
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:

> On Tue, Jun 7, 2011 at 8:41 PM, Ray Stell <stellr@cns.vt.edu> wrote:
>
>> On Tue, Jun 07, 2011 at 11:59:07AM -0700, A J wrote:
>>> We then rely on filesystem tools (or other third party tools) to get the
>>> original master in sync with the new master efficiently and then make it join as
>>> slave.
>>
>> the doc would not be corrupted by an additional few words that stated as such:
>>
>> "To return to normal operation, a standby server must be recreated using
>> external tools such as rsync, either on the former primary system when
>> it comes up, or on a third, possibly new, system."
>
> You need repmgr: http://www.repmgr.org/

Well, I am not sure about the 9.x versions, but we used to do reversible
master/standby with log shipping on the 8.x versions quite routinely,
sometimes reshaping clusters of multiple standbys: designating a new
master, restarting the old master in recovery mode, and repointing the
remaining standbys.

The procedure was similar to...

apps down and locked out
pg_switch_xlog() on master
verify final log received on standbys and consumed
bring up new master
kill immediate on old master and remaining standbys
configure old master as standby w/recovery.conf and point to new xlog
    repository
repoint other standbys to new repository and restart

There was no need to reinitialize from scratch or rsync the old master
or other standbys.

Somewhere between 8.2 and 8.4 there was a change that would require
you to increment the recovery target timeline during this operation.
It took me a bit of head scratching to get it right.
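
For readers puzzled by the "timeline 2 of the primary does not match recovery target timeline 1" error earlier in the thread, here is a small sketch of how timeline IDs show up in file names. A WAL segment name is 24 hex digits whose first 8 digits are the timeline ID, and a promotion writes a history file named after the new timeline (for example 00000002.history). The file names in the sample list are made up for illustration, and the helper names are hypothetical.

```python
import re

# WAL segment: 8 hex digits of timeline ID followed by 16 more hex digits.
_SEGMENT = re.compile(r"^([0-9A-F]{8})[0-9A-F]{16}$")
# Timeline history file: <timeline ID>.history
_HISTORY = re.compile(r"^([0-9A-F]{8})\.history$")


def timeline_of(filename):
    """Return the timeline ID encoded in a WAL or history file name, else None."""
    for pattern in (_SEGMENT, _HISTORY):
        match = pattern.match(filename)
        if match:
            return int(match.group(1), 16)
    return None


def latest_timeline(filenames):
    """Return the highest timeline ID seen among the given file names, or None."""
    ids = [t for t in map(timeline_of, filenames) if t is not None]
    return max(ids) if ids else None


archive = [
    "000000010000000000000003",
    "00000002.history",
    "000000020000000000000004",
    "000000010000000000000002.00000020.backup",  # backup label file, ignored
]
print(latest_timeline(archive))  # 2
```

A server still recovering along timeline 1 will not follow a primary that has moved on to timeline 2, which is why the recovery target timeline has to be raised.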

This capability was apparently little known and little used, perhaps
due to uncertainty that it was viable, given its undocumented nature.

Disclaimer: I do not know if this still works on 9.x using either log
shipping or streaming replication.

I covered this strategy briefly during my talk at the 2010 Pg-East
conference in Philly.

YMMV



> --
>  Simon Riggs                   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin
>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: gsievers19@comcast.net
p: 305.321.1144

Re: Switchover of Master and Slave roles

From
Fujii Masao
Date:
On Wed, Jun 8, 2011 at 3:59 AM, A J <s5aly@yahoo.com> wrote:
> Ok. So if I understand it correctly, as far as Postgres is concerned the
> 'mirror is broken'. It is a one-time cutover.
> We then rely on filesystem tools (or other third party tools) to get the
> original master in sync with the new master efficiently and then make it
> join as slave.

If the standby is completely in sync with the master when failover happens,
and if those two servers share the archive area, you can start the new
standby without taking a fresh base backup. You can do that by setting
recovery_target_timeline to 'latest' in the recovery.conf and just starting
the new standby.
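
As a rough illustration of the recovery.conf described above, the following sketch renders one from a few inputs. The host name, replication user, port, and archive path are all hypothetical; only recovery_target_timeline = 'latest' and the shared archive come from this message, and the remaining settings are the usual ones for a streaming standby on 9.1.

```python
def render_recovery_conf(primary_host, archive_dir, port=5432, user="replication"):
    """Render recovery.conf text for the old primary rejoining as a standby.

    All argument values are placeholders chosen for illustration.
    """
    settings = [
        ("standby_mode", "on"),
        ("primary_conninfo", f"host={primary_host} port={port} user={user}"),
        # The shared archive is where the timeline history file is fetched from.
        ("restore_command", f"cp {archive_dir}/%f %p"),
        # Follow whichever timeline is newest instead of staying on the old one.
        ("recovery_target_timeline", "latest"),
    ]
    return "".join(f"{key} = '{value}'\n" for key, value in settings)


print(render_recovery_conf("n2.example.com", "/mnt/wal_archive"), end="")
```

The output would be written to recovery.conf in the old primary's data directory before starting it.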

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center