Re: Failback to old master - Mailing list pgsql-hackers

From didier
Subject Re: Failback to old master
Date
Msg-id CAJRYxuJ7nBoQa3B_vW_8PwLx40VJn1tLKNPzMX6SrCKLXt=YpA@mail.gmail.com
In response to Re: Failback to old master  ("Maeldron T." <maeldron@gmail.com>)
Responses Re: Failback to old master  ("Maeldron T." <maeldron@gmail.com>)
List pgsql-hackers
Hi,


On Sat, Nov 15, 2014 at 5:31 PM, Maeldron T. <maeldron@gmail.com> wrote:
>> A safely shut down master (-m fast is safe) can be safely restarted as
>> a slave to the newly promoted master. Fast shutdown shuts down all
>> normal connections, does a shutdown checkpoint and then waits for this
>> checkpoint to be replicated to all active streaming clients. Promoting
>> slave to master creates a timeline switch, that prior to version 9.3
>> was only possible to replicate using the archive mechanism. As of
>> version 9.3 you don't need to configure archiving to follow timeline
>> switches, just add a recovery.conf to the old master to start it up as
>> a slave and it will fetch everything it needs from the new master.
>>
> I took your advice and I understood that removing the recovery.conf followed
> by a restart is wrong. I will not do that on my production servers.
>
> However, I can't make it work with promotion. What did I do wrong? It was
> 9.4beta3.
>
> mkdir 1
> mkdir 2
> initdb -D 1/
> <edit config: change port, wal_level to hot_standby, hot_standby to on,
> max_wal_senders=7, wal_keep_segments=100, uncomment replication in hba.conf>
> pg_ctl -D 1/ start
> createdb -p 5433
> psql -p 5433
> pg_basebackup -p 5433 -R -D 2/
> mcedit 2/postgresql.conf <change port>
> chmod -R 700 1
> chmod -R 700 2
> pg_ctl -D 2/ start
> psql -p 5433
> psql -p 5434
> <everything works>
> pg_ctl -D 1/ stop
> pg_ctl -D 2/ promote
> psql -p 5434
> cp 2/recovery.done 1/recovery.conf
> mcedit 1/recovery.conf <change port>
> pg_ctl -D 1/ start
>
> LOG:  replication terminated by primary server
> DETAIL:  End of WAL reached on timeline 1 at 0/3000AE0.
> LOG:  restarted WAL streaming at 0/3000000 on timeline 1
> LOG:  replication terminated by primary server
> DETAIL:  End of WAL reached on timeline 1 at 0/3000AE0.
>
> This is what I experienced in the past when I tried with promote. The old
> master disconnects from the new. What am I missing?
>
I think you have to add
recovery_target_timeline = '2'
to recovery.conf on the old master,
where '2' is the timeline of the new primary after promotion.
See http://www.postgresql.org/docs/9.4/static/recovery-target-settings.html
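
As a rough sketch of that change, assuming the old master's data directory
is 1/ as in your steps and that recovery.conf there was already copied from
2/recovery.done with the port adjusted, something like the following should
do. The OLD_MASTER_DIR variable is only a name I made up for illustration.
Instead of hard-coding '2' you can also use 'latest', which the docs describe
as following the newest timeline and which is convenient for a standby:

```shell
# Data directory of the old master; "1" matches the quoted steps (adjust as needed).
OLD_MASTER_DIR="${OLD_MASTER_DIR:-1}"
mkdir -p "$OLD_MASTER_DIR"   # no-op when the directory already exists

# Append the timeline setting to the existing recovery.conf.
# '2' would pin the timeline explicitly; 'latest' follows the newest one.
echo "recovery_target_timeline = 'latest'" >> "$OLD_MASTER_DIR/recovery.conf"

# Then start the old master as a standby, as in the quoted steps:
# pg_ctl -D "$OLD_MASTER_DIR" start
```

I have not run this against your exact setup, so treat it as a starting
point rather than a verified fix.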

Didier


