Thread: [GENERAL] pg_rewind - restore new slave failed to startup during recovery

[GENERAL] pg_rewind - restore new slave failed to startup during recovery

From
Dylan Luong
Date:

Hi

I have a 1 master / 1 slave WAL streaming replication setup, and the application connects via a load balancer (LTM) where all connections are redirected to the master member (master db).

 

We have archive_mode enabled.

 

I am testing the use of pg_rewind to restore the new slave (old master) after a failover while the system is under load.

 

Here are the steps I take to test:

 

1.       Disable the master LTM member (all connections redirected to the slave member)

2.       Promote slave (touch promote.me)

3.       Stop the master db (old master)

4.       Do pg_rewind on the new slave (old master)

5.       Start the new slave.
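
Roughly, at the command level, the sequence looks like this (a sketch only; the data directory path and the pg_rewind connection string are placeholders, not my actual settings):

# 2. Promote the slave (new master) via the trigger file from recovery.conf
touch /var/run/promote_me

# 3. Stop the old master cleanly
pg_ctl -D /path/to/pgdata stop -m fast

# 4. Rewind the old master against the promoted standby
pg_rewind --target-pgdata=/path/to/pgdata \
          --source-server='host=10.69.19.18 user=postgres dbname=postgres'

# 5. Put recovery.conf in place, then start the rewound node as the new slave
pg_ctl -D /path/to/pgdata start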

 

 

Please see the attached psql.jpg for the result of the pg_rewind.

 

 

However, when I try to start the new slave, I get errors saying that it cannot locate the archived WAL files and cannot receive data from the WAL stream.

Please see attached  logs.jpg.

 

Checking on the new master, I see that the checkpoint file it is trying to restore is 000000040000009C0000006F, but the file does not exist anywhere on the new master, neither in pg_xlog nor in the archive folder (as specified in postgresql.conf).
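
A quick way to double-check that, with the archive path from postgresql.conf and the data directory as a placeholder:

# On the new master: is the segment in pg_xlog or in the archive?
ls -l /path/to/pgdata/pg_xlog/000000040000009C0000006F
ls -l /pg_backup/backup/archive/000000040000009C0000006F

Neither location has the file.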

 

Please see attached  psql.jpg.

 

 

Here is my recovery.conf :

 

standby_mode = 'on'

primary_conninfo = 'host=10.69.19.18  user=replicant'

trigger_file = '/var/run/promote_me'

restore_command = 'cp /pg_backup/backup/archive_sync/%f "%p"'

 

Does anyone know why?

 

Under what conditions will pg_rewind not work?

 

Thanks.

Regards

Dylan

 

Attachment

Re: [GENERAL] pg_rewind - restore new slave failed to startup during recovery

From
Michael Paquier
Date:
On Tue, Aug 22, 2017 at 9:52 AM, Dylan Luong <Dylan.Luong@unisa.edu.au> wrote:
> I have 1 master and 1 slave WAL streaming replication setup and the
> application connects via a load balancer (LTM) where all connections are
> redirected to the master member (master db).
>
> We have archive_mode enabled.

First things first. What is the version of PostgreSQL involved here?

> I am testing the use of pg_rewind to restore the new slave (old master)
> after a failover while the system is under load.

Don't worry. pg_rewind works :)

> Here are the steps I take to test:
>
> 1.       Disable the master LTM member (all connections redirected to slave
> member)
> 2.       Promote slave (touch promote.me)
> 3.       Stop the master db (old master)
> 4.       Do pg_rewind on the new slave (old master)
> 5.       Start the new slave.

That flow looks correct to me. Now, I think that you should manually
trigger a checkpoint after step 2 on the promoted standby, so that its
control file gets forcibly updated with its new timeline number. This
is a small but critical point people usually miss. The documentation
of pg_rewind does not mention this point when using a live source
server, and many people have fallen into this trap up to now... We
should really mention that in the docs. What do others think?
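
In practice that extra step is just a manual CHECKPOINT on the promoted standby, for example (connection parameters are placeholders):

# Run on the promoted standby right after promotion, before pg_rewind
psql -h 10.69.19.18 -U postgres -c "CHECKPOINT;"

That forces the control file to be rewritten with the new timeline before pg_rewind reads it.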

> Checking on the new master, I see that the checkpoint file it is trying
> to restore is 000000040000009C0000006F, but the file does not exist
> anywhere on the new master, neither in pg_xlog nor in the archive folder
> (as specified in postgresql.conf).

4 is the number of the last timeline the promoted standby has been using, right?

> Please see attached  psql.jpg.
>
> Here is my recovery.conf :
> standby_mode = 'on'
> primary_conninfo = 'host=10.69.19.18  user=replicant'
> trigger_file = '/var/run/promote_me'
> restore_command = 'cp /pg_backup/backup/archive_sync/%f "%p"'
>
> does anyone know why?

What are the contents of /pg_backup/backup/archive_sync/? Are you sure
that the promoted standby has correctly archived, for example, the
first segment of its new timeline?
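
Something like this on each node would show it (paths taken from your restore_command and archive location):

# On the rewound node: what the restore_command can actually see
ls /pg_backup/backup/archive_sync/ | tail

# On the promoted standby: has anything on timeline 4 been archived yet?
ls /pg_backup/backup/archive/ | grep '^00000004' | head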

> Under what conditions will pg_rewind not work?

Even a single missing WAL segment would prevent a base backup or a
rewound node from reaching a consistent point, so you need to be
careful about the contents of your archives. And a failover done
correctly is a tricky thing: it can easily fail if you don't issue a
checkpoint immediately on the promoted standby and pg_rewind kicks in
before an automatic checkpoint happens (because of timeout or volume,
whichever comes first).
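
A simple way to verify that the promoted standby is ready before running pg_rewind is to look at its control file (data directory path is a placeholder):

# On the promoted standby: the latest checkpoint's TimeLineID should already
# show the new timeline (4 in your case) before pg_rewind uses it as source
pg_controldata /path/to/pgdata | grep -i timeline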
--
Michael


Re: [GENERAL] pg_rewind - restore new slave failed to startup during recovery

From
Dylan Luong
Date:
Thanks Michael.

> First things first. What is the version of PostgreSQL involved here?

The PostgreSQL is version 9.6.

>4 is the number of the last timeline the promoted standby has been using, right?

The history file in pg_xlog is dated at the time of promotion on the standby (current master):
-rw-------. 1 postgres postgres      131 Aug 21 13:26 00000004.history

$ more 00000004.history
1       20/5C000098     no recovery target specified
2       76/F8000098     no recovery target specified
3       9C/7CC50680     no recovery target specified

> What are the contents of /pg_backup/backup/archive_sync/?

The archive folder is /pg_backup/backup/archive. I ftp'ed all the contents of the /pg_backup/backup/archive folder
from the new master to /pg_backup/backup/archive_sync on the new slave.
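
Roughly this, with the host name as a placeholder:

# Copy the archived WAL from the new master into the directory the new
# slave's restore_command reads from
rsync -av newmaster:/pg_backup/backup/archive/ /pg_backup/backup/archive_sync/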
 


Re: [GENERAL] pg_rewind - restore new slave failed to startup during recovery

From
Magnus Hagander
Date:
On Tue, Aug 22, 2017 at 3:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Tue, Aug 22, 2017 at 9:52 AM, Dylan Luong <Dylan.Luong@unisa.edu.au> wrote:
> 1.       Disable the master LTM member (all connections redirected to slave
> member)
> 2.       Promote slave (touch promote.me)
> 3.       Stop the master db (old master)
> 4.       Do pg_rewind on the new slave (old master)
> 5.       Start the new slave.

That flow looks correct to me. Now, I think that you should manually
trigger a checkpoint after step 2 on the promoted standby, so that its
control file gets forcibly updated with its new timeline number. This
is a small but critical point people usually miss. The documentation
of pg_rewind does not mention this point when using a live source
server, and many people have fallen into this trap up to now... We
should really mention that in the docs. What do others think?

If the documentation is missing such a clearly critical step, then I would say that's a definite documentation bug and it needs to be fixed. We can't really fault people for missing a small detail if we didn't document the small detail...

--

Re: [GENERAL] pg_rewind - restore new slave failed to startup during recovery

From
Michael Paquier
Date:
On Tue, Aug 22, 2017 at 11:39 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Tue, Aug 22, 2017 at 3:06 AM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>> That flow looks correct to me. Now, I think that you should manually
>> trigger a checkpoint after step 2 on the promoted standby, so that its
>> control file gets forcibly updated with its new timeline number. This
>> is a small but critical point people usually miss. The documentation
>> of pg_rewind does not mention this point when using a live source
>> server, and many people have fallen into this trap up to now... We
>> should really mention that in the docs. What do others think?
>
> If the documentation is missing such a clearly critical step, then I would
> say that's a definite documentation bug and it needs to be fixed. We can't
> really fault people for missing a small detail if we didn't document the
> small detail...

What do you think about the attached? I would recommend a back-patch
down to 9.5 to get the documentation right everywhere, but I also
think that this may not be enough. We could also document an example
of a full-fledged failover flow in the Notes, in short:
1) Promote a standby.
2) Stop the old master cleanly. If it has been killed abruptly, let it
finish crash recovery once and then stop it, so that its WAL goes past
the point where WAL forked at promotion (the shutdown checkpoint
record is at least there).
3) Prepare the source server for the rewind.
3-1) Using file copy, stop the source server (promoted standby) cleanly first.
3-2) Using SQL (live source server), issue a checkpoint on the source
server to update its control file, making sure that the timeline
number is up-to-date on disk.
4) Perform the actual rewind. This needs WAL segments on the target
from the point where WAL forked up to the shutdown checkpoint record
created at step 2).
5) Create recovery.conf on the target and point it at the source for
streaming, or at the archives, then let it perform recovery.
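
Put together as commands, that flow would look something like this (a sketch only; paths, host names and users are placeholders):

# 1) Promote the standby
pg_ctl -D /standby/pgdata promote

# 2) Stop the old master cleanly (let it finish crash recovery first if it
#    went down hard)
pg_ctl -D /oldmaster/pgdata stop -m fast

# 3-2) Live source server: force a checkpoint so the control file carries
#      the new timeline number
psql -h promoted-standby -U postgres -c "CHECKPOINT;"

# 4) Rewind the old master against the promoted standby
pg_rewind --target-pgdata=/oldmaster/pgdata \
          --source-server='host=promoted-standby user=postgres dbname=postgres'

# 5) Create recovery.conf on the rewound node, then start it and let it
#    catch up from the new master
pg_ctl -D /oldmaster/pgdata start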
--
Michael

Attachment