Re: pg_rewind, a tool for resynchronizing an old master after failover - Mailing list pgsql-hackers

From Thom Brown
Subject Re: pg_rewind, a tool for resynchronizing an old master after failover
Msg-id CAA-aLv6cd-JpN2qkBotcKkMiFuab-xjMDRBpoQ26ySys4X1SnQ@mail.gmail.com
In response to pg_rewind, a tool for resynchronizing an old master after failover  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: pg_rewind, a tool for resynchronizing an old master after failover  (Thom Brown <thom@linux.com>)
List pgsql-hackers
On 23 May 2013 07:10, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> Hi,
>
> I've been hacking on a tool to allow resynchronizing an old master server
> after failover. The need to do a full backup/restore has been a common
> complaint ever since we've had streaming replication. I saw on the wiki that
> this was discussed in the dev meeting; too bad I couldn't make it.
>
> In a nutshell, the idea is to copy everything that has changed between
> the clusters, like rsync does, but instead of reading through all files, use
> the WAL to determine what has changed. Here's a somewhat more detailed
> explanation, from the README:
>
> Theory of operation
> -------------------
>
> The basic idea is to copy everything from the new cluster to the old, except
> for the blocks that we know to be the same.
>
> 1. Scan the WAL log of the old cluster, starting from the point where
> the new cluster's timeline history forked off from the old cluster. For each
> WAL record, make a note of the data blocks that are touched. This yields a
> list of all the data blocks that were changed in the old cluster, after the
> new cluster forked off.
>
> 2. Copy all those changed blocks from the new master to the old master.
>
> 3. Copy all other files, like clog and configuration files, from the new
> cluster to the old. Everything except the relation files.
>
> 4. Apply the WAL from the new master, starting from the checkpoint
> created at failover. (pg_rewind doesn't actually apply the WAL; it just
> creates a backup label file indicating that when PostgreSQL is started, it
> will start replay from that checkpoint and apply all the required WAL)
>
>
> Please take a look: https://github.com/vmware/pg_rewind

6 instances set up:

[Primary (5530)]
   |
   ---[Standby 1 (5531)]
           |
           ---[Standby 2 (5532)]
           |
           ---[Standby 3 (5533)]
           |
           ---[Standby 4 (5534)]
           |
           ---[Standby 5 (5535)]
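For anyone wanting to reproduce this, the setup was along these lines (a
sketch, not a transcript; the config values are the usual 9.3
streaming-replication settings, not copied from my actual files):

$ initdb -D /tmp/primary
$ # primary postgresql.conf: port = 5530, wal_level = hot_standby,
$ #   max_wal_senders = 10, hot_standby = on
$ #   (plus a replication entry in pg_hba.conf)
$ pg_ctl -D /tmp/primary start
$ pg_basebackup -D /tmp/standby1 -p 5530
$ # standby1 postgresql.conf: port = 5531; standby1 recovery.conf:
$ #   standby_mode = 'on'
$ #   primary_conninfo = 'host=localhost port=5530'
$ pg_ctl -D /tmp/standby1 start
$ # Standbys 2-5: same recipe, but based off Standby 1
$ #   (primary_conninfo port=5531), listening on ports 5532-5535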

1) Created a table on the primary with some data.
2) Promoted Standby 1
3) Cleanly shut down Primary
4) pg_rewind --target-pgdata=/tmp/primary
--source-server='host=localhost port=5531 dbname=postgres'

Last common WAL position: 0/30227F8 on timeline 1
Last common checkpoint at 0/30227F8 on timeline 1
error reading xlog record: record with zero length at 0/3022860
Done!
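
(For completeness, steps 1-3 were just the following; the table name and
row contents are incidental:)

$ psql -p 5530 -d postgres -c "CREATE TABLE t AS SELECT generate_series(1,1000) AS id;"
$ pg_ctl -D /tmp/standby1 promote
$ pg_ctl -D /tmp/primary stop -m fast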

Contents of pg_xlog directory in Primary and Standby 1:
thom@swift /tmp $ ls -l primary/pg_xlog/
total 49156
-rw------- 1 thom users 16777216 May 23 09:52 000000010000000000000002
-rw------- 1 thom users 16777216 May 23 09:52 000000010000000000000003
-rw------- 1 thom users 16777216 May 23 09:52 000000020000000000000003
-rw------- 1 thom users       41 May 23 09:52 00000002.history
drwx------ 2 thom users       80 May 23 09:52 archive_status
thom@swift /tmp $ ls -l standby1/pg_xlog/
total 49156
-rw------- 1 thom users 16777216 May 23 09:49 000000010000000000000002
-rw------- 1 thom users 16777216 May 23 09:50 000000010000000000000003
-rw------- 1 thom users 16777216 May 23 09:52 000000020000000000000003
-rw------- 1 thom users       41 May 23 09:50 00000002.history
drwx------ 2 thom users       80 May 23 09:50 archive_status
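
The 00000002.history file won't change once written, so one quick sanity
check that pg_rewind really did copy it over from Standby 1:

$ cmp primary/pg_xlog/00000002.history standby1/pg_xlog/00000002.history \
    && echo identical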

5) Changed recovery.done in primary to point its primary_conninfo port
to 5531 (that of Standby 1).
6) Renamed it to recovery.conf.
7) Changed postgresql.conf to set the port back to its original one
(as pg_rewind has caused it to match that of Standby 1)
8) Start Primary
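
Spelled out, steps 5-8 come down to this (a sketch; the exact
recovery.conf contents are assumed):

$ # /tmp/primary/recovery.conf (the renamed recovery.done):
$ #   standby_mode = 'on'
$ #   primary_conninfo = 'host=localhost port=5531'
$ # /tmp/primary/postgresql.conf: set port = 5530 again
$ #   (pg_rewind copied Standby 1's file, which said 5531)
$ pg_ctl -D /tmp/primary start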

Latest log in primary reads:

LOG:  database system was interrupted; last known up at 2013-05-23 09:50:34 EDT
LOG:  entering standby mode
LOG:  invalid xl_info in checkpoint record
FATAL:  could not locate required checkpoint record
HINT:  If you are not restoring from a backup, try removing the file
"/tmp/primary/backup_label".
LOG:  startup process (PID 31503) exited with exit code 1
LOG:  aborting startup due to startup process failure

9) Okay, so I'll delete that label and try again.
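Concretely:

$ rm /tmp/primary/backup_label
$ pg_ctl -D /tmp/primary start

Now all is well: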

LOG:  database system was interrupted; last known up at 2013-05-23 09:50:34 EDT
LOG:  entering standby mode
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 0/3022828
LOG:  record with zero length at 0/3041A60
LOG:  consistent recovery state reached at 0/3041A60
LOG:  database system is ready to accept read only connections
LOG:  started streaming WAL from primary at 0/3000000 on timeline 2

10) Connected to Standby 1 and inserted more rows into our original table.
11) Connected to Primary, and those rows are appearing there too.
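
That is, roughly (using the same table name as in the sketch above):

$ psql -p 5531 -d postgres -c "INSERT INTO t SELECT generate_series(1001,2000);"
$ psql -p 5530 -d postgres -c "SELECT count(*) FROM t;"  # matches on both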

--
Thom


