pg_rewind rewinded too much... - Mailing list pgsql-general

From marin@kset.org
Subject pg_rewind rewinded too much...
Date
Msg-id 51c64a8dd6db51fe8ac16afd66341f23@kset.org
Whole thread Raw
Responses Re: pg_rewind rewinded too much...  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-general
Hi,

I am doing some testing with pg_rewind and I encountered a very strange
situation. So, let me describe the scenario:

1. Create a master (server1) node (max_wal_size = 2GB,
checkpoint_timeout = 10min)
2. Create a test database on it (server1)
3. Create a test table (server1)
4. Insert a couple of million rows ( >> 5GB) but cancel the query at
some point leaving a few gigs of useless WAL files (server1)
5. Insert a couple of thousand rows ( < 1GB) (server1)
6. Build a slave (server2) and connect to the master (server1)
7. Promote the slave (server2 now master)
8. Insert one row on the old master (server1)
9. Shutdown the old master (server1)
10. Issue a pg_rewind on the old master (server1)
11. Configure recovery.conf and start the new slave (server1)

All of this under 10 minutes :)

The timeline switch is at:
1       1/B2A64F80      no recovery target specified

The pg_controldata on the slave (server1) gives:
Latest checkpoint location:           1/B3000108
Prior checkpoint location:            1/B3000060
Latest checkpoint's REDO location:    1/B30000D0
Latest checkpoint's REDO WAL file:    0000000200000001000000B3

But when starting the slave i get the following errors in the log file:

2016-11-28 15:38:44 UTC: WAL 00000002.history successfully restored
2016-11-28 15:38:43 UTC [6]: [14-1] user=,db= LOG:  restored log file
"00000002.history" from archive
2016-11-28 15:38:44 UTC: Error copying WAL 00000002000000010000009D!
2016-11-28 15:38:44 UTC: Error copying WAL 00000001000000010000009D!
2016-11-28 15:38:44 UTC [494]: [1-1] user=,db= LOG:  started streaming
WAL from primary at 1/9D000000 on timeline 2
2016-11-28 15:38:44 UTC [494]: [2-1] user=,db= FATAL:  could not receive
data from WAL stream: ERROR:  requested WAL segment
00000002000000010000009D has already been removed

So the problem is that the slave (server1) went back in time, to a
location that is not on the new master (first WAL available on the new
master is 0000000100000001000000AE). The old archive on the slave
(server1) is gone (deleted) so there is no way to fix this. And this is
not a problem per se, because you can use pg_basebackup or rsync instead
of pg_rewind. The real problem is that I can't diagnose this problem
until I start the server process and read the logs. Shouldn't
pg_controldata report the first WAL needed to start the instance?

Regards,
Mladen Marinović



pgsql-general by date:

Previous
From: Achilleas Mantzios
Date:
Subject: Re: pg_dump system catalog
Next
From: rob
Date:
Subject: Re: Invoice Table Design