[BUGS] pg_rewind fails after failover, 'invalid record length' - Mailing list pgsql-bugs

From: Stuart Bishop
Subject: [BUGS] pg_rewind fails after failover, 'invalid record length'
Msg-id: CADmi=6O_RApqN=4QWA72gAqt96p0s1+3g+=pN1xgEhVVPzt6qg@mail.gmail.com
Responses: Re: [BUGS] pg_rewind fails after failover, 'invalid record length'  (Michael Paquier <michael.paquier@gmail.com>)
List: pgsql-bugs

I have a test case with three PostgreSQL 9.5.5 servers: one master and two
hot standbys using standard streaming replication from the master.
wal_log_hints is not enabled, but all of the clusters were initialized
with data checksums.

The system is idle. I tear down the master, leaving the two standbys
orphaned at the same point in timeline 1.

I promote one of the standbys to master, switching it to timeline 2. I
shut down the other standby and attempt to run pg_rewind on it. It fails:

$ /usr/lib/postgresql/9.5/bin/pg_rewind \
    --target-pgdata=/var/lib/postgresql/9.5/main \
    --source-server='dbname=postgres host=10.0.4.212 port=5432 user=_juju_repl'
servers diverged at WAL position 0/5000AE0 on timeline 1

could not find previous WAL record at 0/5000AE0: invalid record length at 0/5000AE0
Failure, exiting
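
As a rough aid to reading the error, the WAL position it reports can be mapped to a segment file name and a byte offset. The following is a minimal sketch, assuming the default 16 MB WAL segment size; wal_location() is a hypothetical helper written for illustration and is not part of PostgreSQL.

```python
from typing import Tuple

# Assumption: default WAL segment size of 16 MB (16777216 bytes).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
# Number of segments covered by one 4 GB "xlogid" (256 at 16 MB per segment).
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE


def wal_location(lsn: str, timeline: int) -> Tuple[str, int]:
    """Return (segment file name, byte offset in that segment) for an LSN such as '0/5000AE0'."""
    high, low = (int(part, 16) for part in lsn.split("/"))
    position = (high << 32) | low          # 64-bit byte position in the WAL stream
    segno = position // WAL_SEGMENT_SIZE   # index of the segment holding that byte
    name = "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )
    return name, position % WAL_SEGMENT_SIZE


if __name__ == "__main__":
    # The divergence point from the error message, on the old and new timelines.
    for tli in (1, 2):
        print(tli, *wal_location("0/5000AE0", tli))
```

Under that assumption, 0/5000AE0 falls 0xAE0 bytes into segment 5, so the record pg_rewind is looking for would sit in 000000010000000000000005 on timeline 1.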

This is what the pg_xlog on the new master looked like at that point:

postgres@juju-4ead0d-11:~/9.5/main/pg_xlog$ ls -al
total 81993
drwx------  3 postgres postgres        9 Feb 15 08:55 .
drwx------ 19 postgres postgres       25 Feb 15 08:55 ..
-rw-------  1 postgres postgres 16777216 Feb 15 07:52 000000010000000000000002
-rw-------  1 postgres postgres 16777216 Feb 15 07:52 000000010000000000000003
-rw-------  1 postgres postgres 16777216 Feb 15 07:52 000000010000000000000004
-rw-------  1 postgres postgres 16777216 Feb 15 08:52 000000010000000000000005.partial
-rw-------  1 postgres postgres       41 Feb 15 08:55 00000002.history
-rw-------  1 postgres postgres 16777216 Feb 15 09:15 000000020000000000000005
drwx------  2 postgres postgres        6 Feb 15 08:55 archive_status
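
The 00000002.history file in this listing records where timeline 2 branched off timeline 1. The sketch below shows one way such a file could be read, assuming the usual layout of one line per switch (parent timeline, switch position, reason, separated by whitespace, with '#' starting a comment). parse_history() and TimelineSwitch are hypothetical names, and SAMPLE is an illustrative line, not the contents of the reporter's file.

```python
from typing import List, NamedTuple


class TimelineSwitch(NamedTuple):
    parent_tli: int   # timeline that was left
    switch_lsn: str   # WAL position where the new timeline starts
    reason: str       # free-text reason recorded at promotion


def parse_history(text: str) -> List[TimelineSwitch]:
    """Parse the text of a timeline history file into a list of switches."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 2)  # the reason may itself contain spaces
        reason = fields[2] if len(fields) > 2 else ""
        entries.append(TimelineSwitch(int(fields[0]), fields[1], reason))
    return entries


# Illustrative only: a line of the shape a promotion at 0/5000AE0 might produce.
SAMPLE = "1\t0/5000AE0\tno recovery target specified\n"

if __name__ == "__main__":
    for entry in parse_history(SAMPLE):
        print(entry)
```

The sample line is 41 bytes long, the same size as the history file in the listing, but that is only suggestive; the actual file contents were not included in the report.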

Reconfiguring the standby to replicate from the new master and
restarting it works fine. The standby happily replicates and switches
to the new timeline. I can shut this standby down and run pg_rewind
again and it works fine.


-- 
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/

