invalid record length at XX: wanted 24, got - Mailing list pgsql-admin

From Mariel Cherkassky
Subject invalid record length at XX: wanted 24, got
Date
Msg-id CA+t6e1nJMKsA2_sUeUqRSNhds4utanCpKSqcH6C+U=M44CMF-A@mail.gmail.com
Responses Re: invalid record length at XX: wanted 24, got  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-admin
Hey,
I have 2 db nodes (PostgreSQL 9.6) configured with streaming replication (+ repmgr). Yesterday my secondary suddenly stopped syncing, and I saw the following error in its log:
invalid record length at X/YYYYY: wanted 24, got

Since then, the secondary keeps restoring the same WAL file and seems to be stuck on it.
I guessed that the WAL file was corrupted or missing some data, so I tried copying it over from the primary, but that didn't help. I also tried to start the secondary in read-write mode, but it failed with the following error:
LOG:  invalid primary checkpoint record
LOG:  invalid secondary checkpoint record
PANIC:  could not locate a valid checkpoint record
LOG:  startup process (PID 17096) was terminated by signal 6: Aborted
LOG:  aborting startup due to startup process failure
LOG:  database system is shut down

My next idea is to use pg_resetxlog to get the secondary to start, and then use pg_rewind to sync it with the master again. The master is working perfectly and there aren't any issues on it. Right now, I'm not interested in taking a base backup and recreating the secondary from scratch.
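
A minimal sketch of that plan, written as a small Python helper that only builds and prints the two command lines instead of running them. The data directory and connection string are hypothetical placeholders, not values from this setup. It assumes the 9.6 command-line tools are on the PATH, that the secondary is stopped, and that the cluster meets pg_rewind's requirements (wal_log_hints = on or data checksums enabled). It defaults to the no-change modes (-n for pg_resetxlog, --dry-run for pg_rewind), and it does not establish whether pg_rewind can actually resync a data directory after pg_resetxlog has been run on it.

```python
import shlex

# Hypothetical values for illustration only; adjust to the real environment.
STANDBY_PGDATA = "/var/lib/pgsql/9.6/data"
PRIMARY_CONNINFO = "host=primary.example port=5432 user=postgres dbname=postgres"


def build_recovery_commands(pgdata, primary_conninfo, dry_run=True):
    """Return the two planned commands as argument lists, in execution order.

    With dry_run=True, pg_resetxlog gets -n (print values, change nothing)
    and pg_rewind gets --dry-run (do everything except modify the target).
    """
    reset = ["pg_resetxlog"]
    rewind = [
        "pg_rewind",
        "--target-pgdata=" + pgdata,
        "--source-server=" + primary_conninfo,
    ]
    if dry_run:
        reset.append("-n")
        rewind.append("--dry-run")
    # pg_resetxlog takes the data directory as its last argument.
    reset.append(pgdata)
    return [reset, rewind]


if __name__ == "__main__":
    # Print the commands for review; nothing is executed here.
    for cmd in build_recovery_commands(STANDBY_PGDATA, PRIMARY_CONNINFO):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the commands first makes it possible to review them, and to take a file-level copy of the secondary's data directory, before running anything that rewrites pg_control.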

I would be happy to hear any other ideas about why this might have happened and how I can handle it.

Thanks
