Re: pg_rewind success even though getting error 'record with incorrect prev-link' - Mailing list pgsql-general

From Ron
Subject Re: pg_rewind success even though getting error 'record with incorrect prev-link'
Msg-id f8844603-e772-2367-35c3-a1099a7dab7e@gmail.com
In response to Re: pg_rewind success even though getting error 'record with incorrect prev-link'  (Abdullah Al Maruf <maruf.2hin@gmail.com>)
Responses Re: pg_rewind success even though getting error 'record with incorrect prev-link'  (Abdullah Al Maruf <maruf.2hin@gmail.com>)
List pgsql-general
On 1/29/19 9:57 PM, Abdullah Al Maruf wrote:
Hi Michael

> This is pointing out to the end of WAL for the current timeline.  You
> may face it after reading a WAL segment in an area which has been used
> in the past for a recycled segment.

Are you talking about the error `LOG:  invalid record length at 0/B000098: wanted 24, got 0`,
or
`LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098`?

Actually, the first error is not causing any problem. The node starts streaming from the primary successfully.
But when the second error appears, it repeats every 5 seconds, and the node does not stream from the master.

pg_rewind still resolves the timeline conflict, but it does not fix this second error.

Is there any workaround?
----------------
My scenario, in short: I have one master node (0th node) and three standby nodes (1st, 2nd and 3rd nodes). When I make the 3rd node the master (by trigger file) and restart the 0th node as a replica, it shows no problem.

But when both nodes are offline and our leader selection chooses the 0th node as master and tries to reattach the 3rd node as a replica, it throws an error similar to:

```
LOG: invalid record length at 0/B000098: wanted 24, got 0
LOG: started streaming WAL from primary at 0/B000000 on timeline 2
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
FATAL: terminating walreceiver process due to administrator command
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
```


The only error I see is the FATAL line, where you apparently killed the walreceiver process manually.  The LOG messages aren't actually errors.
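
To make the severity distinction concrete, here is a minimal Python sketch that groups the lines of a log excerpt by severity and counts how often each message repeats. It assumes the lines start directly with the severity keyword (no `log_line_prefix`), as in the excerpt quoted above; `summarize` and `error_level` are hypothetical helper names, and the sample is a shortened copy of that excerpt.

```python
import re
from collections import Counter

# Severity keywords that may start a server log line (assumes no log_line_prefix).
LINE_RE = re.compile(r"^(LOG|WARNING|ERROR|FATAL|PANIC):\s+(.*)$")

# Severities that indicate an actual error; LOG is informational.
ERROR_LEVELS = {"ERROR", "FATAL", "PANIC"}

# Shortened copy of the excerpt from the quoted message.
SAMPLE = """\
LOG: invalid record length at 0/B000098: wanted 24, got 0
LOG: started streaming WAL from primary at 0/B000000 on timeline 2
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
FATAL: terminating walreceiver process due to administrator command
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG: record with incorrect prev-link 10000/21B0000 at 0/B000098
"""


def summarize(log_text):
    """Count occurrences of each (severity, message) pair in the log text."""
    counts = Counter()
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:  # lines that do not start with a known severity are skipped
            counts[(match.group(1), match.group(2))] += 1
    return counts


def error_level(counts):
    """Return (severity, message, count) for entries at ERROR level or above."""
    return [(sev, msg, n) for (sev, msg), n in counts.items() if sev in ERROR_LEVELS]


if __name__ == "__main__":
    counts = summarize(SAMPLE)
    for (sev, msg), n in counts.most_common():
        print(f"{n:3d}  {sev:7s} {msg}")
    print("error-level entries:", error_level(counts))
```

On the shortened sample, the only entry at ERROR level or above is the FATAL walreceiver termination; the repeated prev-link message is counted under LOG.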

--
Angular momentum makes the world go 'round.
