Error: record with incorrect prev-link ---/--- at ---/---, whenarchiving is 'on' - Mailing list pgsql-general

From Abdullah Al Maruf
Subject Error: record with incorrect prev-link ---/--- at ---/---, whenarchiving is 'on'
Msg-id CANzStTChLXsrX5Y1KBJSfiFrJcb_f=9mx-QPj36tj7XJm2Pc_A@mail.gmail.com
Responses Re: Error: record with incorrect prev-link ---/--- at ---/---, whenarchiving is 'on'  (Abdullah Al Maruf <maruf.2hin@gmail.com>)
List pgsql-general
Hello, 
I am trying to build an automated system in Docker/Kubernetes where a container/pod automatically schedules itself as a master or a standby.

In short, I have one master node (the 0th node) and three standby nodes (the 1st, 2nd, and 3rd nodes). When I promote the 3rd node to master (via a trigger file) and restart the 0th node as a replica, there is no problem.

But when both nodes are offline and our leader selection chooses the 0th node as master and tries to reattach the 3rd node as a replica, it throws an error similar to:

```
LOG:  invalid record length at 0/B000098: wanted 24, got 0
LOG:  started streaming WAL from primary at 0/B000000 on timeline 2
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
FATAL:  terminating walreceiver process due to administrator command
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
LOG:  record with incorrect prev-link 10000/21B0000 at 0/B000098
```

If I disable archive_mode, I never see this error with the same script. It only appears when archiving is on, and not every time, but most of the time.
The error message repeats every 5 seconds.

------------------------------------------------------------------------------------------------------------

Scenario in detail:

I have two folders for scripts. 

├── primary
│   ├── postgresql.conf
│   ├── restore.sh
│   ├── run.sh
│   └── start.sh
└── replica
    ├── recovery.conf
    └── run.sh

I have a system that chooses the leader. If the current pod is the leader, it runs `primary/run.sh`; if it is a replica, it runs `replica/run.sh`. The problem is not related to restore.sh at this point, so I am skipping restore.sh.
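
To make the dispatch step concrete, here is a minimal sketch of how the role could be mapped to one of the two scripts. It is only an illustration: the function name `script_for_role`, the role value `leader`, and the `/scripts` root are assumptions made for this example, not the actual leader-selection code.

```shell
#!/bin/sh
# Hypothetical dispatcher: map a role to the run.sh script for that role.
# The role names and the script root are assumptions for illustration only.

script_for_role() {
    # $1 = role reported by the leader selection ("leader" or anything else)
    # $2 = root directory that holds the primary/ and replica/ folders
    case "$1" in
        leader) printf '%s/primary/run.sh\n' "$2" ;;
        *)      printf '%s/replica/run.sh\n' "$2" ;;
    esac
}

# Example use (not executed here):
#   exec "$(script_for_role "$ROLE" /scripts)"
```

Any role other than `leader` falls through to the replica script, so an unknown or empty role never starts a second primary.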






