Thread: standby server is complaining about missing wal file after switchover

standby server is complaining about missing wal file after switchover

From: Dhirendra Singh
Hi,
My PostgreSQL version is 14.2.
I have set up streaming replication between a primary and a standby, using a replication slot and without WAL archiving.
I created the replication slot "slot1" on the primary.
I did a switchover with the following steps:
1. Shut down the primary server.
2. Created the replication slot "slot1" on the standby and promoted it to primary.
3. Ran pg_rewind on the old primary and made it a standby.
4. Started the old primary as the new standby.

I got the following error on the new standby. It is complaining about a missing WAL file on the new primary.

2022-10-13 12:58:48.497 UTC [25] LOG:  fetching timeline history file for timeline 2 from primary server
2022-10-13 12:58:48.508 UTC [25] LOG:  started streaming WAL from primary at 0/B000000 on timeline 1
2022-10-13 12:58:48.508 UTC [25] FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 00000002000000000000000B has already been removed
2022-10-13 12:58:48.509 UTC [21] LOG:  new target timeline is 2
2022-10-13 12:58:48.580 UTC [28] LOG:  started streaming WAL from primary at 0/B000000 on timeline 2
2022-10-13 12:58:48.580 UTC [28] FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 00000002000000000000000B has already been removed
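
For anyone reading the log above, the segment name in the error can be checked against the streaming start position. Below is a minimal sketch in Python, assuming the default 16 MB WAL segment size (the thread does not state it). `decode_wal_segment` is a hypothetical helper name used only for illustration, not a PostgreSQL tool.

```python
# Assumed default segment size; a cluster may use another value (initdb --wal-segsize).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def decode_wal_segment(name, segment_size=WAL_SEGMENT_SIZE):
    """Split a 24-hex-digit WAL file name into (timeline, start LSN)."""
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    timeline = int(name[0:8], 16)   # first 8 hex digits: timeline ID
    log_id = int(name[8:16], 16)    # middle 8 hex digits: high part of the segment number
    seg_id = int(name[16:24], 16)   # last 8 hex digits: segment within that part
    segments_per_log = 0x100000000 // segment_size
    start = (log_id * segments_per_log + seg_id) * segment_size
    # LSNs are written as two hex numbers: high 32 bits / low 32 bits.
    return timeline, f"{start >> 32:X}/{start & 0xFFFFFFFF:X}"


print(decode_wal_segment("00000002000000000000000B"))  # (2, '0/B000000')
```

Under that assumption, the segment named in the error is the timeline 2 file that begins at 0/B000000, the same position the standby asked to start streaming from.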

Why has the new primary removed the WAL file even though I created a replication slot on it before promoting?
max_slot_wal_keep_size is -1 and wal_keep_size is 0.

Thanks,
Dhirendra.

Re: standby server is complaining about missing wal file after switchover

From: Jerry Sievers
Dhirendra Singh <dhirendraks@gmail.com> writes:

> Why has the new primary removed the WAL file even though I created a replication slot on it before
> promoting?
> max_slot_wal_keep_size is -1 and wal_keep_size is 0.

Dunno, but did you call the pg_create_physical_replication_slot()
function with the second argument (immediately_reserve) set to true, so
that the slot reserves the WAL position immediately?

I believe that otherwise the slot does not retain anything until a
client has connected to it.
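
To make the suggestion concrete, here is a small sketch in Python that only builds the SQL text; it does not connect to a server. The helper name `create_physical_slot_sql` is hypothetical. The slot name "slot1" comes from the original post, and the second argument of pg_create_physical_replication_slot() is the immediately_reserve flag.

```python
def create_physical_slot_sql(slot_name, immediately_reserve=True):
    """Build the SQL call that creates a physical replication slot."""
    # Double any single quote so the name stays a valid SQL string literal.
    literal = slot_name.replace("'", "''")
    flag = "true" if immediately_reserve else "false"
    return f"SELECT pg_create_physical_replication_slot('{literal}', {flag});"


# Statement to run on the standby before promoting it.
print(create_physical_slot_sql("slot1"))

# Query to check afterwards that the slot holds a WAL position:
# restart_lsn should not be NULL once the position is reserved.
CHECK_SLOT_SQL = "SELECT slot_name, restart_lsn, wal_status FROM pg_replication_slots;"
print(CHECK_SLOT_SQL)
```

With the flag left at its default of false, the slot exists but restart_lsn stays empty until a streaming client first connects, so nothing is held back in the meantime.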

HTH




Re: standby server is complaining about missing wal file after switchover

From: Dhirendra Singh
Thanks. This worked.
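
For later readers, the switchover with the fix applied could be sketched roughly as below. This is an illustration in Python under stated assumptions, not the exact commands used in this thread: the data directory paths and connection strings are placeholders, `switchover_commands` is a hypothetical helper, and the code only prints the commands rather than running them.

```python
import shlex


def switchover_commands(old_primary_dir, standby_dir,
                        standby_conninfo, new_primary_conninfo, slot="slot1"):
    """Return the switchover commands, in order, as argv lists."""
    create_slot = f"SELECT pg_create_physical_replication_slot('{slot}', true);"
    return [
        # 1. Shut down the primary cleanly.
        ["pg_ctl", "stop", "-D", old_primary_dir, "-m", "fast"],
        # 2. Create the slot on the standby with WAL reserved immediately, then promote.
        ["psql", standby_conninfo, "-c", create_slot],
        ["pg_ctl", "promote", "-D", standby_dir],
        # 3. Rewind the old primary against the new primary.
        ["pg_rewind", f"--target-pgdata={old_primary_dir}",
         f"--source-server={new_primary_conninfo}"],
        # 4. Mark the old primary as a standby and start it.
        ["touch", f"{old_primary_dir}/standby.signal"],
        ["pg_ctl", "start", "-D", old_primary_dir],
    ]


# Placeholder paths and connection strings, for display only.
for argv in switchover_commands("/data/old_primary", "/data/standby",
                                "host=standby dbname=postgres",
                                "host=standby dbname=postgres"):
    print(shlex.join(argv))
```

The old primary also needs primary_conninfo and primary_slot_name set before it is started as a standby; that configuration is not shown here.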

On Fri, Oct 14, 2022 at 5:12 AM Jerry Sievers <gsievers19@comcast.net> wrote:

> Dunno, but did you call the pg_create_physical_replication_slot()
> function with the second argument (immediately_reserve) set to true, so
> that the slot reserves the WAL position immediately?
>
> I believe that otherwise the slot does not retain anything until a
> client has connected to it.
