Failover Testing Failures: invalid resource manager ID in primary checkpoint record - Mailing list pgsql-admin

From Don Seiler
Subject Failover Testing Failures: invalid resource manager ID in primary checkpoint record
Date
Msg-id CAHJZqBAEx_rxuApJaBX7g9i9yrz8vVjvZAPG+P9=XgYuzrrgAA@mail.gmail.com
Responses Re: Failover Testing Failures: invalid resource manager ID in primary checkpoint record
List pgsql-admin
PostgreSQL 12.13 (PGDG packages) in a streaming replication configuration. pgBackRest 2.43 is used for WAL archiving and DB backups to cloud storage.

I'm testing and documenting a DR exercise process where I:
  1. Cleanly shut down PG on the primary
  2. Promote the PG DR replica
  3. Place the standby.signal file on the old primary and start it up (this presumes no other configuration needs changing; primary_conninfo etc. were already set).
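
The three steps above could be sketched roughly as the following dry-run plan. This is only an illustration: it prints commands and executes nothing. The use of pg_ctl directly (rather than a service manager), the "fast" shutdown mode, and the data directory paths are all assumptions not stated in this post, and failover_plan is a hypothetical helper name.

```python
from __future__ import annotations

import shlex
from pathlib import PurePosixPath


def failover_plan(primary_pgdata: str, replica_pgdata: str) -> list[tuple[str, str]]:
    """Return (host, command) pairs for the DR exercise. Nothing is executed."""
    # standby.signal is an empty file in the data directory (PostgreSQL 12+).
    signal_file = PurePosixPath(primary_pgdata) / "standby.signal"
    return [
        # Step 1: clean shutdown of the primary (shutdown mode is an assumption).
        ("old primary", f"pg_ctl stop -D {shlex.quote(primary_pgdata)} -m fast"),
        # Step 2: promote the DR replica.
        ("DR replica", f"pg_ctl promote -D {shlex.quote(replica_pgdata)}"),
        # Step 3: mark the old primary as a standby, then start it.
        ("old primary", f"touch {shlex.quote(str(signal_file))}"),
        ("old primary", f"pg_ctl start -D {shlex.quote(primary_pgdata)}"),
    ]


if __name__ == "__main__":
    # Hypothetical data directory paths, for illustration only.
    for host, cmd in failover_plan("/var/lib/pgsql/12/data", "/var/lib/pgsql/12/data"):
        print(f"[{host}] {cmd}")
```
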
My hope is that I could just start the old primary as the new replica, given that it was cleanly shut down prior to promoting the replica. However, when I try to start up that new replica, I'm met with:

LOG:  restored log file "00000002000000B70000005A" from archive
LOG:  invalid resource manager ID in primary checkpoint record
PANIC:  could not locate a valid checkpoint record
LOG:  startup process (PID 17660) was terminated by signal 6: Aborted
LOG:  aborting startup due to startup process failure
LOG:  database system is shut down


It doesn't appear that any WAL files are missing, as it finds all the files that it asks for. Am I missing a piece here?
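
As a reading aid for the log above, the restored segment name can be decoded into its timeline ID and starting LSN. The sketch below assumes the default 16 MB WAL segment size (the post does not say whether it was changed), and parse_wal_name is a hypothetical helper name. It only interprets the file name and does not diagnose the failure.

```python
import string
from typing import NamedTuple

# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


class WalSegment(NamedTuple):
    timeline: int  # first 8 hex digits of the file name
    log_id: int    # middle 8 hex digits (high 32 bits of the LSN)
    segment: int   # last 8 hex digits (segment number within log_id)

    def start_lsn(self) -> str:
        """Starting LSN of the segment in PostgreSQL's X/X notation."""
        offset = self.segment * WAL_SEGMENT_SIZE
        return f"{self.log_id:X}/{offset:X}"


def parse_wal_name(name: str) -> WalSegment:
    """Split a 24-character WAL segment file name into its three fields."""
    if len(name) != 24 or any(c not in string.hexdigits for c in name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return WalSegment(
        timeline=int(name[0:8], 16),
        log_id=int(name[8:16], 16),
        segment=int(name[16:24], 16),
    )


if __name__ == "__main__":
    seg = parse_wal_name("00000002000000B70000005A")
    print(seg.timeline, seg.start_lsn())
```

For the file in the log, this gives timeline 2 and a starting LSN of B7/5A000000. If the old primary was still on timeline 1 when it was shut down (an assumption), a timeline 2 segment would have been written by the promoted replica.
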

My hope is to avoid having to do a restore to rebuild the new replica.

An aside for those who may be asking: most of these databases do not have data checksums enabled, so pg_rewind isn't in the picture. Although I'm reading now that we could enable the wal_log_hints parameter as an alternative. I'm leery of the overhead, but if it's the same overhead that would be incurred with data checksums, then I guess there would be nothing lost when we eventually enable them.

--
Don Seiler
www.seiler.us
