On Fri, Aug 26, 2022 at 11:59 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:
> > I agree. Testing StandbyMode here seems bogus. I thought initially
> > that the test should perhaps be for InArchiveRecovery rather than
> > ArchiveRecoveryRequested, but I see that the code which switches to a
> > new timeline cares about ArchiveRecoveryRequested, so I think that is
> > the correct thing to test here as well.
>
> > Concretely, I propose the following patch.
>
> This patch looks similar to the change suggested in
> https://www.postgresql.org/message-id/FB0DEA0B-E14E-43A0-811F-C1AE93D00FF3%40amazon.com
> to deal with panics after promoting a standby.
>
> The difference is the patch tests !ArchiveRecoveryRequested instead
> of !StandbyModeRequested as proposed in the mentioned thread.

OK, I didn't realize this bug had been independently discovered and it
looks like I was even involved in the previous discussion. I just
totally forgot about it.

I think, however, that your fix is wrong and this one is right.
Fundamentally, the server is either in normal running, or crash
recovery, or archive recovery. Standby mode is just an optional
behavior of archive recovery, controlling whether or not we keep
retrying once the end of WAL is reached. But there's no reason why the
server should put the contrecord at a different location when recovery
ends depending on that retry behavior. The only thing that matters is
whether we're going to switch timelines.
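
To make the distinction concrete, here is a toy model in C. It is only a
sketch and is not taken from the PostgreSQL source: every type and function
name is hypothetical, and the two booleans merely stand in for
ArchiveRecoveryRequested and StandbyModeRequested. It assumes, per the
discussion above, that the timeline-switch decision follows
ArchiveRecoveryRequested and that standby mode only affects retrying at the
end of WAL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the recovery settings; not PostgreSQL code. */
typedef struct
{
    bool archive_recovery_requested; /* stands in for ArchiveRecoveryRequested */
    bool standby_mode_requested;     /* stands in for StandbyModeRequested */
} ModelRecoveryConfig;

/* Standby mode is an option of archive recovery, so it implies it. */
static bool
model_config_is_valid(ModelRecoveryConfig cfg)
{
    return cfg.archive_recovery_requested || !cfg.standby_mode_requested;
}

/* Standby mode only controls whether we keep retrying at the end of WAL. */
static bool
model_retries_at_end_of_wal(ModelRecoveryConfig cfg)
{
    return cfg.archive_recovery_requested && cfg.standby_mode_requested;
}

/* Assumption from the thread: the timeline switch follows this flag. */
static bool
model_switches_timeline(ModelRecoveryConfig cfg)
{
    return cfg.archive_recovery_requested;
}

/* The test used by the patch discussed here. */
static bool
model_test_archive(ModelRecoveryConfig cfg)
{
    return !cfg.archive_recovery_requested;
}

/* The test proposed in the earlier thread. */
static bool
model_test_standby(ModelRecoveryConfig cfg)
{
    return !cfg.standby_mode_requested;
}

/* Walk the three valid configurations and compare the two tests. */
void
model_self_check(void)
{
    ModelRecoveryConfig crash = {false, false};   /* crash recovery */
    ModelRecoveryConfig archive = {true, false};  /* archive recovery only */
    ModelRecoveryConfig standby = {true, true};   /* archive recovery + standby */

    assert(model_config_is_valid(crash));
    assert(model_config_is_valid(archive));
    assert(model_config_is_valid(standby));

    /* The two tests agree for crash recovery and for standby mode ... */
    assert(model_test_archive(crash) == model_test_standby(crash));
    assert(model_test_archive(standby) == model_test_standby(standby));

    /* ... and differ only for archive recovery without standby mode. */
    assert(model_test_archive(archive) != model_test_standby(archive));

    /* Only the archive-based test tracks the timeline switch everywhere. */
    assert(model_test_archive(archive) == !model_switches_timeline(archive));
    assert(!model_retries_at_end_of_wal(archive));
}
```

In this model the two tests give different answers only for archive
recovery without standby mode, which is the one configuration where the
retry behavior and the timeline switch come apart.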

--
Robert Haas
EDB: http://www.enterprisedb.com