Re: PANIC during crash recovery of a recently promoted standby - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: PANIC during crash recovery of a recently promoted standby
Date
Msg-id 20180514033836.GD1528@paquier.xyz
Whole thread Raw
In response to Re: PANIC during crash recovery of a recently promoted standby  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Sat, May 12, 2018 at 07:41:33AM +0900, Michael Paquier wrote:
> pg_ctl promote would wait for the control file to be updated, so you
> cannot use it in the TAP tests to trigger the promotion.  Still I think
> I found one after waking up?  Please note I have not tested it:
> - Use a custom trigger file and then trigger promotion with a signal.
> - Use a sleep command in recovery_end_command to increase the window, as
> what matters is sleeping after CreateEndOfRecoveryRecord updates the
> control file.
> - Issue a restart point on the standby, which will update the control
> file.
> - Stop the standby with immediate mode.
> - Start the standby, it should see unreferenced pages.

I have been looking at that this morning, and actually I have been able
to create a test case where minRecoveryPoint goes backwards using
Pavan's patch.  Using a sleep in recovery_end_command has proved to not
be enough so I had to patch the backend with a couple of sleeps and some
processing, mainly:
- One sleep in CreateRestartPoint to make a restart point wait before
updating the control file, which I set at 5s.
- One sleep just after calling CreateEndOfRecoveryRecord, which has been
set at 20s.
- Trigger an asynchronous checkpoint using IPC::Run::start.
- Trigger a promotion with a trigger file and SIGUSR2 to the
postmaster.

The rest of the test is crafted with some magic wait number and adds
some logs to ease the monitoring of the issue.  In order to get a crash,
I think that you would need to crash the backend after creating the last
WAL records which generate the invalid page references, and also slow
down the last restart point which makes minRecoveryPoint go backwards,
which is err...  Complicated to make deterministic.  Still if you apply
the patch attached you would see log entries on the standby as follows:
2018-05-14 12:24:15.065 JST [17352] LOG:  selected new timeline ID: 2
2018-05-14 12:24:15.074 JST [17352] LOG:  archive recovery complete
2018-05-14 12:24:15.074 JST [17352] WARNING:  CreateEndOfRecoveryRecord: minRecoveryPoint is 0/32C4258 before update
2018-05-14 12:24:15.074 JST [17352] WARNING:  CreateEndOfRecoveryRecord: minRecoveryPoint is 0/0 after update
2018-05-14 12:24:17.067 JST [17353] WARNING:  checkPointCopy.redo =0/30B3D70, lastCheckPoint.redo = 0/31BC208
2018-05-14 12:24:17.067 JST [17353] WARNING:  CreateRestartPoint: minRecoveryPoint is 0/0 before update,
lastCheckPointEndPtris 0/31BC2B0
 
2018-05-14 12:24:17.067 JST [17353] WARNING:  CreateRestartPoint: minRecoveryPoint is 0/31BC2B0 after update

So minRecoveryRecord can go definitely go backwards here, which is not
good.  Attached is a patch which includes Pavan's fix on top of the test
case I crafted with what is in upthread.  You just need to apply it on
HEAD.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: \d doesn't show partitioned table foreign keys
Next
From: Thomas Munro
Date:
Subject: Re: Compiler warnings with --enable-dtrace