Re: PANIC during crash recovery of a recently promoted standby - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: PANIC during crash recovery of a recently promoted standby
Date
Msg-id 20180622.143402.131885418.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: PANIC during crash recovery of a recently promoted standby  (Michael Paquier <michael@paquier.xyz>)
Responses Re: PANIC during crash recovery of a recently promoted standby
List pgsql-hackers
Hello, sorry for the absence. I have looked at the second patch.

At Fri, 22 Jun 2018 13:45:21 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20180622044521.GC5215@paquier.xyz>
> On Fri, Jun 22, 2018 at 10:08:24AM +0530, Pavan Deolasee wrote:
> > On Fri, Jun 22, 2018 at 9:28 AM, Michael Paquier <michael@paquier.xyz>
> > wrote:
> >> So an extra pair of eyes from another committer would be
> >> welcome.  I am letting that cool down for a couple of days now.
> > 
> > I am not a committer, so don't know if my pair of eyes count, but FWIW the
> > patch looks good to me except couple of minor points.
> 
> Thanks for grabbing some time, Pavan.  Any help is welcome!

In the previous mail:
> I have spotted two
> bug where I think the problem is not fixed: when replaying a WAL record
> XLOG_PARAMETER_CHANGE, minRecoveryPoint and minRecoveryPointTLI would
> still get updated from the control file values which can still lead to
> failures as CheckRecoveryConsistency could still happily trigger a
> PANIC, so I think that we had better maintain those values consistent as

The fixes to StartupXLOG, CheckRecoveryConsistency, ReadRecord and
xlog_redo look fine (functionally; see the remark below).

> long as crash recovery runs.  And XLogNeedsFlush() also has a similar
> problem.

Here, on the other hand, this patch turns off
updateMinRecoveryPoint if minRecoveryPoint is invalid when
RecoveryInProgress() == true. However, RecoveryInProgress() == true
means archive recovery has already started, and
minRecoveryPoint *should* be updated under that
condition. Actually, minRecoveryPoint is updated just below. If
this is really the right thing to do, I think some explanation of
the reason is required here.

In xlog_redo there is still a "minRecoveryPoint != 0", which ought
to be written as "!XLogRecPtrIsInvalid(minRecoveryPoint)". (It
seems to be the only one. The double negation is a bit awkward, but
there are many instances of this kind of coding.)
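
To illustrate the style point, here is a minimal stand-alone sketch. It
is not taken from the patch: the typedef and macros are simplified
stand-ins for what PostgreSQL declares in access/xlogdefs.h, and the two
helper function names are hypothetical, used only for the comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's access/xlogdefs.h (assumption). */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)
#define XLogRecPtrIsInvalid(r) ((r) == InvalidXLogRecPtr)

/* Hypothetical helper: the raw comparison still present in xlog_redo. */
static bool
min_recovery_point_is_set_raw(XLogRecPtr minRecoveryPoint)
{
    return minRecoveryPoint != 0;
}

/* Hypothetical helper: the same test spelled with the macro. */
static bool
min_recovery_point_is_set_macro(XLogRecPtr minRecoveryPoint)
{
    return !XLogRecPtrIsInvalid(minRecoveryPoint);
}
```

Both helpers return the same result for any input, so the change would
be purely cosmetic; the macro form only makes it explicit that zero is
being treated as "invalid" rather than as an ordinary WAL position.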

# I'll go all-out next week.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


