Re: BUG #15346: Replica fails to start after the crash - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: BUG #15346: Replica fails to start after the crash
Msg-id 20180831.094846.52751456.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: BUG #15346: Replica fails to start after the crash  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
At Thu, 30 Aug 2018 11:57:05 -0700, Michael Paquier <michael@paquier.xyz> wrote in
<20180830185705.GF15446@paquier.xyz>
> On Thu, Aug 30, 2018 at 08:31:36PM +0200, Alexander Kukushkin wrote:
> > 2018-08-30 19:34 GMT+02:00 Michael Paquier <michael@paquier.xyz>:
> >> I have been struggling for a couple of hours to get a deterministic test
> >> case out of my pocket, and I did not get one as you would need to get
> >> the bgwriter to flush a page before crash recovery finishes, we could do
> > 
> > In my case the active standby server has crashed, it wasn't in the
> > crash recovery mode.
> 
> That's what I meant, a standby crashed and then restarted, doing crash
> recovery before moving on with archive recovery once it was done with
> all its local WAL.
> 
> > Minimum recovery ending location is AB3/4A1B3118, but at the same time
> > I managed to find pages from 0000000500000AB300000053 on disk (at
> > least in the index files). That could only mean that bgwriter was
> > flushing dirty pages, but pg_control wasn't properly updated and it
> > happened not during recovery after hardware crash, but while the
> > postgres was running before the hardware crash.
> 
> Exactly, that would explain the incorrect reference.
> 
> > The only possible way to recover such standby - cut off all possible
> > connections and let it replay all WAL files it managed to write to
> > disk before the first crash.
> 
> Yeah...  I am going to apply the patch after another lookup, that will
> fix the problem moving forward.  Thanks for checking by the way.

Please wait a bit. I have a concern about this.

Regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
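
The quoted report compares a minimum recovery ending location (AB3/4A1B3118) with a WAL segment file name (0000000500000AB300000053). The arithmetic behind that comparison can be sketched roughly as follows. This is only an illustration, not code from PostgreSQL: the helper names are made up for this note, and it assumes the default 16 MB WAL segment size, which the thread does not state.

```python
# Illustrative sketch only: helper names are hypothetical, and the 16 MB
# segment size is an assumption (the default), not something the thread states.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text):
    """Convert an LSN written as 'HI/LO' (hex) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(lsn):
    """Format a 64-bit integer LSN back into 'HI/LO' hex notation."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def segment_start_lsn(filename, segment_size=WAL_SEGMENT_SIZE):
    """Split a 24-hex-digit WAL file name into (timeline, first LSN it covers)."""
    timeline = int(filename[0:8], 16)
    log = int(filename[8:16], 16)
    seg = int(filename[16:24], 16)
    return timeline, (log << 32) + seg * segment_size


min_recovery_point = parse_lsn("AB3/4A1B3118")
timeline, seg_start = segment_start_lsn("0000000500000AB300000053")

print(timeline, format_lsn(seg_start))  # 5 AB3/53000000
print(seg_start > min_recovery_point)   # True
```

Under that assumption, the segment begins at AB3/53000000, which is past the recorded minimum recovery point. Pages carrying changes from that segment being on disk is therefore consistent with the report that pg_control was not updated before the pages were flushed.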


