Re: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby - Mailing list pgsql-bugs

From Amit Kapila
Subject Re: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby
Msg-id 003b01cd9228$0dee98a0$29cbc9e0$@kapila@huawei.com
In response to Re: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-bugs
On Thursday, September 13, 2012 10:32 PM Fujii Masao wrote:
On Thu, Sep 13, 2012 at 9:21 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 12.09.2012 22:03, Fujii Masao wrote:
>>
>> On Wed, Sep 12, 2012 at 8:47 PM,<amit.kapila@huawei.com>  wrote:
>>>
>>> The following bug has been logged on the website:
>>>
>>> Bug reference:      7533
>>> Logged by:          Amit Kapila
>>> Email address:      amit.kapila@huawei.com
>>> PostgreSQL version: 9.2.0
>>> Operating system:   Suse
>>> Description:
>>>
>>> M host is primary, S host is standby and CS host is cascaded standby.
>>>
>


>> Hmm, I think the CheckRecoveryConsistency() call in the redo loop is
>> misplaced. It's called after we got a record from ReadRecord, but *before*
>> replaying it (rm_redo). Even if replaying record X makes the system
>> consistent, we won't check and notice that until we have fetched record X+1.
>> In this particular test case, record X is a shutdown checkpoint record, but
>> it could as well be a running-xacts record, or the record that reaches
>> minRecoveryPoint.
>
>> Does the problem go away if you just move the CheckRecoveryConsistency()
>> call *after* rm_redo (attached)?

> No, at least in my case. When recovery starts at shutdown checkpoint record and
> there is no record following the shutdown checkpoint, recovery gets in wait state
> before entering the main redo apply loop. That is, recovery starts waiting for
> new WAL record to arrive, in ReadRecord just before the redo loop. So moving
> the CheckRecoveryConsistency() call after rm_redo cannot fix the problem which
> I reported. To fix the problem, we need to make the recovery reach the
> consistent point before the redo loop, i.e., in the CheckRecoveryConsistency()
> just before the redo loop.

I think in that case we may need both fixes, since the problem I
reported can be fixed with Heikki's patch.
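
To make the two cases concrete, here is a toy model of the ordering issue. It is only a sketch and not the actual xlog.c code: the function replay() and its parameters are made up for illustration, and each WAL record is reduced to a flag saying whether replaying it makes the system consistent.

```python
from typing import List, Optional


def replay(records: List[bool],
           check_after_redo: bool,
           consistent_before_loop: bool = False) -> Optional[int]:
    """Toy model of a redo loop (hypothetical, not PostgreSQL code).

    records[i] is True if replaying record i makes the system consistent.
    Returns how many records had to be fetched before consistency was
    noticed, or None if the loop ran out of records first (in the real
    server, the next read would then wait for new WAL to arrive).
    """
    consistent = consistent_before_loop

    # Stand-in for the consistency check made just before the redo loop.
    if consistent:
        return 0

    fetched = 0
    for makes_consistent in records:
        fetched += 1                      # stand-in for ReadRecord

        # Check placed before replay: can only see the effect of the
        # *previous* record.
        if not check_after_redo and consistent:
            return fetched

        if makes_consistent:              # stand-in for rm_redo
            consistent = True

        # Check placed after replay: sees the effect of this record.
        if check_after_redo and consistent:
            return fetched

    return None


# Case 1: record X (the second one) makes the system consistent and is
# the last record available.
print(replay([False, True], check_after_redo=False))  # None
print(replay([False, True], check_after_redo=True))   # 2

# Case 2: no record follows the starting checkpoint, so the loop body
# never runs; only the check before the loop can help.
print(replay([], check_after_redo=True))                               # None
print(replay([], check_after_redo=True, consistent_before_loop=True))  # 0
```

In this model, moving the check after replay helps in case 1 but makes no difference in case 2, which is consistent with needing both changes.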

With Regards,
Amit Kapila.
