On Fri, Sep 14, 2012 at 12:21 PM, Amit Kapila <amit.kapila@huawei.com> wrote:
> On Thursday, September 13, 2012 10:32 PM Fujii Masao wrote:
> On Thu, Sep 13, 2012 at 9:21 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> On 12.09.2012 22:03, Fujii Masao wrote:
>>>
>>> On Wed, Sep 12, 2012 at 8:47 PM,<amit.kapila@huawei.com> wrote:
>>>>
>>>> The following bug has been logged on the website:
>>>>
>>>> Bug reference: 7533
>>>> Logged by: Amit Kapila
>>>> Email address: amit.kapila@huawei.com
>>>> PostgreSQL version: 9.2.0
>>>> Operating system: Suse
>>>> Description:
>>>>
>>>> M host is primary, S host is standby and CS host is cascaded standby.
>>>>
>
>>> Hmm, I think the CheckRecoveryConsistency() call in the redo loop is
>>> misplaced. It's called after we got a record from ReadRecord, but *before*
>>> replaying it (rm_redo). Even if replaying record X makes the system
>>> consistent, we won't check and notice that until we have fetched record X+1.
>>> In this particular test case, record X is a shutdown checkpoint record, but
>>> it could as well be a running-xacts record, or the record that reaches
>>> minRecoveryPoint.
>>
>>> Does the problem go away if you just move the CheckRecoveryConsistency()
>>> call *after* rm_redo (attached)?
>
>> No, at least in my case. When recovery starts at a shutdown checkpoint record
>> and there is no record following the shutdown checkpoint, recovery enters a
>> wait state before entering the main redo apply loop. That is, recovery starts
>> waiting for a new WAL record to arrive, in ReadRecord just before the redo
>> loop. So moving the CheckRecoveryConsistency() call after rm_redo cannot fix
>> the problem which I reported. To fix the problem, we need to make recovery
>> reach the consistent point before the redo loop, i.e., in the
>> CheckRecoveryConsistency() call just before the redo loop.
>
> I think maybe in that case we need both fixes, as the problem I reported can be fixed with Heikki's patch.

Agreed. We should add a CheckRecoveryConsistency() call after rm_redo
rather than moving the existing one, as you suggested upthread.
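
To illustrate the ordering issue Heikki described, here is a toy model of the
redo loop. It is only a sketch and not the actual recovery code: the function
records_fetched_until_noticed() and the CheckPlacement enum are made-up names,
and the comments mark which steps stand in for ReadRecord and rm_redo. It
reports how many records must be fetched before the loop notices that
consistency has been reached.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical placements of the consistency check inside the loop. */
typedef enum {
    CHECK_BEFORE_REDO,   /* check after fetching, before replaying */
    CHECK_AFTER_REDO,    /* check right after replaying */
    CHECK_BOTH           /* keep the existing check and add one after replay */
} CheckPlacement;

/*
 * Simulate a redo loop over `available` records (indexes 0..available-1).
 * Replaying record `consistent_at` makes the system consistent.
 * Returns the number of records fetched when consistency is noticed,
 * or -1 if the loop runs out of records first (a real standby would
 * block at that point, waiting for more WAL to arrive).
 */
int records_fetched_until_noticed(int available, int consistent_at,
                                  CheckPlacement placement)
{
    assert(available >= 0 && consistent_at >= 0);

    bool consistent = false;   /* true once the consistent point is replayed */
    int fetched = 0;

    while (fetched < available) {
        int rec = fetched++;               /* stands in for ReadRecord */

        if (placement != CHECK_AFTER_REDO && consistent)
            return fetched;                /* noticed only on the next fetch */

        if (rec == consistent_at)          /* stands in for rm_redo */
            consistent = true;

        if (placement != CHECK_BEFORE_REDO && consistent)
            return fetched;                /* noticed right after replay */
    }
    return -1;                             /* no more WAL: would block here */
}
```

In this model, when the record that makes the system consistent is the last
one available, only the placements that check after replay notice it; the
check-before-replay placement needs one more record to arrive. With no record
available at all, the loop body is never entered, so no placement inside the
loop helps, which is loosely the situation in my case.
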
Regards,
--
Fujii Masao