Re: [BUG?] lag of minRecoveryPont in archive recovery - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [BUG?] lag of minRecoveryPont in archive recovery
Date
Msg-id 00da01cdd3a6$52ba37e0$f82ea7a0$@kapila@huawei.com
In response to [BUG?] lag of minRecoveryPont in archive recovery  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: [BUG?] lag of minRecoveryPont in archive recovery  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On Thursday, December 06, 2012 9:35 AM Kyotaro HORIGUCHI wrote:
> Hello, I have a problem with PostgreSQL 9.2 with Pacemaker.
> 
> The HA standby sometimes fails to start under normal operation.
> 
> Testing with a bare replication pair showed that the standby fails
> startup recovery under the operation sequence shown below. 9.3dev is
> affected too, but 9.1 does not have this problem. The problem became
> apparent because of the invalid-page check of xlog, but 9.1 potentially
> has the same glitch.
> 
> After investigation, the lag of minRecoveryPoint behind EndRecPtr in
> the redo loop seems to be the cause. The lag brings about repeated
> redoing of unrepeatable xlog sequences such as XLOG_HEAP2_VISIBLE ->
> SMGR_TRUNCATE on the same page. So I applied the same remedy to
> smgr_redo as is used in xact_redo_commit_internal. While doing this, I
> noticed that CheckRecoveryConsistency() in the redo apply loop should
> be called after redoing the record, so I moved it.

I think moving CheckRecoveryConsistency() to after the record is redone in
the redo apply loop might cause a problem.
Currently it is called before recoveryStopsHere(), so connections can be
allowed on a hot standby at that point. With the change, if recovery pauses
or stops because of recoveryStopsHere(), connections might not be allowed,
because CheckRecoveryConsistency() is never called.

With Regards,
Amit Kapila.



