Re: Recovery inconsistencies, standby much larger than primary - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Recovery inconsistencies, standby much larger than primary
Date
Msg-id 20140131111358.GC13199@alap3.anarazel.de
In response to Re: Recovery inconsistencies, standby much larger than primary  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On 2014-01-31 11:09:14 +0000, Greg Stark wrote:
> On Sun, Jan 26, 2014 at 5:45 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >
> >> We're also seeing log entries about "wal contains reference to invalid
> >> pages" but these errors seem only vaguely correlated. Sometimes we get
> >> the errors but the tables don't grow noticeably and sometimes we don't
> >> get the errors and the tables are much larger.
> >
> > Uhm. I am a bit confused. You see those in the standby's log? At !debug
> > log levels? That'd imply that the standby is dead and needed to be
> > recloned, no? How do you continue after that?

> So in chatting with Heikki last night we came up with a scenario where
> this check is insufficient.

But that seems unrelated to the issue at hand, right?

> If you have multiple checkpoints during the base backup then there
> will be restartpoints during recovery. If the reference to the invalid
> page is before the restartpoint then after crashing recovery and coming
> back up the recovery will go forward fine.

We don't perform restartpoints if there are invalid pages
registered. Check the XLogHaveInvalidPages() call in xlog.c.
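
To make that concrete, here is a toy model of the check. It is a sketch
under stated assumptions, not the actual xlog.c code: every name in it
(remember_invalid_page, forget_invalid_pages, have_invalid_pages,
try_record_restartpoint) is hypothetical, with have_invalid_pages()
standing in for XLogHaveInvalidPages(). The idea it illustrates is that
as long as any reference to an invalid page is unresolved, no
restartpoint is recorded, so a restart of recovery begins before the
offending record and replays it again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the table of unresolved invalid-page references. */
#define MAX_INVALID_PAGES 16

typedef struct
{
    unsigned rel;   /* hypothetical relation id */
    unsigned blk;   /* block number referenced by a WAL record */
} InvalidPageRef;

static InvalidPageRef invalid_pages[MAX_INVALID_PAGES];
static size_t n_invalid_pages = 0;

/* Replay saw a WAL record that touches a page which does not exist. */
void remember_invalid_page(unsigned rel, unsigned blk)
{
    assert(n_invalid_pages < MAX_INVALID_PAGES);
    invalid_pages[n_invalid_pages].rel = rel;
    invalid_pages[n_invalid_pages].blk = blk;
    n_invalid_pages++;
}

/* Replay saw the relation being dropped: its references are resolved. */
void forget_invalid_pages(unsigned rel)
{
    size_t kept = 0;

    for (size_t i = 0; i < n_invalid_pages; i++)
    {
        if (invalid_pages[i].rel != rel)
            invalid_pages[kept++] = invalid_pages[i];
    }
    n_invalid_pages = kept;
}

/* Hypothetical analogue of XLogHaveInvalidPages(). */
bool have_invalid_pages(void)
{
    return n_invalid_pages > 0;
}

/*
 * Called when replay reaches a checkpoint record. Returns true if the
 * record may be used as a restartpoint, false if it has to be skipped
 * because unresolved invalid-page references remain.
 */
bool try_record_restartpoint(void)
{
    if (have_invalid_pages())
        return false;
    return true;
}
```

In this model, a checkpoint record replayed after remember_invalid_page()
is skipped as a restartpoint until a matching forget_invalid_pages() call
clears the entry.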

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


