Re: Recovery inconsistencies, standby much larger than primary - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Recovery inconsistencies, standby much larger than primary
Msg-id 27201.1391723309@sss.pgh.pa.us
In response to Re: Recovery inconsistencies, standby much larger than primary  (Greg Stark <stark@mit.edu>)
Responses Re: Recovery inconsistencies, standby much larger than primary  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
Greg Stark <stark@mit.edu> writes:
> Both the primary and the standby were 9.1.11 from the get-go. The
> database the primary was forked off of was 9.1.10 but as far as I can
> tell the primary in the current pair has no problems.

> What's worse is we created a new standby from the same base backup and
> replayed the same records and it didn't reproduce the problem. This
> means either it's a hardware problem -- but we've seen it on multiple
> standbys on this database and at least one other database which is in
> a different data centre -- or it's a race condition -- but that's hard
> to credit in the recovery code which is basically single-threaded.

> And these records are from before the standby reaches consistency, so
> it's hard to see how a connection from a hot standby client could
> cause any kind of race condition. The only other thread that could
> conceivably cause a heisenbug is the bgwriter. It's hard to imagine
> how a race condition in there could be so easy to hit that it would
> happen four times on one restore but otherwise go mostly unnoticed.

I had noticed that the WAL records that were mis-replayed seemed to
be bunched pretty close together (two of them even adjacent).  Could
you confirm that?  If so, it seems like we're looking for some condition
that makes mis-replay fairly probable for a period of time, but that in
itself might be quite improbable.  Not that that helps much in
nailing it down.

You might well be on to something with the bgwriter idea, considering
that none of the WAL replay code was originally written with any
concurrent execution in mind.  We might've missed some place where
additional locking is needed.
        regards, tom lane


