Re: pgsql error - Mailing list pgsql-general

From Tom Lane
Subject Re: pgsql error
Msg-id 24440.1311650395@sss.pgh.pa.us
In response to Re: pgsql error  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: pgsql error  ("Mcleod, John" <johnm@spicergroup.com>)
List pgsql-general
Merlin Moncure <mmoncure@gmail.com> writes:
> On Mon, Jul 25, 2011 at 3:05 PM, Mcleod, John <johnm@spicergroup.com> wrote:
>> I'm receiving the following error
>> CONTEXT: writing block 614 of relation 394198/412175
>> WARNING: could not write block 614 of 394198/412175
>> DETAIL: Multiple failures --- write error may be permanent.
>> ERROR: xlog flush request 0/34D53680 is not satisfied --- flushed only to
>> 0/34CD1EB0

> This is a fairly low level error that is telling you the WAL could not
> be written out.  Out of drive space?  Data corruption?

Yeah, this looks like the detritus of some previous failure.  There are
basically two possibilities:

1. The problem page's LSN field has gotten trashed so that it appears to
be past the end of WAL.

2. The page actually did get updated by a WAL entry with that LSN,
and then there was a crash for some reason, and the database tried to
recover by replaying WAL, and it hit some problem that caused it to stop
recovering before what had really been the end of WAL.  So now it thinks
the end of WAL is 0/34CD1EB0, but there are page(s) out there with LSNs
past that, and when it finds one you start getting complaints like this.

I doubt theory #1, though, because there are nearby fields in a page
header that evidently weren't trashed or else the page would have been
recognized as being corrupt.  Also the reported LSN is not very far past
end of WAL, which would be unlikely in the event of random corruption.
So I'm betting on #2.
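
To put a number on "not very far past end of WAL", here is a rough sketch that subtracts the two LSNs from the error message. The helper names parse_lsn and lsn_gap are made up for illustration. It assumes both LSNs share the same high half (true here, both are 0/...), so the low halves can be subtracted directly as byte offsets; how differing high halves map to byte positions depends on the server version and is not handled.

```python
def parse_lsn(text):
    """Split an 'X/Y' LSN string into its two hexadecimal halves."""
    high, low = text.split("/")
    return int(high, 16), int(low, 16)


def lsn_gap(requested, flushed):
    """Bytes between two LSNs, assuming they share the same high half."""
    req_high, req_low = parse_lsn(requested)
    fl_high, fl_low = parse_lsn(flushed)
    if req_high != fl_high:
        # Assumption: crossing the high half is version-dependent,
        # so this sketch refuses to guess.
        raise ValueError("LSNs differ in the high half; not handled here")
    return req_low - fl_low


# Values taken from the error message quoted above.
gap = lsn_gap("0/34D53680", "0/34CD1EB0")
print(f"{gap} bytes ({gap / 1024:.0f} KiB) past the flushed end of WAL")
```

The gap comes out at roughly half a megabyte. A randomly trashed LSN field would most likely land much further away, which fits the reasoning for theory #2.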

Unfortunately this tells us little about either the cause of the
original crash, or the reason why recovery didn't work properly.  We'd
need a lot more information before speculating about that; for starters,
the exact Postgres version and the platform it's running on.

            regards, tom lane
