Re: pgsql error - Mailing list pgsql-general

From Mcleod, John
Subject Re: pgsql error
Date
Msg-id A2FA197FFED9FA42AD080EE933EFA8AE34C394A7@Spicer-mail.spicergroup.com
In response to Re: pgsql error  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pgsql error  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-general
Thank you for the reply.

At command line, I ran...
"psql  --version"
and received..
"psql  (PostgreSQL) 7.5devel"

The database is sitting on a Windows 2003 Server box.
A mapping application, written in PHP, runs with Apache 2.05.

I know that in the past, the project manager would restart the database by just closing the .bat window and then
double-clicking the postgis.bat file on the desktop.
I'm not sure if this was the beginning of the problem.  I've since learned to shut down the database with "Ctrl C".

This batch file has the following...

cd c:\
cd ms4w/apps/pgsql75win/data/
del postmaster.pid

@ECHO OFF
set PATH=%PATH%; \ms4w\apps\pgsql75win\lib;\ms4w\apps\pgsql75win\bin;\ms4w\apps\pgsql75win\share\contrib

cd c:\
cd ms4w/apps/pgsql75win/
cmd /c "postmaster -D \ms4w\apps\pgsql75win\\data"

I hope this will give you some clues.


John



-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, July 25, 2011 11:20 PM
To: Merlin Moncure
Cc: Mcleod, John; pgsql-general@postgresql.org
Subject: Re: [GENERAL] pgsql error

Merlin Moncure <mmoncure@gmail.com> writes:
> On Mon, Jul 25, 2011 at 3:05 PM, Mcleod, John <johnm@spicergroup.com> wrote:
>> I'm receiving the following error
>> CONTEXT: writing block 614 of relation 394198/412175
>> WARNING: could not write block 614 of 394198/412175
>> DETAIL: Multiple failures --- write error may be permanent.
>> ERROR: xlog flush request 0/34D53680 is not satisfied --- flushed only to 0/34CD1EB0

> This is a fairly low level error that is telling you the WAL could not
> be written out.  Out of drive space?  Data corruption?

Yeah, this looks like the detritus of some previous failure.  There are basically two possibilities:

1. The problem page's LSN field has gotten trashed so that it appears to be past the end of WAL.

2. The page actually did get updated by a WAL entry with that LSN, and then there was a crash for some reason, and the
database tried to recover by replaying WAL, and it hit some problem that caused it to stop recovering before what had
really been the end of WAL.  So now it thinks the end of WAL is 0/34CD1EB0, but there are page(s) out there with LSNs
past that, and when it finds one you start getting complaints like this.

I doubt theory #1, though, because there are nearby fields in a page header that evidently weren't trashed or else the
page would have been recognized as being corrupt.  Also the reported LSN is not very far past end of WAL, which would be
unlikely in the event of random corruption.
So I'm betting on #2.
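
[Illustration, not part of the original message.] To put a number on "not very far past end of WAL", the two LSNs from the error can be subtracted. The sketch below is only a rough aid: it assumes the hi/lo hexadecimal notation maps to a single position as (hi << 32) | lo, which is how current PostgreSQL describes LSNs. Since both values here share the same high part (0), only the low parts matter for the difference. The name parse_lsn is made up for this example.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN such as '0/34D53680' to an integer byte position.

    Assumption: the two hex fields combine as (hi << 32) | lo.
    """
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


requested = parse_lsn("0/34D53680")  # LSN the flush request asked for
flushed = parse_lsn("0/34CD1EB0")    # where the server believes WAL ends

gap = requested - flushed
print(f"requested LSN is {gap} bytes ({gap / 1024:.0f} KiB) past the end of WAL")
```

The gap works out to 530384 bytes, roughly 518 KiB, which is small compared with what a randomly overwritten LSN field would usually produce.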

Unfortunately this tells us little about either the cause of the original crash, or the reason why recovery didn't work
properly. We'd need a lot more information before speculating about that, for starters the exact Postgres version and
the platform it's running on.

            regards, tom lane
