Re: Theory about XLogFlush startup failures - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Theory about XLogFlush startup failures
Date
Msg-id 11414.1012067554@sss.pgh.pa.us
In response to Re: Theory about XLogFlush startup failures  ("Mikheev, Vadim" <vmikheev@SECTORBASE.COM>)
List pgsql-hackers
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
>> So I am still dissatisfied with doing elog(STOP) for this condition,
>> as I regard it as an overly strong reaction to corrupted data;
>> moreover, it does nothing to fix the problem and indeed gets in
>> the way of fixing the problem.

> ... It's not Ok automatically restart
> knowing about errors in data.

Actually, I disagree.  If we come across clearly corrupt data values
(e.g., a bad length word for a varlena item, or even tuple-header errors
such as a bad XID), we do not try to force the admin to restore the
database from backup, do we?  A bogus LSN is bad, certainly, but it
is not the end of the world and does not deserve a panic reaction.
At worst it tells us that one data page is corrupt.  A robust system
should report that and keep plugging.

What would actually be useful here is to report which page contains
the bad LSN, so that the admin could look at it and decide what to do.
xlog.c doesn't know that, unfortunately.  I'd be more interested in
expending work to make that happen than in expending work to make
a dbadmin's life more difficult --- and I rank forced stops in the
latter category.
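
To illustrate the "report and keep plugging" behavior argued for above,
here is a minimal sketch in C. It is not code from xlog.c: the PageRef
struct, the single 64-bit Lsn type, the function checked_flush_target,
and the choice to clamp the flush request to the end of the log are all
assumptions made for the example. The point is only that a caller which
knows the page identity can name the page in a warning and continue,
rather than stopping.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical simplified LSN; the real representation may differ. */
typedef uint64_t Lsn;

/* Hypothetical page descriptor carrying enough identity to report on. */
typedef struct PageRef
{
    uint32_t rel_id;    /* assumed relation identifier */
    uint32_t block_no;  /* block number within the relation */
    Lsn      page_lsn;  /* LSN stamped in the page header */
} PageRef;

/*
 * Return the LSN to flush up to for this page.
 *
 * If the page's LSN lies past the end of the log, it cannot be valid.
 * Instead of stopping, report which page carries it, set *corrupt to 1,
 * and fall back to end_of_log (an assumed recovery choice).
 * Otherwise set *corrupt to 0 and return the page's own LSN.
 */
static Lsn
checked_flush_target(const PageRef *page, Lsn end_of_log, int *corrupt)
{
    assert(page != NULL && corrupt != NULL);

    if (page->page_lsn > end_of_log)
    {
        fprintf(stderr,
                "WARNING: block %" PRIu32 " of relation %" PRIu32
                " has LSN %" PRIu64 " past end of log %" PRIu64
                "; page may be corrupt\n",
                page->block_no, page->rel_id,
                page->page_lsn, end_of_log);
        *corrupt = 1;
        return end_of_log;
    }

    *corrupt = 0;
    return page->page_lsn;
}
```

With this shape, the admin gets the relation and block number to inspect,
and the server keeps running; whether clamping is the right fallback is a
separate question that the sketch does not settle.
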
        regards, tom lane

