"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
>> So I am still dissatisfied with doing elog(STOP) for this condition,
>> as I regard it as an overly strong reaction to corrupted data;
>> moreover, it does nothing to fix the problem and indeed gets in
>> the way of fixing the problem.
> ... It's not OK to automatically restart
> knowing about errors in data.

Actually, I disagree. If we come across clearly corrupt data values
(e.g., a bad length word for a varlena item, or even tuple-header errors
such as a bad XID), we do not try to force the admin to restore the
database from backup, do we? A bogus LSN is bad, certainly, but it
is not the end of the world and does not deserve a panic reaction.
At worst it tells us that one data page is corrupt. A robust system
should report that and keep plugging.

What would actually be useful here is to report which page contains
the bad LSN, so that the admin could look at it and decide what to do.
xlog.c doesn't know that, unfortunately. I'd be more interested in
expending work to make that happen than in expending work to make
a dbadmin's life more difficult --- and I rank forced stops in the
latter category.
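
To make the idea concrete, here is a rough sketch of "report the page and
keep going". It is only an illustration under stated assumptions: PageRef,
Lsn, and check_page_lsn are hypothetical names, not anything that exists in
xlog.c, and the sketch assumes the caller already knows which page it is
looking at, which is exactly the information xlog.c lacks today.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's real types. */
typedef uint64_t Lsn;

typedef struct PageRef
{
    uint32_t relfile;   /* hypothetical relation file identifier */
    uint32_t blockno;   /* block number within that file */
    Lsn      page_lsn;  /* LSN as read from the page header */
} PageRef;

/*
 * Check a page's LSN against the current end of the log.
 *
 * Returns 1 if the LSN is plausible (not past the end of the log) and
 * leaves msg empty.  Returns 0 if the LSN is bogus and writes a warning
 * naming the offending page into msg.  It never aborts: the caller
 * decides what to do with the report.
 */
static int
check_page_lsn(const PageRef *page, Lsn end_of_log, char *msg, size_t msglen)
{
    if (msglen > 0)
        msg[0] = '\0';

    if (page->page_lsn <= end_of_log)
        return 1;

    snprintf(msg, msglen,
             "WARNING: bad LSN %" PRIu64 " (end of log is %" PRIu64
             ") in relfile %" PRIu32 ", block %" PRIu32,
             page->page_lsn, end_of_log, page->relfile, page->blockno);
    return 0;
}
```

A caller would log the message and carry on, leaving it to the admin to
inspect the named page and decide whether anything needs repair.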
regards, tom lane