Re: PANIC: failed to re-find parent key in "100924" for split pages 1606/1673 - Mailing list pgsql-bugs

From Tom Lane
Subject Re: PANIC: failed to re-find parent key in "100924" for split pages 1606/1673
Msg-id 27285.1231447129@sss.pgh.pa.us
In response to Re: PANIC: failed to re-find parent key in "100924" for split pages 1606/1673  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-bugs
Simon Riggs <simon@2ndQuadrant.com> writes:
> But with a down server, you just force people to do pg_resetxlog, which
> loses both the corruption (probably) and real, useful data (likely) and
> *then* they bring up the server. I don't see why we should force people
> to take a manual action and lose data to bring up the server.

That's all fine, but simply reducing the message level from PANIC to LOG
remains an utterly unacceptable "solution".  What will happen is that
the server will start, the DBA will go back to sleep after ignoring
(most likely, never even reading) the log message, and the corruption
will get worse.  The potential consequences of corruption in a pg_class
index, for example, are just horrid.  Frankly I'd rather "rm -rf $PGDATA"
and force someone to go back to their last backup than let them continue
to run with a database that is known to be broken and the system didn't
do anything more to warn them than emit a LOG message someplace.

(No, I'm not seriously proposing that as a recovery technique.  But it's
no more irresponsible than ignoring a corruption condition.)

            regards, tom lane
