Re: [GENERAL] PANIC: heap_update_redo: no block - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: [GENERAL] PANIC: heap_update_redo: no block
Msg-id 1143569513.32384.35.camel@localhost.localdomain
In response to Re: [GENERAL] PANIC: heap_update_redo: no block  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 2006-03-28 at 10:07 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Mon, 2006-03-27 at 22:03 -0500, Tom Lane wrote:
> >> The subsequent replay of the deletion or truncation
> >> will get rid of any unwanted data again.
> 
> > Trouble is, it is not a watertight assumption that there *will be* a
> > subsequent truncation, even if it is a strong one.
> 
> Well, in fact we'll have correctly recreated the page, so I'm not
> thinking that it's necessary or desirable to check this.  What's the
> point?  

We recreated *a* page but we are shying away from exploring *why* we
needed to in the first place. If there was no later truncation then
there absolutely should have been a page there already and the fact
there wasn't one needs to be reported.

I don't want to write that code either; I just think we should.
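
To make the proposal concrete, here is a minimal sketch of the bookkeeping it implies: remember every page that redo had to recreate because it was missing, forget those entries when a later truncation or drop is replayed, and report whatever is left at the end of recovery. This is only an illustration under stated assumptions. The class and method names (MissingPageTracker, note_missing, replay_truncate, replay_drop, end_of_recovery_report) are hypothetical and are not taken from the PostgreSQL source, and relations are identified by plain strings for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class MissingPageTracker:
    """Hypothetical bookkeeping for pages found missing during WAL replay."""

    # (relation, block number) -> count of redo records that found it absent
    missing: dict = field(default_factory=dict)

    def note_missing(self, rel: str, block: int) -> None:
        """Record that redo had to recreate this page because it was absent."""
        key = (rel, block)
        self.missing[key] = self.missing.get(key, 0) + 1

    def replay_truncate(self, rel: str, new_nblocks: int) -> None:
        """A replayed truncation explains every missing page at or past the cut."""
        explained = [k for k in self.missing
                     if k[0] == rel and k[1] >= new_nblocks]
        for key in explained:
            del self.missing[key]

    def replay_drop(self, rel: str) -> None:
        """A replayed drop explains every missing page of the relation."""
        self.replay_truncate(rel, 0)

    def end_of_recovery_report(self) -> list:
        """Return one warning per page that no later truncation or drop explained."""
        return [
            f"WARNING: block {blk} of {rel} was missing during replay "
            f"and no later truncation or drop accounts for it"
            for (rel, blk) in sorted(self.missing)
        ]


# Usage: two pages are found missing; only one is explained by a truncation.
tracker = MissingPageTracker()
tracker.note_missing("rel_a", 7)
tracker.note_missing("rel_b", 3)
tracker.replay_truncate("rel_a", 5)   # explains rel_a block 7
for line in tracker.end_of_recovery_report():
    print(line)                        # only rel_b block 3 is reported
```

The sketch reports rather than refusing to start, in line with the point below that a PANIC would not be robust behavior; what to do with the report is left to the sysadmin.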

> "PANIC: we think your filesystem screwed up.  We don't know
> exactly how or why, and we successfully rebuilt all our data, but
> we're gonna refuse to start up anyway."  Doesn't seem like robust
> behavior to me.  

Agreed, which is why I explicitly said we shouldn't do that.

grass_up_filesystem = on should be the only setting we support. You're
right that we can't know why it's wrong, but the sysadmin might.

> > Perhaps we do have one systemic problem: systems documentation.
> 
> I agree on that ;-).  The xlog code is really poorly documented.
> I'm going to try to improve the comments for at least the xlogutils
> routines while I'm fixing this.

I'll take a look also.

Best Regards, Simon Riggs


