On Tue, 2006-03-28 at 10:07 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Mon, 2006-03-27 at 22:03 -0500, Tom Lane wrote:
> >> The subsequent replay of the deletion or truncation
> >> will get rid of any unwanted data again.
>
> > Trouble is, it is not a watertight assumption that there *will be* a
> > subsequent truncation, even if it is a strong one.
>
> Well, in fact we'll have correctly recreated the page, so I'm not
> thinking that it's necessary or desirable to check this. What's the
> point?

We recreated *a* page, but we are shying away from exploring *why* we
needed to in the first place. If there was no later truncation, then
there absolutely should have been a page there already, and the fact
that there wasn't one needs to be reported.

I don't want to write that code either; I just think we should.

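
As a rough illustration of the check being argued for here (not actual
PostgreSQL code; ReplayAudit and its method names are hypothetical, and
relations are identified by plain strings for brevity): remember every
page that replay had to recreate, forget it again when a later
truncation or drop of the same relation explains its absence, and
report whatever is left at the end of recovery as a warning rather than
a PANIC.

```python
from dataclasses import dataclass, field


@dataclass
class ReplayAudit:
    """Hypothetical bookkeeping for pages recreated during WAL replay."""

    # (relation, block) pairs that were missing and had to be recreated.
    recreated: set = field(default_factory=set)

    def note_recreated(self, rel, block):
        # Replay found no page at this location and built a fresh one.
        self.recreated.add((rel, block))

    def note_truncate(self, rel, new_nblocks):
        # A replayed truncation explains every missing page of this
        # relation at or beyond the new length.
        self.recreated = {
            (r, b) for (r, b) in self.recreated
            if r != rel or b < new_nblocks
        }

    def note_drop(self, rel):
        # A replayed deletion explains every missing page of the relation.
        self.recreated = {(r, b) for (r, b) in self.recreated if r != rel}

    def unexplained(self):
        # Pages that should have existed on disk but did not.
        return sorted(self.recreated)


audit = ReplayAudit()
audit.note_recreated("t1", 7)   # later explained by the truncation below
audit.note_recreated("t2", 3)   # never explained
audit.note_truncate("t1", 5)

for rel, block in audit.unexplained():
    # Report to the sysadmin, but do not refuse to start up.
    print(f"WARNING: page {block} of {rel} was missing and no later "
          f"truncation or deletion explains it")
```

The point of the design is that recovery still completes; the leftover
entries only tell the sysadmin where to start looking.
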
> "PANIC: we think your filesystem screwed up. We don't know
> exactly how or why, and we successfully rebuilt all our data, but
> we're gonna refuse to start up anyway." Doesn't seem like robust
> behavior to me.

Agreed, which is why I explicitly said we shouldn't do that.
grass_up_filesystem = on should be the only setting we support. You're
right that we can't know why it's wrong, but the sysadmin might.

> > Perhaps we do have one systemic problem: systems documentation.
>
> I agree on that ;-). The xlog code is really poorly documented.
> I'm going to try to improve the comments for at least the xlogutils
> routines while I'm fixing this.

I'll take a look also.

Best Regards, Simon Riggs