Re: right sibling is not next child - Mailing list pgsql-bugs

From Kevin Grittner
Subject Re: right sibling is not next child
Date
Msg-id 44351487.EE98.0025.0@wicourts.gov
In response to Re: right sibling is not next child  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: right sibling is not next child  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
>>> On Thu, Apr 6, 2006 at 12:57 pm, in message
<25913.1144346246@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Right now the postmaster refuses to start.  What is the best way to get
>> past that to try what you suggest?
>
>> [2006-04-06 07:22:50.378 ] 3984 PANIC:  heap_clean_redo: no block
>
> Hm, did this start happening immediately after the other problem?

This started happening on the first attempt to start the postmaster
after the other error, which left the postmaster down, apparently after
a failed restart attempt.

> That would suggest that you've got worse problems than just a corrupt
> index.  You weren't by any chance running with full_page_writes = off
> were you?

Yes we were.  Apparently I have misunderstood the implications of this.
Somehow I had convinced myself that this setting was relatively safe in
our environment, due to our battery-backed controllers.  I'd convinced
myself, after reading carefully through the documentation of this
setting, that I would be OK as long as that functioned correctly, and
have problems regardless of this setting if it didn't.  If you show me
where I went wrong, maybe I can suggest a patch to the docs to prevent
others from going down the wrong path in this regard.  (Of course, maybe
it's all there and I just had a bad day when I thought this through.)

> You could get past the startup failure with pg_resetxlog, but it's not
> clear whether you'd have a consistent database afterward.  What I'd
> suggest first is saving a copy of the entire $PGDATA tree for forensic
> purposes

We already have this forensic copy and a replacement production copy on
this box.  I think we'll need to copy to another box to get a second
forensic copy, to avoid risking an out-of-space condition.  That can be
done, but it'll take a few hours.
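
For what it's worth, the out-of-space concern can be checked before starting a copy.  A rough sketch, assuming Python is available on the box; the function names and the 10% headroom figure are illustrative only and are not something from this thread:

```python
import os
import shutil


def tree_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def has_room_for_copy(src, dest_parent, headroom=0.10):
    """Return True if dest_parent has free space for a copy of src.

    headroom is an assumed safety margin (a fraction of the source size),
    not a figure from the PostgreSQL documentation.
    """
    needed = tree_size_bytes(src) * (1 + headroom)
    free = shutil.disk_usage(dest_parent).free
    return free >= needed
```

Calling has_room_for_copy() with the data directory and the intended destination's parent directory gives a quick yes/no answer before committing to a copy that takes hours.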

> (not to mention being able to go back to that state if you need
> to).

That's not an issue for production purposes.

> Is there any chance of letting someone else have a look at the database
> contents?

There is a lot of data in the database which is confidential by law.
I'd have to jump through a lot of hoops to get anyone to even consider
letting me ship it off site.  If you're asking whether you could get
access to our site, that might be arranged.

-Kevin
