Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)", File: "nbtsearch.c", Line: 89) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)", File: "nbtsearch.c", Line: 89)
Msg-id 26823.1130475510@sss.pgh.pa.us
In response to Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)", File: "nbtsearch.c", Line: 89)  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-hackers

"Jim C. Nasby" <jnasby@pervasive.com> writes:
> On Thu, Oct 27, 2005 at 11:53:01PM -0400, Tom Lane wrote:
>> BTW, Jim, any thoughts about how the index got corrupted?  Have you
>> had any crashes on that machine lately?

> Write-through cache on drive array that's not battery backed. Plus, the
> backend has been crashing on a sig 11 about once a week for
> who-knows-how-long. You do the math...

My guess is that this could not have been caused by a backend crash.
I did some analysis and found that one of the four pages had been a
level-1 internal page, not a leaf page (there is a sibling link from
a level-1 page and a downlink from the level-2 root, QED).  It's
very hard to see how to explain the condition of the index as a
result of a single backend crash, and even if you posit one crash
for each page, it's tough to believe that there wouldn't be additional
corruption elsewhere (e.g., from incomplete page split operations).  But
AFAICT there's nothing at all wrong anywhere else.  And if there were
multiple crashes, why are the affected pages contiguous?

I think the data was dropped at the disk or filesystem level.  If you
have had power failures then that's certainly not hard to believe.
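
The contiguity argument above can be sketched as a rough heuristic. This is
only an illustration, not a PostgreSQL tool: the function names and the block
numbers are hypothetical, and the rule "one unbroken run of damaged blocks
points at the storage layer" is an assumption drawn from the reasoning in
this message, not a proof.

```python
def contiguous_runs(blocks):
    """Group damaged block numbers into runs of consecutive values."""
    runs = []
    for b in sorted(set(blocks)):
        if runs and b == runs[-1][-1] + 1:
            # Extends the current run of adjacent blocks.
            runs[-1].append(b)
        else:
            # Gap found: start a new run.
            runs.append([b])
    return runs


def looks_like_storage_loss(blocks):
    """Heuristic (assumption): a single run of several adjacent damaged
    blocks is more consistent with one dropped write range than with
    several independent crashes."""
    runs = contiguous_runs(blocks)
    return len(runs) == 1 and len(runs[0]) > 1


# Hypothetical block numbers for four damaged pages.
adjacent = [103, 101, 102, 100]
scattered = [5, 90, 400, 1200]
print(contiguous_runs(adjacent), looks_like_storage_loss(adjacent))
print(contiguous_runs(scattered), looks_like_storage_loss(scattered))
```

Independent crashes would be expected to damage whichever pages were being
written at the time, so scattered block numbers would fit that explanation
better than one adjacent run does.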

> BTW, now that the backend is compiled with --enable-debug I hope to find
> out the reason for the random crashes.

Yeah, that is an interesting question.
        regards, tom lane
