Thread: corruption bug in 7.2.3-RH

corruption bug in 7.2.3-RH

From: Jonathan Ellis
Date: 07 February 2003, 15:07:18

when trying to vacuum I got this:

FATAL 2:  PageIndexTupleDelete: corrupted page pointers: lower = 21845,
upper = 21845, special = 21845

I experimentally determined the problem was with my "users" table.  I
did a pg_dump -t on it, manually deleted all triggers, dropped the
table, and loaded from the dump.  This fixed it.  (Merely deleting all
rows from the table & re-copying did not.)  For future reference, is
there an easier fix?  Is this fixed in 7.3.2?

-Jonathan

Re: corruption bug in 7.2.3-RH

From: "scott.marlowe"
Date: 07 February 2003, 16:18:30

On Fri, 7 Feb 2003, Jonathan Ellis wrote:

> when trying to vacuum I got this:
>
> FATAL 2:  PageIndexTupleDelete: corrupted page pointers: lower = 21845,
> upper = 21845, special = 21845
>
> I experimentally determined the problem was with my "users" table.  I
> did a pg_dump -t on it, manually deleted all triggers, dropped the
> table, and loaded from the dump.  This fixed it.  (Merely deleting all
> rows from the table & re-copying did not.)  For future reference, is
> there an easier fix?  Is this fixed in 7.3.2?

IMMEDIATELY DO THE FOLLOWING:

Check for bad ram and bad blocks on your hard drives.

Most corruption problems in postgresql servers are caused by one or both
of those problems.  memtest86 is good for testing memory on x86 boxes.  I'm
not sure what OS you're running, but some kind of bad block checker should
be run against the drives as well.

Your first reaction to such failures should be to immediately verify the
solidity of the server postgresql is sitting on.  It's a good database,
but it can't overcome flawed hardware.

If your server has no problems, then back up and upgrade to 7.2.4, since
it has some bug fixes for minor duplicate key issues in vacuum tables
and such.  7.3 seems real stable, but it's just different enough for us to
have to plan the rollout where I work, since just tossing it online might
break some poorly written apps we have... :-)

Re: corruption bug in 7.2.3-RH

From: Tom Lane
Date: 07 February 2003, 19:07:42

Jonathan Ellis <jonathan@carnageblender.com> writes:
> when trying to vacuum I got this:
> FATAL 2:  PageIndexTupleDelete: corrupted page pointers: lower = 21845,
> upper = 21845, special = 21845

Hmm ... 21845 = hex 5555, and you have the same in at least three places
in the page header ... you have a badly clobbered page there.  Check
for hardware problems.
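
The arithmetic behind that observation can be checked with a short sketch.
It assumes the default 8192-byte PostgreSQL block size, and the helper name
`has_repeated_byte` is illustrative, not anything from the PostgreSQL source:

```python
# Value reported for lower, upper, and special in the error message.
reported = 21845

# Assumption: the default PostgreSQL block size of 8192 bytes.
BLOCK_SIZE = 8192

# 21845 is 0x5555, i.e. the alternating bit pattern 0101010101010101.
as_hex = hex(reported)
as_bits = format(reported, "016b")
print(as_hex, as_bits)

# A valid page pointer is an offset inside the page, so it cannot
# exceed the block size. 21845 is far outside that range.
within_page = reported <= BLOCK_SIZE
print("fits inside one page:", within_page)


def has_repeated_byte(value: int) -> bool:
    """Return True if both bytes of a 16-bit value are identical.

    Hypothetical heuristic: a repeated byte such as 0x55 0x55 in several
    header fields at once suggests the page was overwritten with a fill
    pattern instead of holding real offsets.
    """
    return (value & 0xFF) == ((value >> 8) & 0xFF)


print("repeated byte:", has_repeated_byte(reported))
```

Three header fields holding the same out-of-range value is what points to
a clobbered page and not to a single flipped bit.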

> I experimentally determined the problem was with my "users" table.  I
> did a pg_dump -t on it, manually deleted all triggers, dropped the
> table, and loaded from the dump.  This fixed it.

But you probably lost a few rows, like whatever was on the trashed page
(or pages?).  Can you do anything to crosscheck the data you have left?

            regards, tom lane

Re: corruption bug in 7.2.3-RH

From: Jonathan Ellis
Date: 11 February 2003, 16:47:07

Tom Lane wrote:
> Jonathan Ellis <jonathan@carnageblender.com> writes:
>> when trying to vacuum I got this:
>> FATAL 2:  PageIndexTupleDelete: corrupted page pointers: lower = 21845,
>> upper = 21845, special = 21845
>
> Hmm ... 21845 = hex 5555, and you have the same in at least three places
> in the page header ... you have a badly clobbered page there.  Check
> for hardware problems.

memtest86 said 3 of my 4 DIMMs had bad spots.  Ouch.

>> I experimentally determined the problem was with my "users" table.  I
>> did a pg_dump -t on it, manually deleted all triggers, dropped the
>> table, and loaded from the dump.  This fixed it.
>
> But you probably lost a few rows, like whatever was on the trashed page
> (or pages?).  Can you do anything to crosscheck the data you have left?

All the referential integrity triggers restored w/o complaining.  (And
_every_ entry in this table is used as a key in at least two other
tables.)  Looks like I lucked out.

-Jonathan

Re: corruption bug in 7.2.3-RH

From: Dennis Gearon
Date: 11 February 2003, 17:52:21

Boy ****I'LL**** say you lucked out! Postgres is good stuff ... it withstood that problem, I get
more impressed every day by it.


2/11/2003 1:50:40 PM, Jonathan Ellis <jonathan@carnageblender.com> wrote:

>Tom Lane wrote:
>> Jonathan Ellis <jonathan@carnageblender.com> writes:
>>> when trying to vacuum I got this:
>>> FATAL 2:  PageIndexTupleDelete: corrupted page pointers: lower = 21845,
>>> upper = 21845, special = 21845
>>
>> Hmm ... 21845 = hex 5555, and you have the same in at least three places
>> in the page header ... you have a badly clobbered page there.  Check
>> for hardware problems.
>
>memtest86 said 3 of my 4 dimms had bad spots.  ouch.
>
>>> I experimentally determined the problem was with my "users" table.  I
>>> did a pg_dump -t on it, manually deleted all triggers, dropped the
>>> table, and loaded from the dump.  This fixed it.
>>
>> But you probably lost a few rows, like whatever was on the trashed page
>> (or pages?).  Can you do anything to crosscheck the data you have left?
>
>All the referential integrity triggers restored w/o complaining.  (And
>_every_ entry in this table is used as a key in at least two other
>tables.)  Looks like I lucked out.
>
>-Jonathan

Re: corruption bug in 7.2.3-RH

From: Andrew Sullivan
Date: 12 February 2003, 11:37:26

On Tue, Feb 11, 2003 at 02:50:40PM -0700, Jonathan Ellis wrote:
> All the referential integrity triggers restored w/o complaining.  (And
> _every_ entry in this table is used as a key in at least two other
> tables.)  Looks like I lucked out.

So the triggers check now that there are no existing problems?
AFAIK, they didn't always.
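
If the triggers cannot be relied on to validate rows that already exist, an
explicit query for orphaned references is one way to cross-check. A minimal
sketch follows. The table and column names (`users`, `orders`, `user_id`)
and the helper `find_orphans` are hypothetical, and SQLite stands in for
PostgreSQL only so that the example runs on its own:

```python
import sqlite3


def find_orphans(conn, child, fk_col, parent, pk_col):
    """Return child-side key values that have no matching parent row."""
    # Anti-join: keep child rows whose key finds no partner in the parent.
    query = (
        f"SELECT DISTINCT c.{fk_col} FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{fk_col} = p.{pk_col} "
        f"WHERE c.{fk_col} IS NOT NULL AND p.{pk_col} IS NULL "
        f"ORDER BY c.{fk_col}"
    )
    return [row[0] for row in conn.execute(query)]


# Hypothetical data: user 3 was "lost", but an order still refers to it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1, 'a'), (2, 'b');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 3);
""")

orphans = find_orphans(conn, "orders", "user_id", "users", "id")
print(orphans)
```

An empty result for every referencing table would support the conclusion
that no referenced rows were lost. It says nothing about rows that no other
table points to.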

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110