Invalid page header - Mailing list pgsql-admin

From Ireneusz Pluta
Subject Invalid page header
Msg-id 4C7EC335.8060607@wp.pl
Responses Re: Invalid page header  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-admin
Hello,

I have an 8.4.3 server on which I get intermittent, rather rare "invalid
page header" errors. A quick search of the pg lists turns up the general
advice to "check your hardware". Yes, I need to schedule downtime and
run some checks.

However, let me also share what I noticed; maybe you can comment or
suggest something more than that.

As I said, I have already had a few cases of invalid page headers on
that server, but I did not look into them closely, because they always
involved the same table or its indexes. Those could easily be dropped
and rebuilt, because that table depends on other tables. So I was happy
with doing just that. There were only a few such cases in the 10 months
this server has been running (and that was the actual reason I reported,
earlier this year, autovacuum getting tripped up by an invalid page
header that had not been taken care of for a long time).

But the last time, the invalid page header hit another table, which is
actually the master source for many other tables in my database, so I
really had to deal with this case. What I noticed about it:

- this is a constantly growing table collecting raw information. The data
contained in the damaged page was accessed several times within a few
hours of its insertion, before yet another access finally ended with an
"invalid page header" error.

- exactly one page was damaged, with no other damage around it. The
system is running on FreeBSD 7.2, UFS with a 16k block size, on a RAID10
with a stripe size of 256, if this matters.

- when playing with pg_filedump I noticed that the last pages of the
table are always initially reported as damaged as they appear; then, as
newer pages get allocated and filled, these initially bad pages "become
valid", as the following example of repeating the same pg_filedump
command shows:

[pgsql@gil ~]$ pg_filedump data/base/18319/36870.43 | grep -B9 -i "invalid header" | grep ^Block
Block 7460 ********************************************************
Block 11457 ********************************************************
Block 11460 ********************************************************
Block 11461 ********************************************************
[pgsql@gil ~]$ pg_filedump data/base/18319/36870.43 | grep -B9 -i "invalid header" | grep ^Block
Block 7460 ********************************************************
Block 11460 ********************************************************
Block 11461 ********************************************************
Block 11462 ********************************************************
[pgsql@gil ~]$ pg_filedump data/base/18319/36870.43 | grep -B9 -i "invalid header" | grep ^Block
Block 7460 ********************************************************
Block 11461 ********************************************************
Block 11462 ********************************************************
Block 11463 ********************************************************

- Block 7460 above is the one that actually got corrupted. Even though I
zeroed it with the zero_damaged_pages option, it is still reported as
invalid.
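
One way to narrow this down might be to separate pages that are simply
all zeros (PostgreSQL itself treats an all-zero page as a new, valid
page) from pages whose header fields are really inconsistent. Below is a
minimal Python sketch of such a check. It is not pg_filedump's logic and
not the server's exact validation. It assumes the default 8192-byte
block size, a little-endian machine, and the 24-byte page header layout
used since 8.3; the names classify_page and scan_segment are made up for
this illustration. It should only be run against a copy of the segment
file.

```python
import struct

# Assumptions (not taken from the post): default block size, little-endian
# byte order, 24-byte page header: pd_lsn (8 bytes), pd_tli, pd_flags,
# pd_lower, pd_upper, pd_special, pd_pagesize_version, pd_prune_xid.
BLCKSZ = 8192
HEADER_FMT = "<QHHHHHHI"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 24 bytes


def classify_page(page):
    """Return 'truncated', 'zero', 'ok' or 'suspect' for one block."""
    if len(page) != BLCKSZ:
        return "truncated"
    if not any(page):
        # All-zero block: never initialized, or zeroed on purpose.
        return "zero"
    (_lsn, _tli, _flags, lower, upper, special,
     size_version, _prune_xid) = struct.unpack_from(HEADER_FMT, page)
    page_size = size_version & 0xFF00  # low byte holds the layout version
    sane = (page_size == BLCKSZ
            and HEADER_SIZE <= lower <= upper <= special <= BLCKSZ)
    return "ok" if sane else "suspect"


def scan_segment(path):
    """Classify every block of a relation segment file, in order."""
    results = []
    with open(path, "rb") as f:
        blkno = 0
        while True:
            page = f.read(BLCKSZ)
            if not page:
                break
            results.append((blkno, classify_page(page)))
            blkno += 1
    return results
```

Running scan_segment on a copy of 36870.43 and printing only the blocks
that are not "ok" would show whether block 7460 and the trailing blocks
fall into the "zero" group or into the "suspect" group.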

Do the above observations indicate that something other than a
hard-to-find hardware issue is involved and could be tracked down in
more detail?

Thanks

Irek.