Re: PostgreSQL Invalid Page Header in Block XXXXX - Mailing list pgsql-general

From Chris Browne
Subject Re: PostgreSQL Invalid Page Header in Block XXXXX
Msg-id 60n05gdffn.fsf@dev6.int.libertyrms.info
List pgsql-general
mailadministrator@canada.com (Anthony) writes:
> Our Database is having errors. We are currently using PostgreSQL to
> store 2.5 Million records per day. The average addition to our primary
> table is 4.5 Gigs of data.
>
> We are doing this on a dual Opteron 244 system with 1 TeraByte of HDD
> space. The drives are 250 Gig Western Digital. The Raid Controller is
> LSI Logic MegaRaid 150-6.
>
> We are getting an error after about 4-5 days worth of data being put
> into the system.
>
> *******************************************************
> ERROR:  invalid page header in block 59305 of relation
> "item_info_2004_04_leaf_category_1"
> *******************************************************
>
> Our Base Server Configuration is as follows.
> PostgreSQL Version= 7.4.2
> x86_64-PC-Linux-GNU
> Compiled with GCC 3.3.3
> XFS File System
> Running on Gentoo Linux 3.3.3 Propolice-3.3-7
>
> Any help on how to solve this problem would be extremely appreciated.
>
> Even the potential that Tom Lane might respond to this is worth it.

May I point you to the pg_filedump utility?

  <http://sources.redhat.com/rhdb/utilities.html>

It can give you a fair idea of just where the system is blowing up.

I experienced what sounds like the same problem with a system that was
fairly similarly appointed with hardware, albeit with a few
conspicuous differences...

1.  PostgreSQL 7.4.1
2.  FreeBSD 4.9
3.  Berkeley FFS with soft updates
4.  Quad-Xeon, 8GB RAM (only using 4GB of it :-()
5.  AMI MegaRaid controller...
6.  Slightly less disk; 12x74GB SCSI drives

[root@hathi scsi]# cat /proc/scsi/megaraid/1
LSI Logic MegaRAID 1.74 254 commands 16 targs 7 chans 7 luns

What I found in looking at the page with the "invalid page header" was
that it was full of ASCII NUL values.
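
If you want to check for the same symptom without reading a hex dump by
eye, something along these lines might do.  It is only a rough sketch,
not part of pg_filedump: the function names are made up for
illustration, and it assumes the default 8192-byte page size and that
you point it at a single segment file of the relation (relations are
split into 1 GB segment files, so block numbers beyond the first
segment would need adjusting).

```python
BLOCK_SIZE = 8192  # assumed: PostgreSQL's default page size


def is_zero_block(path, block_no, block_size=BLOCK_SIZE):
    """Return True if the given block of the file is entirely NUL bytes."""
    with open(path, "rb") as f:
        f.seek(block_no * block_size)
        data = f.read(block_size)
    if len(data) != block_size:
        raise ValueError("block %d is missing or truncated" % block_no)
    return data == bytes(block_size)


def find_zero_blocks(path, block_size=BLOCK_SIZE):
    """Return the block numbers of all complete blocks that are all NULs."""
    zero = bytes(block_size)
    found = []
    with open(path, "rb") as f:
        block_no = 0
        while True:
            data = f.read(block_size)
            if len(data) < block_size:
                break  # end of file (or a trailing partial block)
            if data == zero:
                found.append(block_no)
            block_no += 1
    return found
```

Run it against a copy of the file rather than the live one, e.g.
is_zero_block("/path/to/copy/of/segment", 59305), where the path is a
placeholder for wherever you copied the relation's file.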

We had previously had quite a bit of trouble with a different box with
the same hardware configuration running Red Hat Linux 7.3, although
when I replaced its 2.4.18 Linux kernel with 2.6.2, those problems
evaporated.

The only thing that we have been able to point to on the box in
question is a hardware problem.  Given that the disks are RAIDed, the
three most likely culprits seem to be:

 1.  Perhaps the controller is "glitched;"
 2.  Perhaps the controller driver is "glitched;"
 3.  Perhaps there is a RAM problem.

Notice that the list of suspects doesn't include any that actually
relate to database software.

Your best bet is to look for hardware problems.
--
(reverse (concatenate 'string "gro.gultn" "@" "enworbbc"))
http://cbbrowne.com/info/linuxxian.html
Never take life seriously. Nobody gets out alive anyway.
