Thread: Re: PostgreSQL Invalid Page Header in Block XXXXX

Re: PostgreSQL Invalid Page Header in Block XXXXX

From: Chris Browne
mailadministrator@canada.com (Anthony) writes:
> Our Database is having errors. We are currently using PostgreSQL to
> store 2.5 Million records per day. The average addition to our primary
> table is 4.5 Gigs of data.
>
> We are doing this on a dual Opteron 244 system with 1 TeraByte of HDD
> space. The drives are 250 Gig Western Digital. The Raid Controller is
> LSI Logic MegaRaid 150-6.
>
> We are getting an error after about 4-5 days worth of data being put
> into the system.
>
> *******************************************************
> ERROR:  invalid page header in block 59305 of relation
> "item_info_2004_04_leaf_category_1"
> *******************************************************
>
> Our Base Server Configuration is as follows.
> PostgreSQL Version= 7.4.2
> x86_64-PC-Linux-GNU
> Compiled with GCC 3.3.3
> XFS File System
> Running on Gentoo Linux 3.3.3 Propolice-3.3-7
>
> Any help on how to solve this problem would be extremely appreciated.
>
> Even the potential that Tom Lane might respond to this is worth it.

May I point you to the pg_filedump utility?

  <http://sources.redhat.com/rhdb/utilities.html>

It can give you a fair idea of just where the system is blowing up.
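
To know which file to dump, it helps to work out where block 59305 sits on disk. A rough sketch, assuming the default 8 kB block size and 1 GB segment files (neither is confirmed for this installation), with a made-up relfilenode of 16384 purely for illustration:

```python
# Sketch only: BLOCK_SIZE and SEGMENT_BYTES are assumed PostgreSQL defaults,
# not values confirmed for the server discussed in this thread.
BLOCK_SIZE = 8192          # assumed block size in bytes
SEGMENT_BYTES = 1 << 30    # assumed size of one relation segment file (1 GB)


def locate_block(block_no, block_size=BLOCK_SIZE, segment_bytes=SEGMENT_BYTES):
    """Return (segment index, byte offset inside that segment) for a block."""
    blocks_per_segment = segment_bytes // block_size
    segment, block_in_segment = divmod(block_no, blocks_per_segment)
    return segment, block_in_segment * block_size


def segment_file_name(relfilenode, segment):
    """Segment 0 is named after the relfilenode; later ones get a .N suffix."""
    return str(relfilenode) if segment == 0 else f"{relfilenode}.{segment}"


segment, offset = locate_block(59305)
print(segment_file_name(16384, segment), offset)  # 16384 is a hypothetical relfilenode
```

Under those assumptions the block falls in the first segment file, at byte offset 485826560.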

I experienced what sounds like the same problem with a system that was
fairly similarly appointed with hardware, albeit with a few
conspicuous differences...

1.  PostgreSQL 7.4.1
2.  FreeBSD 4.9
3.  Berkeley FFS with soft updates
4.  Quad-Xeon, 8GB RAM (only using 4GB of it :-()
5.  AMI MegaRaid controller...
6.  Slightly less disk; 12x74GB SCSI drives

[root@hathi scsi]# cat /proc/scsi/megaraid/1
LSI Logic MegaRAID 1.74 254 commands 16 targs 7 chans 7 luns

What I found in looking at the page with the "invalid page header" was
that it was full of ASCII NUL values.
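
A quick way to test for the same symptom without reading a full dump is to pull the block out of the file and see whether every byte is zero. This is only a sketch: the 8 kB block size is an assumed default, the block number is relative to the segment file being read, and it is best run against a copy of the file rather than the live one.

```python
BLOCK_SIZE = 8192  # assumed default block size; adjust if the server was built differently


def read_block(path, block_no, block_size=BLOCK_SIZE):
    """Read one block from a relation segment file (block_no counts from 0)."""
    with open(path, "rb") as f:
        f.seek(block_no * block_size)
        return f.read(block_size)


def is_zero_page(page):
    """True if the page is non-empty and consists entirely of NUL bytes."""
    return len(page) > 0 and not any(page)
```

For example, `is_zero_page(read_block(path, 59305))` would report whether the block named in the error message is NUL-filled, where `path` is whatever segment file holds it.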

We had previously had quite a bit of trouble with a different box with
the same hardware configuration running RHAT 7.3, although when I
replaced a 2.4.18 Linux kernel with 2.6.2, those problems evaporated.

The only thing that we have been able to point to on the box in
question is a hardware problem.  Given that the disks are RAIDed, the
three most likely culprits seem to be:

 1.  Perhaps the controller is "glitched;"
 2.  Perhaps the controller driver is "glitched;"
 3.  Perhaps there is a RAM problem.

Notice that the list of suspects doesn't include any that actually
relate to database software.

Your best bet is to look for hardware problems.
--
(reverse (concatenate 'string "gro.gultn" "@" "enworbbc"))
http://cbbrowne.com/info/linuxxian.html
Never take life seriously. Nobody gets out alive anyway.

Re: PostgreSQL Invalid Page Header in Block XXXXX

From: mailadministrator@canada.com (Anthony)
Chris Browne <cbbrowne@acm.org> wrote in message news:<60n05gdffn.fsf@dev6.int.libertyrms.info>...
> mailadministrator@canada.com (Anthony) writes:
> > Our Database is having errors. We are currently using PostgreSQL to
> > store 2.5 Million records per day. The average addition to our primary
> > table is 4.5 Gigs of data.
> >
> > We are doing this on a dual Opteron 244 system with 1 TeraByte of HDD
> > space. The drives are 250 Gig Western Digital. The Raid Controller is
> > LSI Logic MegaRaid 150-6.
> >
> > We are getting an error after about 4-5 days worth of data being put
> > into the system.
> >
> > *******************************************************
> > ERROR:  invalid page header in block 59305 of relation
> > "item_info_2004_04_leaf_category_1"
> > *******************************************************
> >
> > Our Base Server Configuration is as follows.
> > PostgreSQL Version= 7.4.2
> > x86_64-PC-Linux-GNU
> > Compiled with GCC 3.3.3
> > XFS File System
> > Running on Gentoo Linux 3.3.3 Propolice-3.3-7
> >
> > Any help on how to solve this problem would be extremely appreciated.
> >
> > Even the potential that Tom Lane might respond to this is worth it.
>
> May I point you to the pg_filedump utility?
>
>   <http://sources.redhat.com/rhdb/utilities.html>
>
> It can give you a fair idea of just where the system is blowing up.
>
> I experienced what sounds like the same problem with a system that was
> fairly similarly appointed with hardware, albeit with a few
> conspicuous differences...
>
> 1.  PostgreSQL 7.4.1
> 2.  FreeBSD 4.9
> 3.  Berkeley FFS with soft updates
> 4.  Quad-Xeon, 8GB RAM (only using 4GB of it :-()
> 5.  AMI MegaRaid controller...
> 6.  Slightly less disk; 12x74GB SCSI drives
>
> [root@hathi scsi]# cat /proc/scsi/megaraid/1
> LSI Logic MegaRAID 1.74 254 commands 16 targs 7 chans 7 luns
>
> What I found in looking at the page with the "invalid page header" was
> that it was full of ASCII NUL values.
>
> We had previously had quite a bit of trouble with a different box with
> the same hardware configuration running RHAT 7.3, although when I
> replaced a 2.4.18 Linux kernel with 2.6.2, those problems evaporated.
>
> The only thing that we have been able to point to on the box in
> question is a hardware problem.  In view of the disk being RAIDed, the
> causes seem to fall to three things being most likely sorts of
> culprits:
>
>  1.  Perhaps the controller is "glitched;"
>  2.  Perhaps the controller driver is "glitched;"
>  3.  Perhaps there is a RAM problem.
>
> Notice that the list of suspects doesn't include any that actually
> relate to database software.
>
> Your best bet is to look for hardware problems.

We ran a full RAM test for 15 hours... it came up with NO problems. We
are running a more current version of the kernel than you list above,
so shouldn't the driver and/or controller issues that you think were
fixed in the 2.6.2 kernel be rolled up in 2.6.3?

1) We are going to set up multiple partitions for different file
systems... and try the PostgreSQL database on each of those systems
and see if the errors persist.
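
Independently of PostgreSQL, a write-and-read-back check on each partition might help narrow things down. The sketch below is an assumption about how such a test could look, not an established tool: it writes 8 kB blocks of a reproducible pattern, syncs, then reads them back and lists any block that differs or has turned into NULs. Note that a read-back shortly after writing may be served from the OS cache, so the test file would need to be larger than RAM (or the caches dropped) to exercise the controller.

```python
import hashlib
import os

BLOCK_SIZE = 8192  # assumed to match the database block size


def make_block(index, block_size=BLOCK_SIZE):
    """Build a reproducible, non-zero pattern for the block at this index."""
    seed = index.to_bytes(8, "big")
    out = bytearray()
    counter = 0
    while len(out) < block_size:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:block_size])


def write_pattern(path, n_blocks, block_size=BLOCK_SIZE):
    """Write n_blocks pattern blocks to path and force them out with fsync."""
    with open(path, "wb") as f:
        for i in range(n_blocks):
            f.write(make_block(i, block_size))
        f.flush()
        os.fsync(f.fileno())


def find_bad_blocks(path, n_blocks, block_size=BLOCK_SIZE):
    """Return (index, kind) for each block that no longer matches its pattern."""
    bad = []
    with open(path, "rb") as f:
        for i in range(n_blocks):
            data = f.read(block_size)
            if data != make_block(i, block_size):
                zeroed = len(data) == block_size and not any(data)
                bad.append((i, "zeroed" if zeroed else "mismatch"))
    return bad
```

Running `write_pattern` followed by `find_bad_blocks` on a file in each file system, with the same block count, should return an empty list everywhere if the storage path is sound.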

Thank you for your response, and for any thoughts on this plan of
attack.

Sincerely,

Anthony