Thread: Strange database corruption with PostgreSQL 7.4.x on Debian Sarge
Hello!

We're running the latest release of PostgreSQL, 7.4.13, on a Debian Sarge machine. We compiled Postgres ourselves.

We have a fairly big database on this machine, about 6.4 GB. One table contains about 55 million rows, and we insert about 500000 rows into it each day. Our problem is that the database becomes corrupted for no obvious reason. The messages we get are:

invalid page header in block 437702 of relation "xxxx"

We have already tried some other versions of 7.4. On another machine running Debian Woody with PostgreSQL 7.4.10 we don't have any problems. The kernels are 2.4.33 on the Sarge machine and 2.4.28 on the Woody machine. Both are SMP kernels.

Does anyone have any hints about what's going wrong here?

Best regards,
Matthias
On Wed, 2006-09-20 at 14:34 +0200, Matthias.Pitzl@izb.de wrote:
> Hello!
>
> We're running the latest release of PostgreSQL 7.4.13 on a Debian Sarge
> machine. Postgres has been compiled by ourselves.
> We have a pretty big database running on this machine, it has about 6.4 GB
> approximately. One table contains about 55 million rows.
> Into this table we insert about 500000 rows each day. Our problem is that
> without any obvious reason the database gets corrupt. The messages we get
> are:
> invalid page header in block 437702 of relation "xxxx"
> We already have tried out some other versions of 7.4. On another machine
> running Debian Woody with PostgreSQL 7.4.10 we don't have any problems.
> Kernels are 2.4.33 on the Sarge machine, 2.4.28 on the Woody machine. Both
> are SMP kernels.
> Does anyone of you perhaps have some hints what's going wrong here?

The most likely causes in these cases tend to be bad memory, a bad hard drive, a bad CPU, a bad RAID / IDE / SCSI controller, or loss of power while writing to IDE drives or to RAID controllers that have a cache but no battery backup.

In other words, check your hardware.
Matthias.Pitzl@izb.de writes:
> invalid page header in block 437702 of relation "xxxx"

I concur with Scott that this sounds suspiciously like a hardware problem ... but have you tried dumping out the bad pages with pg_filedump or even just od? The pattern of damage would help to confirm or disprove the theory.

You can find pg_filedump source code at http://sources.redhat.com/rhdb/

regards, tom lane
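For anyone following the "even just od" suggestion, here is a rough sketch of how the reported block number could be turned into a file and byte offset for a raw dump. It assumes the stock build settings, an 8192-byte block size and 1 GB segment files named `<file>`, `<file>.1`, `<file>.2` and so on; a self-compiled server may have been built with different values, so check those first. The function names and the relation file path are hypothetical, and this is not a substitute for pg_filedump, which also decodes the page header.

```python
# Sketch only: locate and dump one page of a relation file.
# ASSUMPTIONS (not stated in the thread): default 8192-byte blocks and
# 1 GB segment files. A custom build may differ.

BLOCK_SIZE = 8192        # assumed default block size
SEGMENT_BYTES = 1 << 30  # assumed default segment file size (1 GB)


def locate_block(block, block_size=BLOCK_SIZE, segment_bytes=SEGMENT_BYTES):
    """Return (segment_number, byte_offset_within_segment) for a block."""
    blocks_per_segment = segment_bytes // block_size
    segment, block_in_segment = divmod(block, blocks_per_segment)
    return segment, block_in_segment * block_size


def segment_path(relation_file, segment):
    """Segment 0 is the bare file name; later segments get a .N suffix."""
    return relation_file if segment == 0 else f"{relation_file}.{segment}"


def read_page(relation_file, block, block_size=BLOCK_SIZE,
              segment_bytes=SEGMENT_BYTES):
    """Read the raw bytes of one block from the matching segment file."""
    segment, offset = locate_block(block, block_size, segment_bytes)
    with open(segment_path(relation_file, segment), "rb") as f:
        f.seek(offset)
        return f.read(block_size)


def hexdump(data, width=16):
    """Format bytes as lines of 'offset  hex bytes', od-style."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_bytes = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{start:08x}  {hex_bytes}")
    return lines


if __name__ == "__main__":
    # Block number taken from the error message in the original post.
    segment, offset = locate_block(437702)
    print(segment, offset)  # prints "3 364429312" under the assumed sizes
```

Under those assumptions, block 437702 would sit in the segment file with the `.3` suffix at byte offset 364429312, which is also what one would pass to od as the skip count. Calling `read_page` on the relation's file (with the server stopped, or on a copy) and printing `hexdump` of the result gives a view of the damage pattern, for example a page of all zeros versus a page of unrelated data.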