database corruption - Mailing list pgsql-admin

From Ian Westmacott
Subject database corruption
Date
Msg-id 1113598532.8214.175.camel@spectre.intellivid.com
Responses Re: database corruption  (Chris Travers <chris@travelamericas.com>)
List pgsql-admin
For several weeks now we have been experiencing fairly
severe database corruption upon clean reboot.  It is very
repeatable, and the corruption is of the following forms:

ERROR:  could not access status of transaction foo
DETAIL:  could not open file "bar": No such file or directory

ERROR:  invalid page header in block foo of relation "bar"

ERROR:  uninitialized page in block foo of relation "bar"
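
To track how often each of these three forms shows up after a reboot, the server log could be scanned with something like the following. This is only a minimal sketch: the category names and the functions classify and count_errors are hypothetical, invented for illustration, and the patterns assume the messages appear in the log exactly as quoted above.

```python
import re
from collections import Counter

# Patterns mirror the three message forms quoted above.
# The category names are labels invented for this sketch.
PATTERNS = {
    "missing_transaction_status": re.compile(
        r"ERROR:\s+could not access status of transaction"
    ),
    "invalid_page_header": re.compile(
        r"ERROR:\s+invalid page header in block \S+ of relation"
    ),
    "uninitialized_page": re.compile(
        r"ERROR:\s+uninitialized page in block \S+ of relation"
    ),
}


def classify(line):
    """Return the category name for a log line, or None if it matches none."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def count_errors(lines):
    """Count how many lines fall into each corruption category."""
    counts = Counter()
    for line in lines:
        kind = classify(line)
        if kind is not None:
            counts[kind] += 1
    return counts
```

Any iterable of lines can be passed to count_errors, including an open log file. Comparing the counts from one reboot to the next may show whether one form of corruption dominates.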


At first, we believed this was related to XFS, and have
been pursuing investigations along those lines.  However,
we have now experienced the exact same problem with JFS.

Here are some details:

- Postgres 7.4.2
- 2.6.6 kernel.org kernel
- dedicated database partition
- repeatable with XFS and JFS (have not seen on ext3)
- repeatable with and without Linux software RAID 0
- repeatable with IDE and SATA
- repeatable with and without fsync, and with fdatasync
- repeatable on multiple systems


I have two questions:

- Is there any known reason why this might be occurring?  (We
  must be doing something wrong to see severe errors at this
  rate.)

- if I don't care about losing data, and am not interested
  in trying to recover anything, how can I arrange for
  Postgres to proceed normally?  I know about
  zero_damaged_pages, but this doesn't help with missing
  transaction files and such.  Is there any way to get
  Postgres to chuck anything bad and proceed?

Thanks,

    --Ian


