Re: Database corruption? - Mailing list pgsql-general

From Tatsuo Ishii
Subject Re: Database corruption?
Date
Msg-id 20011031095804H.t-ishii@sra.co.jp
In response to Re: Database corruption?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Database corruption?  (Alvaro Herrera <alvherre@atentus.com>)
Re: Database corruption?  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Database corruption?  ("Dr. Evil" <drevil@sidereal.kz>)
List pgsql-general
> It may be unthinkable hubris to say this, but ... I am starting to
> notice that a larger and larger fraction of serious trouble reports
> ultimately trace to hardware failures, not software bugs.  Seems we've
> done a good job getting data-corruption bugs out of Postgres.
>
> Perhaps we should reconsider the notion of keeping CRC checksums on
> data pages.  Not sure what we could do to defend against bad RAM,
> however.

Good idea.

I have been troubled by a really strange problem. Populating a database
with a huge amount of data (~7GB) causes random failures: for example, a
mysterious unique constraint violation, count(*) returning an incorrect
number, or pg_temp* suddenly disappearing (the table in question is a
temporary table). These failures are really hard to reproduce and happen
on 7.0 through current, virtually every PostgreSQL release. Even on an
identical system, the problems sometimes go away after a re-initdb...

I now suspect that some hardware failure might be the source of the
trouble. The problem is that I have seen no sign of one so far in the
standard system logs, such as syslog or messages.

It would be really nice if PostgreSQL could be protected from such
hardware failures using CRC or whatever...
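
To make the idea concrete, here is a minimal sketch of a per-page CRC
check, written in Python purely for illustration. It is not PostgreSQL
code: the page size, the layout (a 4-byte CRC stored in front of the
payload), and the names seal_page and page_is_intact are all assumptions
made up for this example. As noted in the quoted text, a check like this
could catch a page damaged on disk or in transit, but not bad RAM that
corrupts the data before the CRC is computed.

```python
import struct
import zlib

# Assumptions for this sketch only (not PostgreSQL's actual page layout):
PAGE_SIZE = 8192                   # hypothetical fixed page size in bytes
CRC_FIELD = struct.Struct("<I")    # 4-byte little-endian CRC at page start


def seal_page(payload: bytes) -> bytes:
    """Build a full page: CRC-32 of the payload, followed by the payload."""
    if len(payload) != PAGE_SIZE - CRC_FIELD.size:
        raise ValueError("payload must fill the page exactly")
    return CRC_FIELD.pack(zlib.crc32(payload)) + payload


def page_is_intact(page: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored one."""
    if len(page) != PAGE_SIZE:
        return False
    (stored_crc,) = CRC_FIELD.unpack_from(page)
    return stored_crc == zlib.crc32(page[CRC_FIELD.size:])
```

A writer would call seal_page just before the page goes to disk, and a
reader would call page_is_intact right after reading it back, reporting
an error instead of silently returning wrong rows when the check fails.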
--
Tatsuo Ishii
