Re: CRCs (was: beta testing version) - Mailing list pgsql-hackers

From Horst Herb
Subject Re: CRCs (was: beta testing version)
Date
Msg-id 002901c06094$5e5307e0$fcee2bcb@midgard
In response to RE: CRCs (was: beta testing version)  ("Mikheev, Vadim" <vmikheev@SECTORBASE.COM>)
List pgsql-hackers
> > (I'd also like to see CRCs on all the table blocks as well; is there
> > a place to put them?)
>
> Do we need it? "physical log" feature suggested by Andreas will protect
> us from non atomic data block writes.

CRCs are necessary because of glitches, hardware failures, operating system
bugs, viruses, etc.: a lot of factors that can alter data stored on the
hard disk independently of PostgreSQL. I learned this lesson the hard way when
I wrote a database application for a hospital, where data integrity is
vital.

Logging CRCs with each record gave us proof that data had been corrupted by
"external" factors (we never found out what they were). It was only a few
bytes in a database with several hundred thousand records, but still
intolerable. Medicine is heading in a direction where decisions will be
backed up by computerized algorithms, which in turn depend on exact data. A
one-bit glitch in a terabyte database can make the difference between life
and death. These glitches will happen, no doubt. That doesn't matter, as long
as you have some means of proving your data integrity and some mechanism for
alerting you when shit has happened.
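
To make the idea of per-record CRC logging concrete, here is a minimal
sketch in Python. It is not the code from the hospital application; the
function names (record_crc, log_crcs, find_corrupted), the in-memory
"tables" and the sample records are all made up for illustration, and
CRC-32 from zlib is an assumed choice of checksum.

```python
import zlib


def record_crc(fields):
    """Return the CRC-32 of one record, given its fields as strings."""
    # Length-prefix every field so that ("ab", "c") and ("a", "bc")
    # do not serialize to the same byte string.
    payload = b"".join(
        len(data).to_bytes(4, "big") + data
        for data in (field.encode("utf-8") for field in fields)
    )
    return zlib.crc32(payload) & 0xFFFFFFFF


def log_crcs(records):
    """Build the integrity log: record id -> CRC at write time."""
    return {rid: record_crc(fields) for rid, fields in records.items()}


def find_corrupted(records, crc_log):
    """Return the ids of records whose current CRC differs from the log."""
    return sorted(
        rid
        for rid, fields in records.items()
        if record_crc(fields) != crc_log.get(rid)
    )


# Hypothetical sample data, standing in for rows of a real table.
records = {
    1: ("Smith", "penicillin", "500 mg"),
    2: ("Jones", "insulin", "10 IU"),
}
crc_log = log_crcs(records)

# Simulate an "external" one-bit glitch: '1' (0x31) becomes '9' (0x39).
records[2] = ("Jones", "insulin", "90 IU")

corrupted = find_corrupted(records, crc_log)
print("corrupted record ids:", corrupted)
```

A CRC like this only tells you that a record changed behind your back; it
does not tell you what the original value was, so the alert still has to be
followed by a restore from a known-good copy. The CRC log itself should be
kept somewhere the same glitch is unlikely to reach.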

At present I am coordinating another medical project. We have chosen
PostgreSQL as our backend, and the main problem we have is creating
efficient CRC triggers for our own homegrown integrity logging. (I wish
PostgreSQL supported generic triggers that are valid system-wide, or at
least valid for all tables inheriting from the same table.)

Horst



