Re: Page Checksums - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Page Checksums
Msg-id CA+U5nM+dJSj16qPDihTxJPk7riJiiP99sAmgab=eMURgB31LBA@mail.gmail.com
In response to Re: Page Checksums  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Page Checksums  (Benedikt Grundmann <bgrundmann@janestreet.com>)
Re: Page Checksums  (Jim Nasby <jim@nasby.net>)
List pgsql-hackers
On Tue, Jan 10, 2012 at 8:04 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 10.01.2012 02:12, Jim Nasby wrote:
>>
>> Filesystem CRCs very likely will not happen to data that's in the cache.
>> For some users, that's a huge amount of data to leave un-protected.
>
>
> You can repeat that argument ad infinitum. Even if the CRC covers all the
> pages in the OS buffer cache, it still doesn't cover the pages in the
> shared_buffers, CPU caches, in-transit from one memory bank to another etc.
> You have to draw the line somewhere, and it seems reasonable to draw it
> where the data moves between long-term storage, ie. disk, and RAM.

We protect each change with a CRC when we write WAL, so doing the same
thing for pages held in memory doesn't sound entirely unreasonable,
especially if your database fits in RAM and we aren't likely to be doing
I/O anytime soon. The long-term storage argument may no longer apply in
a world with very large memory.

The question is, when exactly would we check the checksum? When we
lock the block, when we pin it? We certainly can't do it on every
access to the block since we don't even track where that happens in
the code.

I think we could add an option to check the checksum immediately after
we pin a block for the first time, but it would be very expensive and
sounds like we're re-inventing hardware or OS features again. Assume
something like a 50% performance drain, as a rough estimate.
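
To make the pin-time option concrete, here is a minimal sketch in C of
"verify on first pin". It is not PostgreSQL code: DemoBuffer, demo_load
and demo_pin are hypothetical names invented for illustration, the page
size is simply assumed to be 8192 bytes, and the checksum is a plain
bitwise CRC-32 chosen only because it is easy to show in full. It says
nothing about the real cost; it only shows where the check would sit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192          /* assumed block size for this sketch */

/* Hypothetical buffer descriptor; NOT PostgreSQL's BufferDesc. */
typedef struct {
    uint8_t  data[PAGE_SIZE];
    uint32_t stored_crc;        /* checksum recorded when the page was loaded */
    int      pin_count;
    bool     verified;          /* true once the checksum has been checked */
} DemoBuffer;

/* Bitwise CRC-32 (reflected form, polynomial 0xEDB88320). */
static uint32_t demo_crc32(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Copy a page into the buffer and record its checksum. */
static void demo_load(DemoBuffer *b, const uint8_t *src)
{
    assert(b != NULL && src != NULL);
    memcpy(b->data, src, PAGE_SIZE);
    b->stored_crc = demo_crc32(b->data, PAGE_SIZE);
    b->pin_count = 0;
    b->verified = false;
}

/*
 * Pin the buffer. The checksum is verified only on the first pin after
 * a load; on mismatch the buffer is left unpinned and false is returned.
 */
static bool demo_pin(DemoBuffer *b)
{
    assert(b != NULL);
    if (!b->verified) {
        if (demo_crc32(b->data, PAGE_SIZE) != b->stored_crc)
            return false;       /* page changed since it was loaded */
        b->verified = true;
    }
    b->pin_count++;
    return true;
}
```

Note the limitation this makes visible: once the verified flag is set,
later corruption of the in-memory page goes unnoticed, which is the
"when exactly would we check" problem again.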

That is a level of protection no other DBMS offers, so that is either
an advantage or a warning. Jim, if you want this, please do the
research and work out what the probability of losing shared buffer
data in your ECC RAM really is (e.g. via the old Google academic paper
on memory), so that we are doing it for quantifiable reasons, and
verify that the cost/benefit means you would actually use it if we
built it. Research into requirements is at least as important and
time-consuming as research on possible designs.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

