Re: Enabling Checksums - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: Enabling Checksums
Msg-id 5140FCB6.5020709@nasby.net
In response to Re: Enabling Checksums  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On 3/7/13 9:31 PM, Bruce Momjian wrote:
>     1 storage
>     2 storage controller
>     3 file system
>     4 RAM
>     5 CPU

I would add 2.5 in there: storage interconnect. iSCSI, FC, what-have-you. Obviously not everyone has that.

> My guess is that storage checksums only cover layer 1, while our patch
> covers layers 1-3, and probably not 4-5 because we only compute the
> checksum on write.

Actually, it depends. In our case, we run 512GB servers with 8GB of shared buffers (previous testing has shown that
anything much bigger than 8GB hurts performance).

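Most of that RAM ends up as OS page cache, and a page's checksum is verified whenever it is read from there into shared buffers. So in our case, PG checksums protect a very significant portion of #4.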
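
To put a rough number on "very significant", here is a minimal sketch of the arithmetic. It assumes that all RAM outside shared buffers is available to the OS page cache, which is an upper bound for illustration and not a measurement from these servers; the variable names are made up for the example.

```python
# Rough upper bound on the share of RAM that sits outside shared_buffers.
# Assumption (not a measurement): every byte of RAM outside shared_buffers
# is available to the OS page cache, where cached pages carry a checksum
# that gets verified when the page is read into shared buffers.
total_ram_gb = 512        # server RAM, from the message above
shared_buffers_gb = 8     # shared_buffers setting, from the message above

outside_fraction = (total_ram_gb - shared_buffers_gb) / total_ram_gb
print(f"{outside_fraction:.1%} of RAM is outside shared buffers")
```

Under that assumption, only about 1.6% of RAM (the shared buffers themselves) holds pages that are not re-verified until they are written out and read back.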

> If that is correct, the open question is what percentage of corruption
> happens in layers 1-3?

The last bout of corruption we had was entirely coincident with memory failures. IIRC we had 3-4 corruption events on
more than one server. Everything was running standard ECC (sadly, not 4-bit ECC).


