On Jan 10, 2012, at 3:07 AM, Simon Riggs wrote:
> I think we could add an option to check the checksum immediately after
> we pin a block for the first time but it would be very expensive and
> sounds like we're re-inventing hardware or OS features again. Work on
> 50% performance drain, as an estimate.
>
> That is a level of protection no other DBMS offers, so that is either
> an advantage or a warning. Jim, if you want this, please do the
> research and work out what the probability of losing shared buffer
> data in your ECC RAM really is so we are doing it for quantifiable
> reasons (via old Google memory academic paper) and to verify that the
> cost/benefit means you would actually use it if we built it. Research
> into requirements is at least as important and time consuming as
> research on possible designs.
Maybe I'm just dense, but it wasn't clear to me how you could use the information in the Google paper to extrapolate
data corruption probability.
I can say this: we have seen corruption from bad memory, and our Postgres buffer pool (8G) is FAR smaller than
available memory on all of our servers (192G or 512G). So at least in our case, CRCs that protect the filesystem cache
would protect the vast majority of our memory (roughly 96% or 98.5%).
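
For anyone who wants to check the arithmetic, here is a minimal sketch. The helper name is hypothetical, and it assumes that all RAM outside the buffer pool counts as filesystem cache, which overstates things a little since the kernel and backend processes need memory too.

```python
# Hypothetical helper (not part of Postgres): share of total RAM that lies
# outside the shared buffer pool. Assumes everything outside the pool is
# available as filesystem cache, which is an upper bound.
def fraction_outside_buffer_pool(shared_buffers_gb: float, total_ram_gb: float) -> float:
    return 1.0 - shared_buffers_gb / total_ram_gb


# The 8G buffer pool and the two server sizes mentioned above.
for total_gb in (192, 512):
    share = fraction_outside_buffer_pool(8, total_gb)
    print(f"{total_gb}G server: {share:.1%} of RAM outside shared buffers")
```

That works out to about 95.8% and 98.4%, in line with the rounded figures above.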
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net