Re: Offline enabling/disabling of data checksums - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: Offline enabling/disabling of data checksums
Msg-id alpine.DEB.2.21.1812281542250.6632@lancre
In response to Re: Offline enabling/disabling of data checksums  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
>>> [...]
>>
>> I'm not sure data checksums are particularly great evidence. For example
>> with the recent fsync issues, we might have ended with partial writes
>> (and thus invalid checksums). The OS might have even told us about the
>> failure, but we've gracefully ignored it. So I'm afraid data checksums
>> are not a particularly great proof it's not our fault.
>
> They are great evidence that your data is corrupt. You *want* to know
> that your data is corrupt. Even if our best recommendation is "go restore
> your backups", you still want to know. Otherwise you are sitting around on
> data that's corrupt and you don't know about it.
>
> There are certainly many things we can do to improve the experience. But
> not telling people their data is corrupt when it is, isn't one of them.

Yep, anyone should want to know if their database is corrupt, rather than
ignoring the fact.

One reason not to enable checksums could be if the implementation is not
trusted, i.e. if false positives can occur (a corrupt page is reported while
the data are okay and there was only an issue with computing or storing the
checksum).
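
To make the false-positive case concrete, here is a toy sketch. It uses
zlib.crc32 as a stand-in checksum, which is an assumption made for
illustration only: PostgreSQL's page checksum is a different algorithm, and
the names verify, page and stored are hypothetical.

```python
import zlib


def verify(page: bytes, stored: int) -> bool:
    """Return True if the page matches its stored checksum (toy CRC32 check)."""
    return zlib.crc32(page) == stored


# A hypothetical page and the checksum computed when it was written.
page = b"tuple data " * 100
stored = zlib.crc32(page)

# Real corruption: one bit of the page data is flipped on disk.
damaged_page = page[:-1] + bytes([page[-1] ^ 0x01])

# False positive: the data are intact, only the stored checksum is damaged.
damaged_stored = stored ^ 0x01

print("intact page, intact checksum :", verify(page, stored))
print("damaged page, intact checksum:", verify(damaged_page, stored))
print("intact page, damaged checksum:", verify(page, damaged_stored))
```

The last two cases look the same to the verifier: it can only say that
data and checksum disagree, not which of the two is wrong.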

There is also the performance impact. I did some quick-and-dirty pgbench
simple-update, single-thread performance tests to compare runs with and
without checksums. Enabling checksums on these tests seems to induce a 1.4%
performance penalty, although I'm only moderately confident about it given
the standard deviation. At least it is an indication, and it seems to me that
it is consistent with other figures previously reported on the list.
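
For anyone wanting to repeat this kind of comparison, a minimal sketch of
the arithmetic follows. It assumes you have collected the tps figure from
several pgbench runs for each configuration. The function name
checksum_penalty is hypothetical, and the uncertainty is only a first-order
error-propagation estimate, not a proper statistical test.

```python
from math import sqrt
from statistics import mean, stdev


def checksum_penalty(tps_without, tps_with):
    """Return (relative penalty, rough standard error) from two lists of tps runs.

    Each list needs at least two runs so that a standard deviation exists.
    """
    m0, m1 = mean(tps_without), mean(tps_with)
    # Standard error of each mean.
    se0 = stdev(tps_without) / sqrt(len(tps_without))
    se1 = stdev(tps_with) / sqrt(len(tps_with))
    # Penalty is the relative tps loss when checksums are enabled.
    penalty = (m0 - m1) / m0
    # First-order error propagation for the ratio m1 / m0.
    se = (m1 / m0) * sqrt((se0 / m0) ** 2 + (se1 / m1) ** 2)
    return penalty, se


if __name__ == "__main__":
    # Placeholder numbers for illustration only, NOT measurements.
    without = [1000.0, 1010.0, 990.0]
    with_checksums = [986.0, 996.0, 976.0]
    penalty, se = checksum_penalty(without, with_checksums)
    print(f"penalty: {penalty:.2%} +/- {se:.2%}")
```

If the estimated error is of the same order as the penalty itself, the
result is only an indication, and more or longer runs are needed.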

-- 
Fabien.