Re: Enabling Checksums - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | Re: Enabling Checksums |
Date | |
Msg-id | 1362447567.23497.438.camel@sussancws0025 |
In response to | Re: Enabling Checksums (Heikki Linnakangas <hlinnakangas@vmware.com>) |
List | pgsql-hackers |
On Mon, 2013-03-04 at 23:11 +0200, Heikki Linnakangas wrote:
> Of course not. But if we can get away without checksums in Postgres,
> that's better, because then we don't need to maintain that feature in
> Postgres. If the patch gets committed, it's not mission accomplished.
> There will be discussion and need for further development on things like
> what to do if you get a checksum failure, patches to extend the
> checksums to cover things like the clog and other non-data files and so
> forth. And it's an extra complication that will need to be taken into
> account when developing other new features; in particular, hint bit
> updates need to write a WAL record. Even if you have all the current
> hint bits covered, it's an extra hurdle for future patches that might
> want to have hint bits in e.g new index access methods.

The example you chose of adding a hint bit is a little overstated -- as
far as I can tell, setting a hint bit follows pretty much the same
pattern as before, except that I renamed the function to
MarkBufferDirtyHint().

But I agree in general. If complexity can be removed or avoided, that is
a very good thing. But right now, we have no answer to a real problem
that other databases do have an answer for. To me, the benefit is worth
the cost.

We aren't going down an irreversible path by adding checksums. If every
platform has a good checksumming filesystem and there is no demand for
the postgres code any more, we can deprecate it and remove it. But at
least users would have something between now and then.

> The PostgreSQL project would not be depending on it, any more than the
> project depends on filesystem snapshots for backup purposes, or the OS
> memory manager for caching.

I don't understand your analogies at all. We have WAL-protected base
backups so that users can get a consistent snapshot without filesystem
snapshots. To follow the analogy, we want postgres checksums so that the
user can be protected without filesystem checksums.

I would agree with you if we could point users somewhere and actually
recommend something and say "what you're doing now is wrong, do X
instead" (though if there is only one such X, we are dependent on it).
But even if we fast forward to three years from now: if someone shows up
saying that XFS gives him the best performance, but wants checksums,
will we really be able to say "you are wrong to be using XFS; use
Btrfs"?

One of the things I like about postgres is that we don't push a lot of
hard trade-offs on users. Several people (including you) put in effort
recently to support unlogged gist indexes. Are there some huge number of
users there that can't live without unlogged gist indexes? Probably not.
But that is one less thing that potential users have to trade away, and
one less thing to be confused or frustrated about.

I want to get to the point where checksums are the default, and only
advanced users would disable them. If that point comes in the form of
checksumming filesystems that are fast enough and enabled by default on
most of the platforms we support, that's fine with me. But I'm not very
sure that it will happen that way ever, and certainly not soon.

> > If btrfs with checksums is 10% slower than ext4 with postgres checksums,
> > does that mean we should commit the postgres checksums?
>
> In my opinion, a 10% gain would not be worth it, and we should not
> commit in that case.
>
> > On the other side of the coin, if btrfs with checksums is exactly the
> > same speed as ext4 with no postgres checksums (i.e. checksums are free
> > if we use btrfs), does that mean postgres checksums should be rejected?
>
> Yes, I think so. I'm sure at least some others will disagree; Greg
> already made it quite clear that he doesn't care how the performance of
> this compares with btrfs.

If all paths lead to rejection, what are these tests supposed to
accomplish, exactly?

Regards,
	Jeff Davis