Re: Block-level CRC checks - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Block-level CRC checks
Date
Msg-id 407d949e0912070954h52525034ka3366d2dd2250e52@mail.gmail.com
In response to Re: Block-level CRC checks  (Chuck McDevitt <cmcdevitt@greenplum.com>)
List pgsql-hackers
On Fri, Dec 4, 2009 at 10:47 PM, Chuck McDevitt <cmcdevitt@greenplum.com> wrote:
> A curiosity question regarding torn pages: How does this work on file systems that don't write in-place, but instead always do copy-on-write?
>
> My example would be Sun's ZFS file system (in Solaris & BSD). Because of its "snapshot & rollback" functionality, it never writes a page in-place, but instead always copies it to another place on disk. How does this affect the corruption caused by a torn write?
>
> Can we end up with horrible corruption on this type of filesystem where we wouldn't on normal file systems, where we are writing to a previously zeroed area on disk?
>
> Sorry if this is a stupid question... Hopefully somebody can reassure me that this isn't an issue.

It's not a stupid question. We're not 100% sure, but we believe ZFS
doesn't need full page writes because it's immune to torn pages.

I think the idea of ZFS is that the new, partially written page isn't
visible because it's not linked into the tree until it's been
completely written. To me it appears this would depend on the drive
system ordering writes very strictly, which seems hard to be sure is
happening. Perhaps this is tied to the tricks they do to avoid
contention on the root. If they do a write barrier before every root
update, that seems like it should be sufficient to me, but I don't
know ZFS at that level of detail.
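
To make the ordering argument concrete, here is a toy sketch in Python. It is
not ZFS's actual implementation; the class `CowStore`, its `root` pointer, and
the `crash_after` parameter are all hypothetical names invented for
illustration. It only models the idea that a new copy is written to a fresh
location first, and the root is switched over only after that write has
completed.

```python
from typing import Optional


class CowStore:
    """Toy copy-on-write store: a live block is never overwritten in place."""

    def __init__(self, initial: bytes) -> None:
        self.blocks = {0: initial}  # block address -> contents
        self.root = 0               # address of the currently visible block
        self.next_addr = 1

    def write(self, data: bytes, crash_after: Optional[int] = None) -> None:
        addr = self.next_addr
        self.next_addr += 1

        # Step 1: write the new copy to a fresh location. If we "crash"
        # here, the block is torn, but nothing points at it yet.
        if crash_after is not None:
            self.blocks[addr] = data[:crash_after]
            raise RuntimeError("simulated crash mid-write")
        self.blocks[addr] = data

        # Step 2: the assumed write barrier sits here. Only once the new
        # block is complete do we switch the root pointer to it.
        self.root = addr

    def read(self) -> bytes:
        return self.blocks[self.root]
```

In this model a crash during step 1 leaves the old page visible and intact,
which is the sense in which torn pages would be harmless. The guarantee
rests entirely on step 2 never being reordered ahead of step 1, which is
the part that is hard to be sure of on real hardware.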

--
greg

