Re: Enabling Checksums - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Enabling Checksums
Date
Msg-id 1353345765.10198.116.camel@jdavis-laptop
In response to Re: Enabling Checksums  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Enabling Checksums  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Mon, 2012-11-19 at 11:48 -0500, Robert Haas wrote:
> I agree that the hazards are not equivalent, but I'm not sure I agree
> that an external utility will never see a torn page while the system
> is on-line.  We have a bunch of code that essentially forces
> full_page_writes=on during a base backup even if it's normally off.  I
> think that's necessary precisely because neither the 8kB write() nor
> the unknown-sized-read used by the external copy program are
> guaranteed to be atomic.

This seems like a standards question that we should be able to answer
definitively:

Is it possible for a reader to see a partial write if both use the same
block size?

Maybe the reason we need full page writes during a base backup is that
we don't know the block size of the reader. If we did know that it was
the same as the writer's, would it be fine?
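
One way to probe the question empirically is sketched below. It is only
an experiment, not an answer to the standards question: one thread keeps
overwriting a single 8kB block with alternating fill patterns while the
main thread re-reads the same block at the same size and counts reads
that mix the two patterns. The names BLOCK_SIZE, is_torn and
count_torn_reads are made up for this illustration, the 8kB size is an
assumption matching the default page size, and os.pread/os.pwrite limit
it to Unix-like systems. A count of zero on one kernel and filesystem
says nothing about others.

```python
import os
import tempfile
import threading

BLOCK_SIZE = 8192  # assumption: reader and writer both use 8kB blocks


def is_torn(block: bytes) -> bool:
    """Return True if the block mixes bytes from two different writes."""
    return len(set(block)) > 1


def count_torn_reads(path: str, iterations: int = 2000) -> int:
    """Overwrite one block in a loop while re-reading it; count mixed reads."""
    patterns = (b"A" * BLOCK_SIZE, b"B" * BLOCK_SIZE)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.pwrite(fd, patterns[0], 0)  # start from a known-good block
        stop = threading.Event()

        def writer() -> None:
            # Alternate between the two fill patterns at offset 0.
            i = 0
            while not stop.is_set():
                os.pwrite(fd, patterns[i % 2], 0)
                i += 1

        thread = threading.Thread(target=writer)
        thread.start()
        torn = 0
        try:
            for _ in range(iterations):
                block = os.pread(fd, BLOCK_SIZE, 0)
                # A short read or a mixed-pattern block counts as torn.
                if len(block) != BLOCK_SIZE or is_torn(block):
                    torn += 1
        finally:
            stop.set()
            thread.join()
        return torn
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "block.dat")
        print("torn reads:", count_torn_reads(path))
```

Changing the read size in the loop to something other than BLOCK_SIZE
would turn this into a probe of the "unknown-sized read" case from the
quoted text.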

If that is not true, then I'm concerned about replicating corruption, or
backing up corrupt blocks over good ones. How do we prevent that? It
seems like a pretty major hole if we can't, because it means the only
safe replication is streaming replication; a base backup is essentially
unsafe. It also means that even an online background checking utility
would be quite hard to do properly.

Regards,
	Jeff Davis



