On Mon, 2012-11-19 at 11:48 -0500, Robert Haas wrote:
> I agree that the hazards are not equivalent, but I'm not sure I agree
> that an external utility will never see a torn page while the system
> is on-line. We have a bunch of code that essentially forces
> full_page_writes=on during a base backup even if it's normally off. I
> think that's necessary precisely because neither the 8kB write() nor
> the unknown-sized-read used by the external copy program are
> guaranteed to be atomic.
This seems like a standards question that we should be able to answer
definitively:
Is it possible for a reader to see a partial write if both use the same
block size?
Maybe the reason we need full page writes during base backup is because
we don't know the block size of the reader, but if we did know that it
was the same, it would be fine?
If that is not true, then I'm concerned about replicating corruption, or
backing up corrupt blocks over good ones. How do we prevent that? It
seems like a pretty major hole if we can't, because it means the only
safe replication is streaming replication; a base-backup is essentially
unsafe. And it means that even an online background checking utility
would be quite hard to do properly.
Regards,Jeff Davis