On Wed, 16 Jan 2008 10:19:12 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Steve Holdoway <steve.holdoway@firetrust.com> writes:
> > You can be absolutely certain that the tar backup of a file that's
> > changed is a complete waste of time. Because it changed while you
> > were copying it.
>
> That is, no doubt, the reasoning that prompted the gnu tar people to
> make it do what it does, but it has zero to do with reality for
> Postgres' usage in PITR base backups. What we care about is consistency
> on the page level: as long as each page of the backed-up file correctly
> represents *some* state of that page while the backup was in progress,
> everything is okay, because replay of the WAL log will correct any pages
> that are out-of-date, missing, or shouldn't be there at all. And
> Postgres always writes whole pages. So as long as write() and read()
> are atomic --- which is the case on all Unixen I know of --- everything
> works.
>
> (Thinks for a bit...) Actually I guess there's one extra assumption in
> there, which is that tar must issue its reads in multiples of our page
> size. But that doesn't seem like much of a stretch.
>
> regards, tom lane

That's OK for the WAL logs, but what about the initial archive? The recovery's got to start somewhere...
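
To make the read-size assumption from the quoted message concrete, here is a minimal sketch of which pages would straddle two reads for a given read size. It assumes the default 8192-byte PostgreSQL page size and a purely sequential reader; the function name torn_pages and the read sizes are illustrative only and say nothing about what tar actually does.

```python
PAGE_SIZE = 8192  # assumed: PostgreSQL's default block size


def torn_pages(read_size, file_size, page_size=PAGE_SIZE):
    """Return the indexes of pages that straddle two sequential reads.

    A page that is fetched by a single read() reflects one state of that
    page; a page split across two reads could mix two states.
    """
    torn = []
    for page in range(file_size // page_size):
        first_byte = page * page_size
        last_byte = first_byte + page_size - 1
        # Each byte belongs to read number (offset // read_size).
        if first_byte // read_size != last_byte // read_size:
            torn.append(page)
    return torn


if __name__ == "__main__":
    four_pages = 4 * PAGE_SIZE
    print(torn_pages(16384, four_pages))  # multiple of the page size
    print(torn_pages(10240, four_pages))  # not a multiple of the page size
```

With a read size that is a multiple of the page size no page is split; with any other size, some pages are fetched in two pieces, which is the case the extra assumption rules out.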