Re: point in time recovery and moving datafiles online - Mailing list pgsql-hackers

From Tom Lane
Subject Re: point in time recovery and moving datafiles online
Date
Msg-id 14433.1014355676@sss.pgh.pa.us
In response to Re: point in time recovery and moving datafiles online  (Marc Munro <marc@bloodnok.com>)
Responses Re: point in time recovery and moving datafiles online  (Marc Munro <marc@bloodnok.com>)
Re: point in time recovery and moving datafiles online  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List pgsql-hackers
Marc Munro <marc@bloodnok.com> writes:
> On Thu, 2002-02-21 at 19:52, Tom Lane wrote:
>> It seems to me that you can get the desired results without any
>> locking.  Assume that you start archiving the WAL just after a
>> checkpoint record.  Also, start copying data files to your backup
>> medium.  Some not inconsiderable time later, you are done copying
>> data files.  You continue copying off and archiving WAL entries.
>> You cannot say that the copied data files correspond to any particular
>> point in the WAL, or that they form a consistent set of data at all
>> --- but if you were to reload them and replay the WAL into them
>> starting from the checkpoint, then you *would* have a consistent set
>> of files once you reached the point in the WAL corresponding to the
>> end-time of the data file backup.  You could stop there, or continue
>> WAL replay to any later point in time.

> If I understand you correctly this is exactly what I was thinking, based
> on Oracle recovery.  But we must still prevent writes to each data file
> as we back it up, so that it remains internally consistent.

No, you're missing my point.  You don't need intra-file consistency any
more than you need cross-file consistency.  You merely need to be sure
that you have captured all the state of pages that are not updated
anywhere in the series of WAL entries that you have.

I had originally started to compose email suggesting that locking on a
per-disk-page basis (not a per-file basis) would be better, but I do
not believe you need even that, for two reasons:

1. PG will always write changes to data files in page-size write
operations.  The Unix kernel guarantees that these writes appear atomic
from the point of view of other processes.  So the data-file-backup
process will see pagewise consistent data in any case.
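Point 1 can be sketched as follows: because the backend issues whole-page writes, a backup reader that also works in page-size units sees each page as either entirely old or entirely new. This is a minimal illustration, not PostgreSQL code; `PAGE_SIZE` and `copy_datafile` are hypothetical names chosen here:

```python
PAGE_SIZE = 8192  # PostgreSQL's default block size

def copy_datafile(src_path, dst_path):
    """Copy a data file in page-size reads.

    Since the backend writes whole 8 kB pages and the kernel presents
    such writes atomically to other processes, each page read here is
    either the old or the new image, never a mix (point 2 below covers
    the torn-page caveat, which WAL replay repairs anyway).
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            page = src.read(PAGE_SIZE)
            if not page:
                break
            dst.write(page)
```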

2. Even if the backup process managed to acquire an inconsistent
(partially updated) copy of a page due to a concurrent write by a
Postgres backend, we do not care.  The WAL activity is designed to
ensure recovery from partial-page disk writes, and backing up such an
inconsistent page copy would be isomorphic to a system failure after a
partial page write.  Replay of the WAL will ensure that the page will be
fully written from the WAL data.
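Point 2 can be sketched like so. This is not PostgreSQL's actual WAL format; the record layout here (a dict with a page number, an optional full-page image, and otherwise an offset/bytes delta) is an assumption made purely for illustration:

```python
def replay_wal(wal_records, pages):
    """Replay archived WAL over restored pages.

    A full-page image in the WAL replaces the restored page outright,
    so a torn or partially copied page in the backup is simply
    overwritten -- the same path crash recovery takes after a partial
    disk write.  (Hypothetical record layout, not PostgreSQL's.)
    """
    for rec in wal_records:
        n = rec["page_no"]
        fpi = rec.get("full_page_image")
        if fpi is not None:
            pages[n] = fpi  # inconsistent backup copy is fully rewritten
        else:
            off, data = rec["delta"]  # incremental change to a known-good page
            page = bytearray(pages[n])
            page[off:off + len(data)] = data
            pages[n] = bytes(page)
    return pages
```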

In short, all you need is a mechanism for archiving off the WAL data and
locating a checkpoint record in the WAL as a starting point for replay.
Your data-file backup mechanism can be plain ol' tar or cp -r.  No
interlocks needed or wanted.
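The whole procedure reduces to something like the sketch below (all names hypothetical, and an ordinary recursive copy standing in for the suggested tar or cp -r):

```python
import shutil

def base_backup(data_dir, backup_dir, checkpoint_wal_pos):
    """Sketch of the no-lock backup procedure described above.

    Record which checkpoint to start replay from, then copy the cluster
    with a plain recursive copy while WAL archiving continues
    independently.  Recovery later means: restore this copy, then
    replay archived WAL from 'start_replay_at' up to any desired point
    in time at or after the end of the file copy.
    """
    shutil.copytree(data_dir, backup_dir)
    return {"start_replay_at": checkpoint_wal_pos}
```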
        regards, tom lane

