Re: point in time recovery and moving datafiles online - Mailing list pgsql-hackers

From: Marc Munro
Subject: Re: point in time recovery and moving datafiles online
Msg-id: 1014354447.12219.3.camel@bloodnok.com
In response to: Re: point in time recovery and moving datafiles online (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: point in time recovery and moving datafiles online (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Tom,
Many thanks for your reply.  This is exactly the sort of feedback I
need.

On Thu, 2002-02-21 at 19:52, Tom Lane wrote:
> [ pg-general removed from cc: list, as this is off topic for it ]
> 
> Marc Munro <marc@bloodnok.com> writes:
> > Add hooks for begin_backup and end_backup at a data file level.  Between
> > the calls begin_backup(myfile) and end_backup(myfile), writes to myfile
> > will be disabled allowing the file to be safely copied.
> 
> And the writes are going to go where instead?  If you intend to just
> hold the dirty pages in memory, the system will grind to a halt in no
> time, ie as soon as it runs out of spare buffers.  This strikes me as
> only marginally better than "shut down the database while you copy the
> files".

The intention is to lock only one file at a time, take a clean snapshot
of it, and allow writes to its buffers to resume before moving on to
the next file.  I would hope that the time needed to copy a single 1G
data file is considerably less than the time needed to dirty all the
in-memory buffers, but maybe that is just wild optimism.
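
To make that concrete, here is a rough sketch of the protocol I have in
mind.  None of these names exist in the backend; begin_backup,
end_backup and DataFile are invented, and pthreads stand in for
whatever buffer-manager locking we would really use:

    #include <pthread.h>

    /* Hypothetical stand-in for one 1G segment file of a relation. */
    typedef struct DataFile
    {
        const char       *path;        /* e.g. base/16384/16385.2 */
        pthread_rwlock_t  backup_lock; /* = PTHREAD_RWLOCK_INITIALIZER */
    } DataFile;

    /*
     * begin_backup: taken exclusive, so it waits out any in-flight
     * page writes to this file and blocks new ones.  Reads, and
     * writes to every other file, continue untouched.
     */
    static void
    begin_backup(DataFile *f)
    {
        pthread_rwlock_wrlock(&f->backup_lock);
    }

    /* end_backup: writes to this file resume immediately. */
    static void
    end_backup(DataFile *f)
    {
        pthread_rwlock_unlock(&f->backup_lock);
    }

    /*
     * Every page write would be bracketed like this.  The lock is
     * taken shared, so writers only ever wait while this one file is
     * actually being copied.
     */
    static void
    write_page(DataFile *f, long blkno, const char *page)
    {
        pthread_rwlock_rdlock(&f->backup_lock);
        /* ... seek to blkno * BLCKSZ and write the 8K page ... */
        pthread_rwlock_unlock(&f->backup_lock);
    }
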
> Perhaps more to the point, I'm not following how this helps achieve
> point-in-time recovery.  I suppose what you are after is to get an
> instantaneous snapshot of the data files that could be used as a
> starting point for replaying the WAL --- but AFAICS you'd need a
> snapshot that's instantaneous across the *whole* database, ie,
> all the data files are in the state corresponding to the chosen
> starting point for the WAL.  Locking and copying one file at a time
> doesn't get you there.

Actually I'm not trying to get a consistent snapshot.  I just want each
file to be internally consistent.  Then, on recovery, we should be able
to replay the WAL to reach first a consistent state, and then any point
in time from that consistent state to the end of the WAL entries.

I guess I should have explained that ;-)
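
In rough pseudo-C, the recovery sequence I am imagining looks something
like this (read_next_wal_record, apply_wal_record and the record layout
are invented for illustration; none of this is the real xlog
interface):

    #include <stddef.h>
    #include <stdint.h>
    #include <time.h>

    typedef struct WalRecord
    {
        uint64_t lsn;         /* position of this record in the WAL  */
        int      is_commit;   /* nonzero for transaction commits     */
        time_t   commit_time; /* valid only when is_commit is set    */
        /* ... redo payload ... */
    } WalRecord;

    extern WalRecord *read_next_wal_record(void); /* NULL at WAL end */
    extern void       apply_wal_record(const WalRecord *rec);

    /*
     * consistent_lsn is the WAL position at which the last data file
     * finished being copied; stopping before that point would leave
     * the cluster inconsistent.  Past it, we may stop at the first
     * commit after target_time, or run to the end of the WAL.
     */
    static void
    recover(uint64_t consistent_lsn, time_t target_time)
    {
        WalRecord *rec;

        while ((rec = read_next_wal_record()) != NULL)
        {
            if (rec->lsn >= consistent_lsn &&
                rec->is_commit && rec->commit_time > target_time)
                break;  /* consistent, and past the requested time */
            apply_wal_record(rec);
        }
    }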

> It seems to me that you can get the desired results without any
> locking.  Assume that you start archiving the WAL just after a
> checkpoint record.  Also, start copying data files to your backup
> medium.  Some not inconsiderable time later, you are done copying
> data files.  You continue copying off and archiving WAL entries.
> You cannot say that the copied data files correspond to any particular
> point in the WAL, or that they form a consistent set of data at all
> --- but if you were to reload them and replay the WAL into them
> starting from the checkpoint, then you *would* have a consistent set
> of files once you reached the point in the WAL corresponding to the
> end-time of the data file backup.  You could stop there, or continue
> WAL replay to any later point in time.

If I understand you correctly, this is exactly what I was thinking,
based on Oracle recovery.  But we must still prevent writes to each
data file as we back it up, so that it remains internally consistent.
This is the point of the begin/end_backup hooks.  Managing the
archiving of the WAL files is further down my list (one baby step at a
time).
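
For what it's worth, the backup side would then be little more than a
loop over the cluster's segment files, bracketing an ordinary copy with
the hooks sketched earlier.  copy_file is, again, purely hypothetical:

    extern void copy_file(const char *path, const char *backup_dir);

    /*
     * One file at a time: pause writes, take an internally consistent
     * copy, resume writes.  WAL archiving (not shown) must already be
     * running so that replay can later bridge the gaps between files.
     */
    static void
    backup_all_files(DataFile *files, int nfiles, const char *backup_dir)
    {
        for (int i = 0; i < nfiles; i++)
        {
            begin_backup(&files[i]);              /* writes pause     */
            copy_file(files[i].path, backup_dir); /* plain block copy */
            end_backup(&files[i]);                /* writes resume    */
        }
    }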

-- 
Marc        marc@bloodnok.com

