point in time recovery and moving datafiles online - Mailing list pgsql-general

From: Marc Munro
Subject: point in time recovery and moving datafiles online
Msg-id: 1014348942.11682.0.camel@bloodnok.com
List: pgsql-general

I'd like to take a crack at implementing point in time recovery.  My
plan is to do this as gently as possible in a number of small
self-contained steps.  I'd appreciate lots of critical feedback.

Alternatively, if someone else is looking into this please let me know
so I can either back off or help out.

Stage 1

Add hooks for begin_backup and end_backup at a data file level.  Between
the calls begin_backup(myfile) and end_backup(myfile), writes to myfile
will be disabled allowing the file to be safely copied.

Unfortunately, it seems that this approach is fundamentally in conflict
with the principle of a checkpoint, which, as I understand it, is
supposed to flush all unwritten changes to disk.

My solution is to allow the checkpoint to complete without flushing
myfile's buffers, and then on end_backup(myfile) either perform another
checkpoint or just flush the buffers for myfile.  As long as you cannot
shut down the database until end_backup has completed its checkpoint, I
think all should be well.
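
To make the proposed interaction between the hooks and the checkpoint
concrete, here is a minimal sketch in Python.  It is a toy model, not
PostgreSQL code: the class ToyBufferManager and its dictionaries are
assumptions made for illustration, and only the names begin_backup and
end_backup come from the proposal above.

```python
class ToyBufferManager:
    """Toy model of dirty-buffer handling; not PostgreSQL's buffer manager."""

    def __init__(self):
        self.dirty = {}         # file name -> pages changed but not yet written
        self.on_disk = {}       # file name -> pages already written out
        self.in_backup = set()  # files between begin_backup and end_backup

    def write(self, name, page):
        # Changes land in memory first; disk is touched only by a flush.
        self.dirty.setdefault(name, []).append(page)

    def _flush(self, name):
        self.on_disk.setdefault(name, []).extend(self.dirty.pop(name, []))

    def begin_backup(self, name):
        # From here on, writes to this file are held back.
        self.in_backup.add(name)

    def checkpoint(self):
        # The checkpoint completes, but skips files that are being copied.
        for name in list(self.dirty):
            if name not in self.in_backup:
                self._flush(name)

    def end_backup(self, name):
        # Re-enable writes and perform the flush the checkpoint deferred.
        self.in_backup.discard(name)
        self._flush(name)

    def can_shut_down(self):
        # Shutdown has to wait until no backup is still in progress.
        return not self.in_backup
```

In this model a checkpoint taken during a backup leaves myfile's pages
in memory, and can_shut_down() stays false until end_backup has
flushed them.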

So, have I missed something?  Would it instead be better to block the
checkpoint until end_backup(myfile) is complete?  Any other ideas?

Stage 2

Add a move_file function.  This will use the begin/end_backup hooks to
make a copy of a datafile, create a symlink to the copy where the
original file used to be, and close and re-open the file without
altering any buffers.
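
The file-system side of move_file could look roughly like the Python
sketch below.  It assumes a POSIX system.  The backup_mode context
manager is a hypothetical placeholder for the begin/end_backup hooks,
and the step where the server closes and re-opens the file is not
modelled at all.

```python
import os
import shutil
from contextlib import contextmanager


@contextmanager
def backup_mode(path):
    # Hypothetical placeholder: a real version would call
    # begin_backup(path) here and end_backup(path) in the finally clause.
    try:
        yield
    finally:
        pass


def move_file(src, dest):
    """Copy src to dest, then leave a symlink to dest where src was."""
    with backup_mode(src):
        shutil.copy2(src, dest)
        # Build the symlink under a temporary name, then rename it over
        # the original so the path never disappears.
        tmp_link = src + ".tmp_link"
        os.symlink(os.path.abspath(dest), tmp_link)
        os.replace(tmp_link, src)
```

Renaming the link over the original means that anything opening the
old path afterwards reaches the copy, while a descriptor that is
already open still refers to the old file, which is why the re-open
step in the proposal matters.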

There have been a few mutterings on the mailing lists recently that
suggest this would be useful, and it's a relatively easy test case for
the begin/end_backup hooks.

Stage 3

Add a pgtar executable that will create a tar of all data files using
the begin/end_backup hooks.
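
As a rough illustration of what pgtar would do, the Python sketch below
walks a data directory and adds each file to a tar archive between a
pair of hook calls.  The function signature and the default no-op hooks
are assumptions; the real hooks would live in the server.

```python
import os
import tarfile


def pgtar(data_dir, tar_path,
          begin_backup=lambda path: None,
          end_backup=lambda path: None):
    """Tar every file under data_dir, one file at a time."""
    # tar_path should be outside data_dir, or the archive would be
    # picked up by the walk and added to itself.
    with tarfile.open(tar_path, "w") as tar:
        for root, _dirs, files in os.walk(data_dir):
            for name in sorted(files):
                path = os.path.join(root, name)
                begin_backup(path)
                try:
                    # Store paths relative to the data directory.
                    tar.add(path, arcname=os.path.relpath(path, data_dir))
                finally:
                    # Always release the file, even if the copy fails.
                    end_backup(path)
```

Holding the hook for one file at a time keeps the window during which
writes are disabled as short as the copy of that single file.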

Stage 4

Provide some sort of archiving mechanism for WAL files.
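
One simple shape such a mechanism could take is a routine that copies
finished WAL segments into an archive directory.  The Python sketch
below is only that: the function archive_wal, the directory layout, and
the assumption that the caller already knows which segments are
complete are all illustrative and not part of any existing interface.

```python
import os
import shutil


def archive_wal(wal_dir, archive_dir, completed):
    """Copy completed segments to archive_dir; return the names copied."""
    os.makedirs(archive_dir, exist_ok=True)
    archived = []
    for name in sorted(completed):
        target = os.path.join(archive_dir, name)
        if os.path.exists(target):
            continue  # archived on an earlier run
        # Copy under a temporary name and rename afterwards, so a
        # half-copied segment is never mistaken for a complete one.
        tmp = target + ".partial"
        shutil.copy2(os.path.join(wal_dir, name), tmp)
        os.replace(tmp, target)
        archived.append(name)
    return archived
```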

Stage 5

Provide some control over the recovery process.  This has to deal with
recovery using both up-to-date control files and control files from a
backup set.
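
One kind of control this could offer is a stop point for log replay.
The toy Python sketch below shows the idea on a dictionary: start from
the restored backup and apply log records in order until the target
time is passed.  The record format (timestamp, key, value) and the
function replay_until are assumptions for illustration and say nothing
about the real WAL format or the handling of control files.

```python
def replay_until(base_state, log_records, stop_time):
    """Apply time-ordered (timestamp, key, value) records up to stop_time."""
    state = dict(base_state)  # start from the restored backup
    for timestamp, key, value in log_records:
        if timestamp > stop_time:
            break  # everything after the target time is ignored
        state[key] = value
    return state
```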

Sorry for cross-posting this to pgsql-general, but my last post to
hackers never made it.


--
Marc        marc@bloodnok.com
