Re: will PITR in 8.0 be usable for "hot spare"/"log - Mailing list pgsql-hackers

From Eric Kerin
Subject Re: will PITR in 8.0 be usable for "hot spare"/"log
Msg-id 1092458994.3717.230.camel@auh5-0478
In response to Re: will PITR in 8.0 be usable for "hot spare"/"log shipping" type of replication  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: will PITR in 8.0 be usable for "hot spare"/"log shipping" type of replication
Re: will PITR in 8.0 be usable for "hot spare"/"log
List pgsql-hackers
On Wed, 2004-08-11 at 16:43, Tom Lane wrote:
> Gaetano Mendola <mendola@bigfoot.com> writes:
> > Tom Lane wrote:
> >> It should work; dunno if anyone has tried it yet.
> 
> > I was thinking about it but I soon realized that actually is
> > impossible to do, postgres replay the log only if during the
> > start the file recover.conf is present in $DATA directory :-(
>
> <SNIP>
>
> Somebody should hack this together and try it during beta.  I don't
> have time myself.
> 
>             regards, tom lane


I've written up a very quick and insanely dirty hack to do log shipping.
Actually, it's so poorly written I kinda feel ashamed to post the code.

But so far the process looks very promising, with a few caveats. 

The issues I've seen are:
1. Knowing when the master has finished the file transfer to
the backup.
2. Handling the meta-files (.history, .backup), e.g. not sleeping if
they don't exist.
3. Keeping the backup from coming online before the replay has fully
finished in the event of a failure to copy a file, or other strange
errors (out of memory, etc).

I've got a solution for 1.  I use a control file that contains the name
of the last file that was successfully copied over.  After the program
copies the file, it updates the control file with the new file's name.
The restore program looks in that file for what is the last safe file to
replay, and sleeps if the one it's been told to look for isn't safe yet.
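
To make the idea concrete, here is a minimal sketch of the control file
logic. It is not the code from log_ship.c; the function names and the
one-line file format are made up for illustration. It assumes segment
names are the usual 24-character hex names, which sort correctly as
plain strings, and it writes via a temp file plus rename() so the reader
never sees a half-written name.

```c
#include <stdio.h>
#include <string.h>

#define WAL_NAME_LEN 24   /* assumed: WAL segment names are 24 hex chars */

/* Archive side (hypothetical helper): record `name` as the last segment
 * that was fully copied. Returns 0 on success, -1 on failure. */
int control_write(const char *control_path, const char *name)
{
    char tmp[1024];
    FILE *f;

    if (snprintf(tmp, sizeof tmp, "%s.tmp", control_path) >= (int) sizeof tmp)
        return -1;                      /* path too long */
    if ((f = fopen(tmp, "w")) == NULL)
        return -1;
    if (fprintf(f, "%s\n", name) < 0) {
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0)
        return -1;
    /* rename() replaces the old control file in one step */
    return rename(tmp, control_path);
}

/* Restore side (hypothetical helper): 1 if `wanted` is at or before the
 * last copied segment, 0 if it is not safe yet (caller sleeps and
 * retries), -1 if the control file exists but cannot be read. */
int control_is_safe(const char *control_path, const char *wanted)
{
    char last[WAL_NAME_LEN + 2];        /* name + newline + NUL */
    FILE *f = fopen(control_path, "r");

    if (f == NULL)
        return 0;                       /* no control file yet */
    if (fgets(last, sizeof last, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    last[strcspn(last, "\n")] = '\0';   /* strip trailing newline */
    return strcmp(wanted, last) <= 0;
}
```

The restore side would call control_is_safe() in a loop, sleeping
between calls while it returns 0.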

Two is pretty easy: just special-case files ending in .history or
.backup.
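
The check itself can be as small as a suffix test. A sketch, with
function names invented here rather than taken from log_ship.c:

```c
#include <string.h>

/* Returns 1 if `s` ends with `suffix`, 0 otherwise. */
static int ends_with(const char *s, const char *suffix)
{
    size_t n = strlen(s), m = strlen(suffix);

    return n >= m && strcmp(s + n - m, suffix) == 0;
}

/* Hypothetical helper: 1 for meta-files, which the restore side should
 * not wait for, 0 for ordinary WAL segments. */
int is_meta_file(const char *fname)
{
    return ends_with(fname, ".history") || ends_with(fname, ".backup");
}
```

If is_meta_file() is true and the file isn't in the archive, the restore
side would fail right away instead of sleeping.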

Three is a problem I see happening once in a while. When it does, you
have to recreate the backup database from a backup of the master, which
could spell trouble, or at the very least a mad DBA.  A possible fix is
to check the error code returned from restore_command to see if it's
ENOENT before bringing the db online, instead of bringing the database
online on any error.  This might be better as an option, though.
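
Roughly, the decision I have in mind looks like the sketch below. This
is not existing backend code. The enum, the function name, and the
convention that restore_command's exit status carries an errno value
are all assumptions made up for illustration.

```c
#include <errno.h>

/* Hypothetical outcomes after one restore_command call. */
enum recovery_action {
    REPLAY_SEGMENT,   /* file was restored: replay it */
    END_RECOVERY,     /* file does not exist: end of log, come online */
    ABORT_STARTUP     /* any other error: stay offline */
};

/* Assumes restore_status is 0 on success, or an errno value on failure. */
enum recovery_action after_restore(int restore_status)
{
    if (restore_status == 0)
        return REPLAY_SEGMENT;
    if (restore_status == ENOENT)
        return END_RECOVERY;
    return ABORT_STARTUP;   /* e.g. out of memory, failed copy */
}
```

With that, only a genuine "file not found" ends recovery; a failed copy
or an out-of-memory error leaves the backup offline so it can be retried.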



Still lots of bugs in my implementation, and my next step is to re-write
it from scratch.  I'm going to keep playing with this and see if I can
get something a little more solid working.

Here's a URL to the code as it is right now.  It works on Linux; no
promises for anything else.  http://www.bootseg.com/log_ship.c

For the archive_command use:
/path_to_binary/log_ship -a /archive_directory/ %p %f

For the restore_command use:
/path_to_binary/log_ship -r /archive_directory/ %f %p

Any comments are very much appreciated.

Thanks, 
Eric Kerin


