Re: Streaming a base backup from master - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Streaming a base backup from master
Date
Msg-id AANLkTi=JvFQyzXRYxRb1n01HWCejwnda4RFWukDbezWM@mail.gmail.com
In response to Streaming a base backup from master  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Fri, Sep 3, 2010 at 13:19, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> It's been discussed before that it would be cool if you could stream a new
> base backup from the master server, via libpq. That way you would not need
> low-level filesystem access to initialize a new standby.
>
> Magnus mentioned today that he started hacking on that, and coincidentally I
> just started experimenting with it yesterday as well :-). So let's get this
> out on the mailing list.
>
> Here's a WIP patch. It adds a new "TAKE_BACKUP" command to the replication
> command set. Upon receiving that command, the master starts a COPY, and
> streams a tarred copy of the data directory to the client. The patch
> includes a simple command-line tool, pg_streambackup, to connect to a server
> and request a backup that you can then redirect to a .tar file or pipe to
> "tar x".
>
> TODO:
>
> * We need a smarter way to do pg_start/stop_backup() with this. At the
> moment, you can only have one backup running at a time, but we shouldn't
> have that limitation with this built-in mechanism.
>
> * The streamed backup archive should contain all the necessary WAL files
> too, so that you don't need to set up archiving to use this. You could just
> point the tiny client tool to the server, and get a backup archive
> containing everything that's necessary to restore correctly.

This last point should of course be *optional*, but it would be very
good to have that option (and it should probably be on by default).


A couple of quick comments on things I noticed that differ from the
code I have :-) We already chatted about some of this, but it should
be included for others...

* It should be possible to pass the backup label through, not just
hardcode it to basebackup

* Needs support for tablespaces. We should either follow the symlinks
and pick up the files, or throw an error if any tablespaces exist.
Silently delivering an incomplete backup is not a good thing :-)

* Is there a point in adapting the chunk size to the size of the libpq buffers?
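
To make the three points above concrete, here is a rough sketch in
Python (not the C code from either patch). Everything named here is
hypothetical: the function stream_base_backup, its parameters, the
TablespaceError exception and the BACKUP_LABEL_SKETCH member are made
up for illustration. It also treats any symlink under the data
directory as a tablespace link, which is a simplification. The label
is passed in rather than hardcoded, symlinks are either followed or
cause an error, and the output chunk size is a tunable parameter.

```python
import io
import os
import tarfile


class TablespaceError(Exception):
    """Raised when a symlink is found and following links is disabled."""


def stream_base_backup(data_dir, out, label="basebackup",
                       follow_tablespaces=False, chunk_size=32 * 1024):
    """Write a tar archive of data_dir to the binary stream `out`.

    Hypothetical helper: `label` is caller-supplied, and `chunk_size`
    is the size of the blocks handed to `out.write()`.
    """
    # Mode "w|" is tarfile's non-seeking stream mode, so `out` may be a
    # pipe or a socket file object. dereference=True makes tarfile store
    # the link target's contents instead of the symlink itself.
    with tarfile.open(fileobj=out, mode="w|", bufsize=chunk_size,
                      dereference=follow_tablespaces) as tar:
        # Record the label as a small archive member (name is made up).
        payload = (label + "\n").encode("utf-8")
        info = tarfile.TarInfo("BACKUP_LABEL_SKETCH")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

        for root, dirs, files in os.walk(data_dir,
                                         followlinks=follow_tablespaces):
            for name in dirs + files:
                path = os.path.join(root, name)
                if os.path.islink(path) and not follow_tablespaces:
                    # Fail loudly rather than ship an incomplete backup.
                    raise TablespaceError("symlink not followed: " + path)
                arcname = os.path.relpath(path, data_dir)
                # recursive=False: os.walk already visits subdirectories.
                tar.add(path, arcname=arcname, recursive=False)
```

Something like stream_base_backup("/path/to/data", sys.stdout.buffer,
label="nightly", follow_tablespaces=True) could then be piped to
"tar x" on the receiving side. Whether chunk_size should track the
libpq buffer size is exactly the open question above; the sketch only
shows where such a knob would sit.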

FWIW, my implementation was a user-defined function, which has the
advantage that it can run on 9.0. But most likely this code can be
ripped out and provided as a separate backport project for 9.0 if
necessary, so there is no need to have separate codebases.

Other than that, our code is remarkably similar.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

