Re: pg_basebackup for streaming base backups - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: pg_basebackup for streaming base backups
Msg-id AANLkTinxUY12XBLqQXu_kfFQm6zdBP8b90NOJp=D=W5i@mail.gmail.com
In response to Re: pg_basebackup for streaming base backups  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: pg_basebackup for streaming base backups  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
On Thu, Jan 20, 2011 at 05:23, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Jan 19, 2011 at 9:37 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>> Great. Thanks for the quick update!
>>>
>>> Here are another comments:
>
> Here are comments against the documents. The other code looks good.

Thanks!

> It's helpful to document what to set to allow pg_basebackup connection.
> That is not only the REPLICATION privilege but also max_wal_senders and
> pg_hba.conf.

Hmm. Yeah, I guess that wouldn't hurt. Will add that.


> + <refsect1>
> +  <title>Options</title>
>
> Can we list the descriptions of option in the same order as
> "pg_basebackup --help" does?
>
> It's helpful to document that the target directory must be specified and
> it must be empty.

Yeah, that's on the list - I just wanted to make any other changes
first before I did that. Based on (no further) feedback and a few
extra questions, I'm going to change it per your suggestion to use -D
<dir> -F <format> instead of -D/-T, which will change that stuff
anyway. So I'll reorder them at that time.


> +  <para>
> +   The backup will include all files in the data directory and tablespaces,
> +   including the configuration files and any additional files placed in the
> +   directory by third parties. Only regular files and directories are allowed
> +   in the data directory, no symbolic links or special device files.
>
> The latter sentence means that the backup of the database cluster
> created by initdb -X is not supported? Because the symlink to the
> actual WAL directory is included in it.

No, the symlink is not included. pg_xlog is specifically excluded, and
sent as an empty directory, so upon restore you will have an empty
pg_xlog directory.
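
As a rough illustration of that behaviour (a Python sketch only, not the
actual backend C code; the function name plan_backup_entries is made up
for this example), the walk over the data directory could look like this:

```python
import os

def plan_backup_entries(datadir):
    """Return (relative_path, kind) pairs for a hypothetical base backup.

    pg_xlog itself is listed as a directory, but nothing below it is
    included, so a restore ends up with an empty pg_xlog.
    """
    entries = []
    for root, dirs, files in os.walk(datadir):
        rel_root = os.path.relpath(root, datadir)
        if rel_root == "pg_xlog":
            # Do not descend further and do not list any WAL files.
            dirs[:] = []
            continue
        for name in sorted(dirs):
            path = os.path.normpath(os.path.join(rel_root, name))
            entries.append((path, "dir"))
        for name in sorted(files):
            path = os.path.normpath(os.path.join(rel_root, name))
            entries.append((path, "file"))
    return entries
```

Since os.walk does not follow symlinks by default, a pg_xlog that is a
symlink (as with initdb -X) also shows up here as a single directory
entry with no contents.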


> OTOH, I found the following source code comments:
>
> + * Receive a tar format stream from the connection to the server, and unpack
> + * the contents of it into a directory. Only files, directories and
> + * symlinks are supported, no other kinds of special files.
>
> This says that symlinks are supported. Which is true? Is the symlink
> supported only in tar format?

That's actually a *backend* side restriction. If there is a symlink
anywhere other than pg_tblspc in the data directory, we simply won't
send it across (with a warning).
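
Roughly, the rule is the following (again a Python sketch rather than
the real backend code; classify_entry and its return values are
hypothetical names, and the handling of other special files is my
assumption based on the documentation text quoted above):

```python
import os
import warnings

def classify_entry(datadir, relpath):
    """Decide what a hypothetical sender does with one data directory entry."""
    full = os.path.join(datadir, relpath)
    if os.path.islink(full):
        # Symlinks are only expected directly under pg_tblspc.
        if os.path.dirname(os.path.normpath(relpath)) == "pg_tblspc":
            return "tablespace-link"
        warnings.warn("skipping symbolic link: %s" % relpath)
        return "skip"
    if os.path.isdir(full):
        return "dir"
    if os.path.isfile(full):
        return "file"
    # Assumption: any other kind of special file is skipped with a warning.
    warnings.warn("skipping special file: %s" % relpath)
    return "skip"
```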

The frontend code supports creating symlinks, both in directory format
and in tar format (actually, in tar format it doesn't do anything, of
course; it just lets the symlink entry through).

It wouldn't actually be hard to allow the inclusion of symlinks on the
backend side. But it would make verification a lot harder - for
example, if someone symlinked out pg_clog, we'd back up the symlink
but not the actual files, since they're not actually registered as a
tablespace.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

