Re: pg_basebackup for streaming base backups - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: pg_basebackup for streaming base backups
Date
Msg-id AANLkTikWbn=UWg_u4gyyL04ooCo584dhE2ZiPjcjfbUi@mail.gmail.com
Whole thread Raw
In response to Re: pg_basebackup for streaming base backups  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Responses Re: pg_basebackup for streaming base backups  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Re: pg_basebackup for streaming base backups  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Jan 15, 2011 at 21:16, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Hi,
>
> I have an unexpected 5 mins window to do a first reading of the patch,
> so here goes the quick doc and comments proof reading of it. :)

:-)


> Magnus Hagander <magnus@hagander.net> writes:
>> This patch creates pg_basebackup in bin/, being a client program for
>> the streaming base backup feature.
>
> Great!  We have pg_ctl init[db], I think we want pg_ctl clone or some
> other command here to call the binary for us.  What do you think?

That might be useful, but I think we need to settle on the
pg_basebackup contents itself first.

Not sure pg_ctl clone would be the proper name, since it's not
actually a clone at this point (it might be with the second patch I
ust posted that includes the WAL files)


>> One thing I'm thinking about - right now the tool just takes -c
>> <conninfo> to connect to the database. Should it instead be taught to
>> take the connection parameters that for example pg_dump does - one for
>> each of host, port, user, password? (shouldn't be hard to do..)
>
> Consistency is good.
>
> Now, basic first patch reading level review:
>
> I think doc/src/sgml/backup.sgml should include some notes about how
> libpq base backup streaming compares to rsync and the like in term of
> efficiency or typical performances, when to prefer which, etc.  I'll see
> about doing some tests next week.

Yeah, the whole backup chapter may well need some more work after this.


> +      <term><option>--basedir=<replaceable class="parameter">directory</replaceable></option></term>
>
> That should be -D --pgdata, for consistency with pg_dump.

pg_dump doesn't have a -D. I assume you mean pg_ctl / initdb?


> On a quick reading it's unclear from the docs alone how -d and -t leave
> together.  It seems like the options are exclusive but I'd have to ask…

They are. The docs clearly say "Only one of <literal>-d</> and
<literal>-t</> can be specified"


> + * The file will be named base.tar[.gz] if it's for the main data directory
> + * or <tablespaceoid>.tar[.gz] if it's for another tablespace.
>
> Well we have UNIQUE, btree (spcname), so maybe we can use that here?

We could, but that would make it more likely to run into encoding
issues and such - do we restrict what can be in a tablespace name?

Also with a tar named by the oid, you *can* untar it into a directory
in pg_tblspc to recover from if you have to.

Another option, I think Heikki mentioned this on IM at some point, is
to do something like name it <oid>-<name>.tar. That would give us best
of both worlds?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Include WAL in base backup
Next
From: Robert Haas
Date:
Subject: Re: LOCK for non-tables