Thread: pg_basebackup for streaming base backups
This patch creates pg_basebackup in bin/, being a client program for the streaming base backup feature. I think it's more or less done now. I've again split it out of pg_streamrecv, because it had very little shared code with that (basically just the PQconnectdb() wrapper). One thing I'm thinking about - right now the tool just takes -c <conninfo> to connect to the database. Should it instead be taught to take the connection parameters that for example pg_dump does - one for each of host, port, user, password? (shouldn't be hard to do..) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Attachment
Hi, I have an unexpected 5-minute window to do a first reading of the patch, so here goes the quick doc and comments proof reading of it. :) Magnus Hagander <magnus@hagander.net> writes: > This patch creates pg_basebackup in bin/, being a client program for > the streaming base backup feature. Great! We have pg_ctl init[db], I think we want pg_ctl clone or some other command here to call the binary for us. What do you think? > One thing I'm thinking about - right now the tool just takes -c > <conninfo> to connect to the database. Should it instead be taught to > take the connection parameters that for example pg_dump does - one for > each of host, port, user, password? (shouldn't be hard to do..) Consistency is good. Now, basic first patch reading level review: I think doc/src/sgml/backup.sgml should include some notes about how libpq base backup streaming compares to rsync and the like in terms of efficiency or typical performance, when to prefer which, etc. I'll see about doing some tests next week. + <term><option>--basedir=<replaceable class="parameter">directory</replaceable></option></term> That should be -D --pgdata, for consistency with pg_dump. On a quick reading it's unclear from the docs alone how -d and -t live together. It seems like the options are exclusive but I'd have to ask… + * The file will be named base.tar[.gz] if it's for the main data directory + * or <tablespaceoid>.tar[.gz] if it's for another tablespace. Well we have UNIQUE, btree (spcname), so maybe we can use that here? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Sat, Jan 15, 2011 at 21:16, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Hi, > > I have an unexpected 5-minute window to do a first reading of the patch, > so here goes the quick doc and comments proof reading of it. :) :-) > Magnus Hagander <magnus@hagander.net> writes: >> This patch creates pg_basebackup in bin/, being a client program for >> the streaming base backup feature. > > Great! We have pg_ctl init[db], I think we want pg_ctl clone or some > other command here to call the binary for us. What do you think? That might be useful, but I think we need to settle on the pg_basebackup contents itself first. Not sure pg_ctl clone would be the proper name, since it's not actually a clone at this point (it might be with the second patch I just posted that includes the WAL files). >> One thing I'm thinking about - right now the tool just takes -c >> <conninfo> to connect to the database. Should it instead be taught to >> take the connection parameters that for example pg_dump does - one for >> each of host, port, user, password? (shouldn't be hard to do..) > > Consistency is good. > > Now, basic first patch reading level review: > > I think doc/src/sgml/backup.sgml should include some notes about how > libpq base backup streaming compares to rsync and the like in terms of > efficiency or typical performance, when to prefer which, etc. I'll see > about doing some tests next week. Yeah, the whole backup chapter may well need some more work after this. > + <term><option>--basedir=<replaceable class="parameter">directory</replaceable></option></term> > > That should be -D --pgdata, for consistency with pg_dump. pg_dump doesn't have a -D. I assume you mean pg_ctl / initdb? > On a quick reading it's unclear from the docs alone how -d and -t live > together. It seems like the options are exclusive but I'd have to ask… They are. 
The docs clearly say "Only one of <literal>-d</> and <literal>-t</> can be specified" > + * The file will be named base.tar[.gz] if it's for the main data directory > + * or <tablespaceoid>.tar[.gz] if it's for another tablespace. > > Well we have UNIQUE, btree (spcname), so maybe we can use that here? We could, but that would make it more likely to run into encoding issues and such - do we restrict what can be in a tablespace name? Also with a tar named by the oid, you *can* untar it into a directory in pg_tblspc to recover from if you have to. Another option, I think Heikki mentioned this on IM at some point, is to do something like name it <oid>-<name>.tar. That would give us the best of both worlds? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Em 15-01-2011 15:10, Magnus Hagander escreveu: > One thing I'm thinking about - right now the tool just takes -c > <conninfo> to connect to the database. Should it instead be taught to > take the connection parameters that for example pg_dump does - one for > each of host, port, user, password? (shouldn't be hard to do..) > +1. -- Euler Taveira de Oliveira http://www.timbira.com/
Magnus Hagander <magnus@hagander.net> writes: > Not sure pg_ctl clone would be the proper name, since it's not > actually a clone at this point (it might be with the second patch I > ust posted that includes the WAL files) Let's keep the clone name for the client that makes it all then :) >> That should be -D --pgdata, for consistency with pg_dump. > > pg_dump doesn't have a -D. I assume you mean pg_ctl / initdb? Yes, sorry, been too fast. > They are. The docs clearly say "Only one of <literal>-d</> and > <literal>-t</> can be specified" Really too fast… > Another option, I think Heikki mentioned this on IM at some point, is > to do something like name it <oid>-<name>.tar. That would give us best > of both worlds? Well I'd think we know the pg_tablespace columns encoding, so the problem might be the filesystem encodings, right? Well there's also the option of creating <oid>.tar and have a symlink to it called <name>.tar but that's pushing it. I don't think naming after OIDs is a good service for users, but if that's all we can reasonably do… Will continue reviewing and post something more polished and comprehensive next week — mainly wanted to see if you wanted to include pg_ctl <command> in the patch already. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Sat, Jan 15, 2011 at 23:10, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >>> That should be -D --pgdata, for consistency with pg_dump. >> >> pg_dump doesn't have a -D. I assume you mean pg_ctl / initdb? > > Yes, sorry, been too fast. Ok. Updated patch that includes this change attached. I also changed the tar directory from -t to -T, for consistency. It also includes the change to take -h host, -U user, -w/-W for password, and -p port instead of a conninfo string. >> Another option, I think Heikki mentioned this on IM at some point, is >> to do something like name it <oid>-<name>.tar. That would give us best >> of both worlds? > > Well I'd think we know the pg_tablespace columns encoding, so the > problem might be the filesystem encodings, right? Do we really? That's one of the global catalogs that don't really have an encoding, isn't it? > Well there's also the > option of creating <oid>.tar and have a symlink to it called <name>.tar > but that's pushing it. I don't think naming after OIDs is a good > service for users, but if that's all we can reasonably do… Yeah, symlink seems to be making things way too complex. <oid>-<name> is perhaps a reasonable compromise? > Will continue reviewing and post something more polished and > comprehensive next week — mainly wanted to see if you wanted to include > pg_ctl <command> in the patch already. Ok, thanks. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Attachment
Magnus Hagander <magnus@hagander.net> writes: >> + * The file will be named base.tar[.gz] if it's for the main data directory >> + * or <tablespaceoid>.tar[.gz] if it's for another tablespace. >> >> Well we have UNIQUE, btree (spcname), so maybe we can use that here? > We could, but that would make it more likely to run into encoding > issues and such - do we restrict what can be in a tablespace name? No. Don't even think of going there --- we got rid of user-accessible names in the filesystem years ago and we're not going back. Consider: CREATE TABLESPACE "/foo/bar" LOCATION '/foo/bar'; regards, tom lane
On Sun, Jan 16, 2011 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >>> + * The file will be named base.tar[.gz] if it's for the main data directory >>> + * or <tablespaceoid>.tar[.gz] if it's for another tablespace. >>> >>> Well we have UNIQUE, btree (spcname), so maybe we can use that here? > >> We could, but that would make it more likely to run into encoding >> issues and such - do we restrict what can be in a tablespace name? > > No. Don't even think of going there --- we got rid of user-accessible > names in the filesystem years ago and we're not going back. Consider > CREATE TABLESPACE "/foo/bar" LOCATION '/foo/bar'; Well, we'd try to name the file for that "<oid>-/foo/bar.tar", which I guess would break badly, yes. I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, which would still be useful for the majority of cases, I think? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > On Sun, Jan 16, 2011 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> No. Don't even think of going there --- we got rid of user-accessible >> names in the filesystem years ago and we're not going back. Consider >> CREATE TABLESPACE "/foo/bar" LOCATION '/foo/bar'; > > Well, we'd try to name the file for that "<oid>-/foo/bar.tar", which I > guess would break badly, yes. > > I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, > which would still be useful for the majority of cases, I think? Well if we're not using user names, there's no good choice except for system name, and the one you're making up here isn't the "true" one… Now I think the unfriendliness is around the fact that you need to prepare (untar, unzip) and start a cluster from the backup to be able to know what file contains what. Is it possible to offer a tool that lists the logical objects contained into each tar file? Maybe adding a special section at the beginning of each. That would be logically like pg_dump "catalog", but implemented as a simple "noise" file that you simply `cat` with some command. Once more, I'm still unclear how important that is, but it's scratching. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Magnus Hagander <magnus@hagander.net> writes: > Well, we'd try to name the file for that "<oid>-/foo/bar.tar", which I > guess would break badly, yes. > I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, > which would still be useful for the majority of cases, I think? Just stick with the OID. There's no reason that I can see to have "friendly" names for these tarfiles --- in most cases, the DBA will never even deal with them, no? regards, tom lane
On Sun, Jan 16, 2011 at 18:59, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> Well, we'd try to name the file for that "<oid>-/foo/bar.tar", which I >> guess would break badly, yes. > >> I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, >> which would still be useful for the majority of cases, I think? > > Just stick with the OID. There's no reason that I can see to have > "friendly" names for these tarfiles --- in most cases, the DBA will > never even deal with them, no? No, this is the output mode where the DBA chooses to get the output in the form of tarfiles. So if chosen, he will definitely deal with it. When we unpack the tars right away to a directory, they have no name, so that doesn't apply here. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > On Sun, Jan 16, 2011 at 18:59, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Just stick with the OID. There's no reason that I can see to have >> "friendly" names for these tarfiles --- in most cases, the DBA will >> never even deal with them, no? > No, this is the output mode where the DBA chooses to get the output in > the form of tarfiles. So if chosen, he will definitely deal with it. Mph. How big a use-case has that got? Offhand I can't see a reason to use it at all, ever. If you're trying to set up a clone you want the files unpacked. regards, tom lane
On Sun, Jan 16, 2011 at 19:03, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> On Sun, Jan 16, 2011 at 18:59, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Just stick with the OID. There's no reason that I can see to have >>> "friendly" names for these tarfiles --- in most cases, the DBA will >>> never even deal with them, no? > >> No, this is the output mode where the DBA chooses to get the output in >> the form of tarfiles. So if chosen, he will definitely deal with it. > > Mph. How big a use-case has that got? Offhand I can't see a reason to > use it at all, ever. If you're trying to set up a clone you want the > files unpacked. Yes, but the tool isn't just for setting up a clone. If you're doing a regular base backup that's *not* for replication, you might want them in files. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > If you're doing a regular base backup, that's *not* for replication, > you might want them in files. +1 So, is that pg_restore -l idea feasible with your current tar format? I guess that would translate to pg_basebackup -l <directory>|<oid>.tar. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Sun, Jan 16, 2011 at 19:21, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> If you're doing a regular base backup, that's *not* for replication, >> you might want them in files. > > +1 > > So, is that pg_restore -l idea feasible with your current tar format? I > guess that would translate to pg_basebackup -l <directory>|<oid>.tar. Um, not easily if you want to translate it to names. Just like you don't have access to the oid->name mapping without the server started. The walsender can't read pg_class for example, so it can't generate that mapping file. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Sun, Jan 16, 2011 at 11:31 PM, Magnus Hagander <magnus@hagander.net> wrote: > Ok. Updated patch that includes this change attached. I could not apply the patch cleanly against the git master. Do you know what the cause is? $ patch -p1 -d. < /hoge/pg_basebackup.patch patching file doc/src/sgml/backup.sgml patching file doc/src/sgml/ref/allfiles.sgml patching file doc/src/sgml/ref/pg_basebackup.sgml patching file doc/src/sgml/reference.sgml patching file src/bin/Makefile patching file src/bin/pg_basebackup/Makefile patching file src/bin/pg_basebackup/nls.mk patching file src/bin/pg_basebackup/pg_basebackup.c patch: **** malformed patch at line 1428: diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Jan 17, 2011 9:16 AM, "Fujii Masao" <masao.fujii@gmail.com> wrote: > On Sun, Jan 16, 2011 at 11:31 PM, Magnus Hagander <magnus@hagander.net> wrote: > > Ok. Updated patch that includes this change attached. > I could not apply the patch cleanly against the git master. > Do you know what the cause is? > $ patch -p1 -d. < /hoge/pg_basebackup.patch > patching file doc/src/sgml/backup.sgml > patching file doc/src/sgml/ref/allfiles.sgml > patching file doc/src/sgml/ref/pg_basebackup.sgml > patching file doc/src/sgml/reference.sgml > patching file src/bin/Makefile > patching file src/bin/pg_basebackup/Makefile > patching file src/bin/pg_basebackup/nls.mk > patching file src/bin/pg_basebackup/pg_basebackup.c > patch: **** malformed patch at line 1428: diff --git > a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm Weird, no idea. Will have to look into that later - meanwhile you can grab the branch tip from my github repo if you want to review it. /Magnus
On Mon, Jan 17, 2011 at 5:44 PM, Magnus Hagander <magnus@hagander.net> wrote: > Weird, no idea. Will have to look into that later - meanwhile you can grab > the branch tip from my github repo if you want to review it. Which repo should I grab? You seem to have many repos :) http://git.postgresql.org/gitweb Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Jan 17, 2011 at 09:50, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 17, 2011 at 5:44 PM, Magnus Hagander <magnus@hagander.net> wrote: >> Weird, no idea. Will have to look into that later - meanwhile you can grab >> the branch tip from my github repo if you want to review it. > > Which repo should I grab? You seem to have many repos :) > http://git.postgresql.org/gitweb Oh, sorry about that. There is only one that contains postgresql though :P http://github.com/mhagander/postgres, branch streaming_base. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 6:53 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Mon, Jan 17, 2011 at 09:50, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Mon, Jan 17, 2011 at 5:44 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> Weird, no idea. Will have to look into that later - meanwhile you can grab >>> the branch tip from my github repo if you want to review it. >> >> Which repo should I grab? You seem to have many repos :) >> http://git.postgresql.org/gitweb > > Oh, sorry about that. There is only one that contains postgresql though :P > > http://github.com/mhagander/postgres, branch streaming_base. Thanks! Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On lör, 2011-01-15 at 19:10 +0100, Magnus Hagander wrote: > This patch creates pg_basebackup in bin/, being a client program for > the streaming base backup feature. > > I think it's more or less done now. I've again split it out of > pg_streamrecv, because it had very little shared code with that > (basically just the PQconnectdb() wrapper). > > One thing I'm thinking about - right now the tool just takes -c > <conninfo> to connect to the database. Should it instead be taught to > take the connection parameters that for example pg_dump does - one for > each of host, port, user, password? (shouldn't be hard to do..) Probably yes, for consistency. I have been thinking for a while, however, that it would be very good if the tools also supported a conninfo string, so you don't have to invent a new option for every new connection option. psql already supports that. Some other comments: I had trouble at first interpreting the documentation. In particular, where does the data come from, and where does it go to? -d speaks of restoring, but I was just looking for making a backup, not restoring it. Needs some clarification, and some complete examples. Also what happens if -c, or -d and -t are omitted. Downthread you say that this tool is also useful for making base backups independent of replication functionality. Sounds good. But then the documentation says that the connection must be with a user that has the replication permission. Something is conceptually wrong here: why would I have to grant replication permission just to take a base backup for the purpose of making a backup?
On Mon, Jan 17, 2011 at 13:38, Peter Eisentraut <peter_e@gmx.net> wrote: > On lör, 2011-01-15 at 19:10 +0100, Magnus Hagander wrote: >> This patch creates pg_basebackup in bin/, being a client program for >> the streaming base backup feature. >> >> I think it's more or less done now. I've again split it out of >> pg_streamrecv, because it had very little shared code with that >> (basically just the PQconnectdb() wrapper). >> >> One thing I'm thinking about - right now the tool just takes -c >> <conninfo> to connect to the database. Should it instead be taught to >> take the connection parameters that for example pg_dump does - one for >> each of host, port, user, password? (shouldn't be hard to do..) > > Probably yes, for consistency. I have been thinking for a while, > however, that it would be very good if the tools also supported a > conninfo string, so you don't have to invent a new option for every new > connection option. psql already supports that. libpq has an option to expand a connection string if it's passed in the database name, it seems. The problem is that this is done on the dbname parameter - won't work in pg_dump for example, without special treatment, since the db name is used in the client there. > Some other comments: > > I had trouble at first interpreting the documentation. In particular, > where does the data come from, and where does it go to? -d speaks of > restoring, but I was just looking for making a backup, not restoring it. > Needs some clarification, and some complete examples. Also what happens > if -c, or -d and -t are omitted. You get an error. (not with -c anymore) I'll look at adding some further clarifications to it. Concrete suggestions from you or others are of course also appreciated :-) > Downthread you say that this tool is also useful for making base backups > independent of replication functionality. Sounds good. 
> But then the documentation says that the connection must be with a user that has the replication permission. Something is conceptually wrong here: why would I have to grant replication permission just to take a base backup for the purpose of making a backup? It uses the replication features for it. You also have to set max_wal_senders > 0, which is in the replication section of the postgresql.conf file. The point I wanted to make downthread was that it's useful without having a replication *slave*. But yes, you need the master. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 7:14 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> Oh, sorry about that. There is only one that contains postgresql though :P >> >> http://github.com/mhagander/postgres, branch streaming_base. > > Thanks! Though I haven't seen the core part of the patch (i.e., ReceiveTarFile, etc..) yet, here are my comments on the rest. + if (strcmp(argv[1], "-h") == 0 || strcmp(argv[1], "--help") == 0 || + strcmp(argv[1], "-?") == 0) strcmp(argv[1], "-h") should be removed + printf(_(" -p, --progress show progress information\n")); -p needs to be changed to -P + printf(_(" -D, --pgdata=directory receive base backup into directory\n")); + printf(_(" -T, --tardir=directory receive base backup into tar files\n" + " stored in specified directory\n")); + printf(_(" -Z, --compress=0-9 compress tar output\n")); + printf(_(" -l, --label=label set backup label\n")); + printf(_(" -p, --progress show progress information\n")); + printf(_(" -v, --verbose output verbose messages\n")); Can we list those options in alphabetical order as other tools do? Like the -f and -F options of pg_dump, it's more intuitive to provide one option for the output directory and another for the format. Something like -d directory --dir=directory -F format --format=format p plain t tar + case 'p': + if (atoi(optarg) == 0) + { + fprintf(stderr, _("%s: invalid port number \"%s\""), + progname, optarg); + exit(1); + } Shouldn't we emit an error message when the result of atoi is *less than* or equal to 0? \n should be in the tail of the error message. Is this error check really required here? IIRC libpq does. If it's required, atoi for compresslevel should also be error-checked. + case 'v': + verbose++; Why is verbose defined as an integer? + if (optind < argc) + { + fprintf(stderr, + _("%s: too many command-line arguments (first is \"%s\")\n"), + progname, argv[optind + 1]); You need to reference argv[optind] instead. 
What about using the PGDATA environment variable when no target directory is specified? + * Verify that the given directory exists and is empty. If it does not + * exist, it is created. If it exists but is not empty, an error will + * be give and the process ended. + */ +static void +verify_dir_is_empty_or_create(char *dirname) Since verify_dir_is_empty_or_create can be called after the connection has been established, it should call PQfinish before exit(1). + keywords[2] = "fallback_application_name"; + values[2] = "pg_basebackup"; Using the progname variable seems better than the fixed word "pg_basebackup". + if (dbgetpassword == 1) + { + /* Prompt for a password */ + password = simple_prompt(_("Password: "), 100, false); PQfinish should be called for the password retry case. + if (PQstatus(conn) != CONNECTION_OK) + { + fprintf(stderr, _("%s: could not connect to server: %s\n"), + progname, PQerrorMessage(conn)); + exit(1); + } PQfinish seems required before exit(1). + if (PQsendQuery(conn, current_path) == 0) + { + fprintf(stderr, _("%s: coult not start base backup: %s\n"), Typo: s/coult/could + /* + * Get the header + */ + res = PQgetResult(conn); After this, PQclear seems required before each exit(1) call. + if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) + { + fprintf(stderr, _("%s: final receive failed: %s\n"), + progname, PQerrorMessage(conn)); + exit(1); + } PQfinish seems required before exit(1). Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Jan 17, 2011 at 9:43 PM, Magnus Hagander <magnus@hagander.net> wrote: >> Probably yes, for consistency. I have been thinking for a while, >> however, that it would be very good if the tools also supported a >> conninfo string, so you don't have to invent a new option for every new >> connection option. psql already supports that. > > libpq has an option to expand a connection string if it's passed in > the database name, it seems. The problem is that this is done on the > dbname parameter - won't work in pg_dump for example, without special > treatment, since the db name is used in the client there. If conninfo is specified, you can just append "dbname=replication" to it and pass it to libpq as dbname. I don't think that supporting conninfo is difficult. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, 2011-01-17 at 13:43 +0100, Magnus Hagander wrote: >> Downthread you say that this tool is also useful for making base backups independent of replication functionality. Sounds good. But then the documentation says that the connection must be with a user that has the replication permission. Something is conceptually wrong here: why would I have to grant replication permission just to take a base backup for the purpose of making a backup? > It uses the replication features for it. You also have to set max_wal_senders > 0, which is in the replication section of the postgresql.conf file. > The point I wanted to make downthread was that it's useful without having a replication *slave*. But yes, you need the master. Suggest calling this utility pg_replication_backup and the other utility pg_replication_stream. It will be easier to explain the connection with replication. -- Simon Riggs http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services
On Mon, Jan 17, 2011 at 14:30, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 17, 2011 at 7:14 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >>> Oh, sorry about that. There is only one that contains postgresql though :P >>> >>> http://github.com/mhagander/postgres, branch streaming_base. >> >> Thanks! > > Though I haven't seen the core part of the patch (i.e., > ReceiveTarFile, etc..) yet, > here is the comments against others. > > > + if (strcmp(argv[1], "-h") == 0 || strcmp(argv[1], "--help") == 0 || > + strcmp(argv[1], "-?") == 0) > > strcmp(argv[1], "-h") should be removed Oh, crap. From the addition of -h for host. oopsie. > + printf(_(" -p, --progress show progress information\n")); > > -p needs to be changed to -P Indeed. > + printf(_(" -D, --pgdata=directory receive base backup into directory\n")); > + printf(_(" -T, --tardir=directory receive base backup into tar files\n" > + " stored in specified directory\n")); > + printf(_(" -Z, --compress=0-9 compress tar output\n")); > + printf(_(" -l, --label=label set backup label\n")); > + printf(_(" -p, --progress show progress information\n")); > + printf(_(" -v, --verbose output verbose messages\n")); > > Can we list those options in alphabetical order as other tools do? Certainly. But it makes more sense to have -D and -T next to each other, I think - they'd end up way elsewhere. Perhaps we need a group that says "target"? > Like -f and -F option of pg_dump, it's more intuitive to provide one option for > output directory and that for format. Something like > > -d directory > --dir=directory > > -F format > --format=format > > p > plain > > t > tar That's another option. It would certainly make for more consistency - probably a better idea. > + case 'p': > + if (atoi(optarg) == 0) > + { > + fprintf(stderr, _("%s: invalid port number \"%s\""), > + progname, optarg); > + exit(1); > + } > > Shouldn't we emit an error message when the result of atoi is *less than* or equal to 0? 
> \n should be in the tail of the error message. Is this error check really required here? IIRC libpq does. If it's required, atoi for compresslevel should also be error-checked. Yes on all. > + case 'v': > + verbose++; > > Why is the verbose defined as integer? I envisioned multiple levels of verbosity (which I had in pg_streamrecv), where multiple -v's would add things. > + if (optind < argc) > + { > + fprintf(stderr, > + _("%s: too many command-line arguments (first is \"%s\")\n"), > + progname, argv[optind + 1]); > > You need to reference to argv[optind] instead. Check. Copy/paste :o. > What about using PGDATA environment variable when no target directory is > specified? Hmm. I don't really like that. I prefer requiring it to be specified. > + * Verify that the given directory exists and is empty. If it does not > + * exist, it is created. If it exists but is not empty, an error will > + * be give and the process ended. > + */ > +static void > +verify_dir_is_empty_or_create(char *dirname) > > Since verify_dir_is_empty_or_create can be called after the connection has > been established, it should call PQfinish before exit(1). Yeah, that's a general thing - do we need to actually bother doing that? At most exit() places I haven't bothered freeing memory or closing the connection. It's not just the PQclear() that you refer to further down - it's also all the xstrdup()ed strings for example. Do we really need to care about those before we do exit(1)? I think not. Requiring PQfinish() might be more reasonable since it will give you a log on the server if you don't, but I'm not convinced that's necessary either? <snip similar requirements> > + keywords[2] = "fallback_application_name"; > + values[2] = "pg_basebackup"; > > Using the progname variable seems better rather than the fixed word > "pg_basebackup". 
I don't think so - that turns into argv[0], which means that if you use the full path of the program (/usr/local/pgsql/bin/pg_basebackup for example), the entire path goes into fallback_application_name - not just the program name. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 14:43, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 17, 2011 at 9:43 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> Probably yes, for consistency. I have been thinking for a while, >>> however, that it would be very good if the tools also supported a >>> conninfo string, so you don't have to invent a new option for every new >>> connection option. psql already supports that. >> >> libpq has an option to expand a connection string if it's passed in >> the database name, it seems. The problem is that this is done on the >> dbname parameter - won't work in pg_dump for example, without special >> treatment, since the db name is used in the client there. > > If conninfo is specified, you can just append the "dbname=replication" > into it and pass it to libpq as dbname. I don't think that supporting > conninfo is difficult. Yeah, it's easy enough for pg_basebackup. But not for example for pg_dump. (I meant for being able to use it more or less with zero modification to the current code - it can certainly be adapted to be able to deal with it) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 14:49, Simon Riggs <simon@2ndquadrant.com> wrote: > On Mon, 2011-01-17 at 13:43 +0100, Magnus Hagander wrote: >> Downthread you say that this tool is also useful for making base >> backups >> > independent of replication functionality. Sounds good. But then >> the >> > documentation says that the connection must be with a user that has >> the >> > replication permission. Something is conceptually wrong here: why >> would >> > I have to grant replication permission just to take a base backup >> for >> > the purpose of making a backup? >> >> It uses the replication features for it. You also have to set >> max_walsenders > 0, which is in the replication section of the >> postgresql.conf file. >> >> The point I wanted to make downthread was that it's useful without >> having a replication *slave*. But yes, you need the master. > > Suggest calling this utility pg_replication_backup > and the other utility pg_replication_stream > > It will be easier to explain the connection with replication. Hmm. I don't like those names at all :( But that's just me - and I don't totally hate them. So I'm opening the floor for other peoples votes :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, 2011-01-17 at 14:55 +0100, Magnus Hagander wrote: > >> > >> It uses the replication features for it. You also have to set > >> max_walsenders > 0, which is in the replication section of the > >> postgresql.conf file. > >> > >> The point I wanted to make downthread was that it's useful without > >> having a replication *slave*. But yes, you need the master. > > > > Suggest calling this utility pg_replication_backup > > and the other utility pg_replication_stream > > > > It will be easier to explain the connection with replication. > > Hmm. I don't like those names at all :( > > But that's just me - and I don't totally hate them. So I'm opening the > floor for other peoples votes :-) No problem. My point is that we should look for a name that illustrates the function more clearly. If Peter was confused, others will be also. -- Simon Riggs http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services
On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote: > Hmm. I don't like those names at all :( I agree. I don't think your original names are bad, as long as they're well-documented. I sympathize with Simon's desire to make it clear that these use the replication framework, but I really don't want the command names to be that long. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote: >> Hmm. I don't like those names at all :( > > I agree. I don't think your original names are bad, as long as > they're well-documented. I sympathize with Simon's desire to make it > clear that these use the replication framework, but I really don't > want the command names to be that long. Actually, after some IM chats, I think pg_streamrecv should be renamed, probably to pg_walstream (or pg_logstream, but pg_walstream is a lot more specific than that) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote: > On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote: > >> Hmm. I don't like those names at all :( > > > > I agree. I don't think your original names are bad, as long as > > they're well-documented. I sympathize with Simon's desire to make it > > clear that these use the replication framework, but I really don't > > want the command names to be that long. > > Actually, after some IM chats, I think pg_streamrecv should be > renamed, probably to pg_walstream (or pg_logstream, but pg_walstream > is a lot more specific than that) pg_stream_log pg_stream_backup ? -- Simon Riggs http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services
Magnus Hagander <magnus@hagander.net> writes: > The walsender can't read pg_class for example, so it can't generate > that mapping file. I don't see any way out here. So let's call <oid>.tar good enough for now… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Magnus Hagander <magnus@hagander.net> writes: > Actually, after some IM chats, I think pg_streamrecv should be > renamed, probably to pg_walstream (or pg_logstream, but pg_walstream > is a lot more specific than that) What I like about streamrecv is it's fairly clear which end of the connection it's supposed to be used on. I find "pg_basebackup" quite lacking from that perspective, and the same for the names above. Or are you proposing to merge the send and receive sides into one executable? regards, tom lane
On Mon, Jan 17, 2011 at 20:31, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> Actually, after some IM chats, I think pg_streamrecv should be >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream >> is a lot more specific than that) > > What I like about streamrecv is it's fairly clear which end of the > connection it's supposed to be used on. I find "pg_basebackup" > quite lacking from that perspective, and the same for the names > above. Or are you proposing to merge the send and receive sides > into one executable? No, the sending side is in walsender. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 10:50 PM, Magnus Hagander <magnus@hagander.net> wrote: >> + printf(_(" -D, --pgdata=directory receive base backup into directory\n")); >> + printf(_(" -T, --tardir=directory receive base backup into tar files\n" >> + " stored in specified directory\n")); >> + printf(_(" -Z, --compress=0-9 compress tar output\n")); >> + printf(_(" -l, --label=label set backup label\n")); >> + printf(_(" -p, --progress show progress information\n")); >> + printf(_(" -v, --verbose output verbose messages\n")); >> >> Can we list those options in alphabetical order as other tools do? > > Certainly. But it makes more sense to have -D and -T next to each > other, I think - they'd end up way elsewhere. Perhaps we need a group > taht says "target"? I agree with you if we end up choosing -D and -T for a target rather than pg_dump-like options I proposed. >> + * Verify that the given directory exists and is empty. If it does not >> + * exist, it is created. If it exists but is not empty, an error will >> + * be give and the process ended. >> + */ >> +static void >> +verify_dir_is_empty_or_create(char *dirname) >> >> Since verify_dir_is_empty_or_create can be called after the connection has >> been established, it should call PQfinish before exit(1). > > Yeah, that's a general thing - do we need to actually bother doing > that? At most exit() places I haven't bothered free:ing memory or > closing the connection. > > It's not just the PQclear() that you refer to further down - it's also > all the xstrdup()ed strings for example. Do we really need to care > about those before we do exit(1)? I think not. Probably true. The allocated memory would be free'd right after exit. > Requiring PQfinish() might be more reasonable since it will give you a > log on the server if you don't, but I'm not convinced that's necessary > either? At least it's required for each password-retry. Otherwise, previous connections remain during backup. 
You can see this problem easily by repeating the password input in pg_basebackup. LOG: could not send data to client: Connection reset by peer LOG: could not send data to client: Broken pipe FATAL: base backup could not send data, aborting backup As you said, if PQfinish is not called at exit(1), the above messages would be output. Those messages look ugly and should be suppressed whenever we *can*. Also I believe other tools would do that. >> + keywords[2] = "fallback_application_name"; >> + values[2] = "pg_basebackup"; >> >> Using the progname variable seems better rather than the fixed word >> "pg_basebackup". > > I don't think so - that turns into argv[0], which means that if you > use the full path of the program (/usr/local/pgsql/bin/pg_basebackup > for example), the entire path goes into fallback_application_name - > not just the program name. No. get_progname extracts the actual name. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Jan 17, 2011 at 10:30 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > Though I haven't seen the core part of the patch (i.e., > ReceiveTarFile, etc..) yet, > here is the comments against others. Here are more comments: When I untar the tar file taken by pg_basebackup, I got the following messages: $ tar xf base.tar tar: Skipping to next header tar: Archive contains obsolescent base-64 headers tar: Error exit delayed from previous errors Is this a bug? This happens only when I create $PGDATA by using initdb -X (i.e., I relocated the pg_xlog directory elsewhere than $PGDATA). + if (compresslevel > 0) + { + snprintf(fn, sizeof(fn), "%s/%s.tar.gz", tardir, PQgetvalue(res, rownum, 0)); + ztarfile = gzopen(fn, "wb"); Though I'm not familiar with zlib, isn't gzsetparams() required here? +#ifdef HAVE_LIBZ + if (!tarfile && !ztarfile) +#else + if (!tarfile) +#endif + { + fprintf(stderr, _("%s: could not create file \"%s\": %s\n"), + progname, fn, strerror(errno)); Instead of strerror, get_gz_error seems required when using zlib. + if (!res || PQresultStatus(res) != PGRES_COPY_OUT) The check for "!res" is not needed since PQresultStatus checks that. + r = PQgetCopyData(conn, &copybuf, 0); + if (r == -1) Since -1 of PQgetCopyData might indicate an error, in this case, we would need to call PQgetResult?. ReceiveTarFile seems refactorable by using GZWRITE and GZCLOSE macros. + fprintf(stderr, _("%s: could not write to file '%s': %m\n"), %m in fprintf is portable? Can't you change '%s' to \"%s\" for consistency? + /* + * Make sure we're unpacking into an empty directory + */ + verify_dir_is_empty_or_create(current_path); Can pg_basebackup take a backup of $PGDATA including a tablespace directory, without an error? The above code seems to prevent that.... + if (compresslevel <= 0) + { + fprintf(stderr, _("%s: invalid compression level \"%s\"\n"), It's better to check "compresslevel > 9" here. +/* + * Print a progress report based on the global variables. 
If verbose output + * is disabled, also print the current file name. Typo: s/disabled/enabled I request a new option which specifies whether pg_start_backup executes an immediate checkpoint or not. Currently it always executes an immediate one, but I'd like to run a smoothed one on a busy system. What's your opinion? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
2011/1/18 Fujii Masao <masao.fujii@gmail.com>: > On Mon, Jan 17, 2011 at 10:30 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> Though I haven't seen the core part of the patch (i.e., >> ReceiveTarFile, etc..) yet, >> here is the comments against others. > > Here are more comments: > > > When I untar the tar file taken by pg_basebackup, I got the following > messages: > > $ tar xf base.tar > tar: Skipping to next header > tar: Archive contains obsolescent base-64 headers > tar: Error exit delayed from previous errors > > Is this a bug? This happens only when I create $PGDATA by using > initdb -X (i.e., I relocated the pg_xlog directory elsewhere than > $PGDATA). > > > + if (compresslevel > 0) > + { > + snprintf(fn, sizeof(fn), "%s/%s.tar.gz", tardir, PQgetvalue(res, > rownum, 0)); > + ztarfile = gzopen(fn, "wb"); > > Though I'm not familiar with zlib, isn't gzsetparams() required > here? > > > +#ifdef HAVE_LIBZ > + if (!tarfile && !ztarfile) > +#else > + if (!tarfile) > +#endif > + { > + fprintf(stderr, _("%s: could not create file \"%s\": %s\n"), > + progname, fn, strerror(errno)); > > Instead of strerror, get_gz_error seems required when using zlib. > > > + if (!res || PQresultStatus(res) != PGRES_COPY_OUT) > > The check for "!res" is not needed since PQresultStatus checks that. > > > + r = PQgetCopyData(conn, &copybuf, 0); > + if (r == -1) > > Since -1 of PQgetCopyData might indicate an error, in this case, > we would need to call PQgetResult?. > > > ReceiveTarFile seems refactorable by using GZWRITE and GZCLOSE > macros. > > > + fprintf(stderr, _("%s: could not write to file '%s': %m\n"), > > %m in fprintf is portable? > > > Can't you change '%s' to \"%s\" for consistency? > > > + /* > + * Make sure we're unpacking into an empty directory > + */ > + verify_dir_is_empty_or_create(current_path); > > Can pg_basebackup take a backup of $PGDATA including a tablespace > directory, without an error? The above code seems to prevent that.... 
> > > + if (compresslevel <= 0) > + { > + fprintf(stderr, _("%s: invalid compression level \"%s\"\n"), > > It's better to check "compresslevel > 9" here. > > > +/* > + * Print a progress report based on the global variables. If verbose output > + * is disabled, also print the current file name. > > Typo: s/disabled/enabled > > > I request a new option which specifies whether pg_start_backup > executes an immediate checkpoint or not. Currently it always executes > an immediate one, but I'd like to run a smoothed one on a busy system. > What's your opinion? > *If* it is possible, this is welcome: the checkpoint hit due to pg_start_backup is visible even outside pg_basebackup (it syncs everything and then blasts the cache memory). -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
On Tue, Jan 18, 2011 at 03:14, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 17, 2011 at 10:50 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> + printf(_(" -D, --pgdata=directory receive base backup into directory\n")); >>> + printf(_(" -T, --tardir=directory receive base backup into tar files\n" >>> + " stored in specified directory\n")); >>> + printf(_(" -Z, --compress=0-9 compress tar output\n")); >>> + printf(_(" -l, --label=label set backup label\n")); >>> + printf(_(" -p, --progress show progress information\n")); >>> + printf(_(" -v, --verbose output verbose messages\n")); >>> >>> Can we list those options in alphabetical order as other tools do? >> >> Certainly. But it makes more sense to have -D and -T next to each >> other, I think - they'd end up way elsewhere. Perhaps we need a group >> that says "target"? > > I agree with you if we end up choosing -D and -T for a target rather > than pg_dump-like options I proposed. Yeah. What do others think between those two options? -D/-T followed by directory, or -D <dir> and -F<format>? >> Requiring PQfinish() might be more reasonable since it will give you a >> log on the server if you don't, but I'm not convinced that's necessary >> either? > > At least it's required for each password-retry. Otherwise, previous > connections remain during backup. You can see this problem easily Oh yeah, I've put that one in my git branch already. > by repeating the password input in pg_basebackup. > > LOG: could not send data to client: Connection reset by peer > LOG: could not send data to client: Broken pipe > FATAL: base backup could not send data, aborting backup > > As you said, if PQfinish is not called at exit(1), the above messages > would be output. Those messages look ugly and should be > suppressed whenever we *can*. Also I believe other tools would > do that. Yeah, agreed. I'll add that, shouldn't be too hard. 
>>> + keywords[2] = "fallback_application_name"; >>> + values[2] = "pg_basebackup"; >>> >>> Using the progname variable seems better rather than the fixed word >>> "pg_basebackup". >> >> I don't think so - that turns into argv[0], which means that if you >> use the full path of the program (/usr/local/pgsql/bin/pg_basebackup >> for example), the entire path goes into fallback_application_name - >> not just the program name. > > No. get_progname extracts the actual name. Hmm. I see it does. I wonder what I did to make that not work. Then I agree with the change :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Tue, Jan 18, 2011 at 10:49, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 17, 2011 at 10:30 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> Though I haven't seen the core part of the patch (i.e., >> ReceiveTarFile, etc..) yet, >> here is the comments against others. > > Here are more comments: Thanks! These are all good and useful comments! > When I untar the tar file taken by pg_basebackup, I got the following > messages: > > $ tar xf base.tar > tar: Skipping to next header > tar: Archive contains obsolescent base-64 headers > tar: Error exit delayed from previous errors > > Is this a bug? This happens only when I create $PGDATA by using > initdb -X (i.e., I relocated the pg_xlog directory elsewhere than > $PGDATA). Interesting. What version of tar and what platform? I can't reproduce that here... It certainly is a bug, that should not happen. > + if (compresslevel > 0) > + { > + snprintf(fn, sizeof(fn), "%s/%s.tar.gz", tardir, PQgetvalue(res, > rownum, 0)); > + ztarfile = gzopen(fn, "wb"); > > Though I'm not familiar with zlib, isn't gzsetparams() required > here? Uh. It certainly is! I clearly forgot it there... > +#ifdef HAVE_LIBZ > + if (!tarfile && !ztarfile) > +#else > + if (!tarfile) > +#endif > + { > + fprintf(stderr, _("%s: could not create file \"%s\": %s\n"), > + progname, fn, strerror(errno)); > > Instead of strerror, get_gz_error seems required when using zlib. Indeed it is. I think it needs to be this:

#ifdef HAVE_LIBZ
	if (compresslevel > 0 && !ztarfile)
	{
		/* Compression is in use */
		fprintf(stderr, _("%s: could not create compressed file \"%s\": %s\n"),
				progname, fn, get_gz_error(ztarfile));
		exit(1);
	}
	else
#endif
	{
		/* Either no zlib support, or zlib support but compresslevel = 0 */
		if (!tarfile)
		{
			fprintf(stderr, _("%s: could not create file \"%s\": %s\n"),
					progname, fn, strerror(errno));
			exit(1);
		}
	}

> + if (!res || PQresultStatus(res) != PGRES_COPY_OUT) > > The check for "!res" is not needed since PQresultStatus checks that. Hah. 
I still keep doing that from old habit. I know you've pointed that out before, with libpqwalreceiver :-) > + r = PQgetCopyData(conn, &copybuf, 0); > + if (r == -1) > > Since -1 of PQgetCopyData might indicate an error, in this case, > we would need to call PQgetResult?. Uh, -1 means end of data, no? -2 means error? > ReceiveTarFile seems refactorable by using GZWRITE and GZCLOSE > macros. You mean the ones from pg_dump? I don't think so. We can't use gzwrite() with compression level 0 on the tar output, because it will still write a gz header. With pg_dump, that is ok because it's our format, but with a .tar (without .gz) I don't think it is. At least that's how I interpreted the function. > + fprintf(stderr, _("%s: could not write to file '%s': %m\n"), > > %m in fprintf is portable? Hmm. I just assumed it was because we use it elsewhere, but I now see we only really use it for ereport() stuff. Bottom line is, I don't know - perhaps it needs to be changed to use strerror()? > Can't you change '%s' to \"%s\" for consistency? Yeah, absolutely. Clearly I was way inconsistent, and it got worse with some copy/paste :-( > + /* > + * Make sure we're unpacking into an empty directory > + */ > + verify_dir_is_empty_or_create(current_path); > > Can pg_basebackup take a backup of $PGDATA including a tablespace > directory, without an error? The above code seems to prevent that.... Uh, how do you mean it would prevent that? It requires that the directory you're writing the tablespace to is empty or nonexistent, but that shouldn't prevent a backup, no? It will prevent you from overwriting things with your backup, but that's intentional - if you don't need the old dir, just remove it. > + if (compresslevel <= 0) > + { > + fprintf(stderr, _("%s: invalid compression level \"%s\"\n"), > > It's better to check "compresslevel > 9" here. Agreed (well, check for both of them of course) > +/* > + * Print a progress report based on the global variables. 
If verbose output > + * is disabled, also print the current file name. > > Typo: s/disabled/enabled Indeed. > I request a new option which specifies whether pg_start_backup > executes an immediate checkpoint or not. Currently it always executes > an immediate one, but I'd like to run a smoothed one on a busy system. > What's your opinion? Yeah, that sounds like a good idea. Shouldn't be too hard to do (will require a backend patch as well, of course). Should we use "-f" for fast? Though that may be an unfortunate overload of the usual use case for -f, so maybe -c <fast|slow> for "checkpoint fast/slow"? I've updated my git branch with the simple fixes, will get the bigger ones in there as soon as I've done them. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 16:27, Simon Riggs <simon@2ndquadrant.com> wrote: > On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote: >> On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote: >> > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote: >> >> Hmm. I don't like those names at all :( >> > >> > I agree. I don't think your original names are bad, as long as >> > they're well-documented. I sympathize with Simon's desire to make it >> > clear that these use the replication framework, but I really don't >> > want the command names to be that long. >> >> Actually, after some IM chats, I think pg_streamrecv should be >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream >> is a lot more specific than that) > > pg_stream_log > pg_stream_backup Those seem better. Tom, would those solve your concerns about it being clear which side they are on? Or do you think you'd still risk reading them as the sending side? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Tue, Jan 18, 2011 at 12:40, Magnus Hagander <magnus@hagander.net> wrote: > On Tue, Jan 18, 2011 at 10:49, Fujii Masao <masao.fujii@gmail.com> wrote: > Yeah that sounds like a good idea. Shouldn't be too hard to do (will > reuqire a backend patch as well, of course). Should we use "-f" for > fast? Though that may be an unfortunate overload of the usual usecase > for -f, so maybe -c <fast|slow> for "checkpoint fast/slow"? Was easy, done with "-c <fast|slow>". -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Excerpts from Magnus Hagander's message of mar ene 18 08:40:50 -0300 2011: > On Tue, Jan 18, 2011 at 10:49, Fujii Masao <masao.fujii@gmail.com> wrote: > > + fprintf(stderr, _("%s: could not write to file '%s': %m\n"), > > > > %m in fprintf is portable? > > Hmm. I just assumed it was because we use it elsewhere, but I now see > we only really use it for ereport() stuff. Bottom line is, I don't > know - perhaps it needs to be changed to use strerror()? Some libc's (such as glibc) know about %m, others presumably don't (it's a GNU extension, according to my manpage). ereport does the expansion by itself, see expand_fmt_string(). Probably just using strerror() is the easiest. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Jan 18, 2011 at 14:26, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Magnus Hagander's message of mar ene 18 08:40:50 -0300 2011: >> On Tue, Jan 18, 2011 at 10:49, Fujii Masao <masao.fujii@gmail.com> wrote: > >> > + fprintf(stderr, _("%s: could not write to file '%s': %m\n"), >> > >> > %m in fprintf is portable? >> >> Hmm. I just assumed it was because we use it elsewhere, but I now see >> we only really use it for ereport() stuff. Bottom line is, I don't >> know - perhaps it needs to be changed to use strerror()? > > Some libc's (such as glibc) know about %m, others presumably don't (it's > a GNU extension, according to my manpage). ereport does the expansion > by itself, see expand_fmt_string(). Probably just using strerror() is > the easiest. Ok, thanks for clarifying. I've updated to use strerror(). Guess it's time for another patch, PFA :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Excerpts from Magnus Hagander's message of mar ene 18 10:47:03 -0300 2011: > Ok, thanks for clarifying. I've updated to use strerror(). Guess it's > time for another patch, PFA :-) Thanks ... Message nitpick: + if (compresslevel > 0) + { + fprintf(stderr, + _("%s: this build does not support compression\n"), + progname); + exit(1); + } pg_dump uses the following wording: "WARNING: archive is compressed, but this installation does not support " "compression -- no data will be available\n" So perhaps yours should s/build/installation/ Also, in messages of this kind, + if (gzsetparams(ztarfile, compresslevel, Z_DEFAULT_STRATEGY) != Z_OK) + { + fprintf(stderr, _("%s: could not set compression level %i\n"), + progname, compresslevel); Shouldn't you also be emitting the gzerror()? ... oh I see you're already doing it for most gz calls. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Jan 18, 2011 at 15:49, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Magnus Hagander's message of mar ene 18 10:47:03 -0300 2011: > >> Ok, thanks for clarifying. I've updated to use strerror(). Guess it's >> time for another patch, PFA :-) > > Thanks ... Message nitpick: > + if (compresslevel > 0) > + { > + fprintf(stderr, > + _("%s: this build does not support compression\n"), > + progname); > + exit(1); > + } > > pg_dump uses the following wording: > > "WARNING: archive is compressed, but this installation does not support " > "compression -- no data will be available\n" > > So perhaps yours should s/build/installation/ That shows up when a the *archive* is compressed, though? There are a number of other cases that use build in the backend, such as: src/backend/utils/misc/guc.c: errmsg("assertion checking is not supported by this build"))); src/backend/utils/misc/guc.c: errmsg("Bonjour is not supported by this build"))); src/backend/utils/misc/guc.c: errmsg("SSL is not supported by this build"))); > Also, in messages of this kind, > + if (gzsetparams(ztarfile, compresslevel, Z_DEFAULT_STRATEGY) != Z_OK) > + { > + fprintf(stderr, _("%s: could not set compression level %i\n"), > + progname, compresslevel); > > Shouldn't you also be emitting the gzerror()? ... oh I see you're > already doing it for most gz calls. It's not clear from the zlib documentation I have that gzerror() works after a gzsetparams(). Do you have any docs that say differently by any chance? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: >>> Actually, after some IM chats, I think pg_streamrecv should be >>> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream >>> is a lot more specific than that) >> pg_stream_log >> pg_stream_backup > Those seem better. > Tom, would those solve your concerns about it being clear which side > they are on? Or do you think you'd still risk reading them as the > sending side? It's still totally unclear what they do. How about "pg_receive_log" etc? regards, tom lane
On Tue, Jan 18, 2011 at 17:31, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >>>> Actually, after some IM chats, I think pg_streamrecv should be >>>> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream >>>> is a lot more specific than that) > >>> pg_stream_log >>> pg_stream_backup > >> Those seem better. > >> Tom, would those solve your concerns about it being clear which side >> they are on? Or do you think you'd still risk reading them as the >> sending side? > > It's still totally unclear what they do. How about "pg_receive_log" > etc? I agree with whoever said using "wal" is better than "log" to be unambiguous. So it'd be pg_receive_wal and pg_receive_base_backup then? Votes from others? (it's easy to rename so far, so I'll keep plugging away under the name pg_basebackup based on Fujii-san's comments until such a time as we have a reasonable consensus :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
2011/1/18 Magnus Hagander <magnus@hagander.net>: > On Tue, Jan 18, 2011 at 17:31, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Magnus Hagander <magnus@hagander.net> writes: >>>>> Actually, after some IM chats, I think pg_streamrecv should be >>>>> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream >>>>> is a lot more specific than that) >> >>>> pg_stream_log >>>> pg_stream_backup >> >>> Those seem better. >> >>> Tom, would those solve your concerns about it being clear which side >>> they are on? Or do you think you'd still risk reading them as the >>> sending side? >> >> It's still totally unclear what they do. How about "pg_receive_log" >> etc? > > I agree with whoever said using "wal" is better than "log" to be unambiguous. > > So it'd be pg_receive_wal and pg_receive_base_backup then? Votes from > others? (it's easy to rename so far, so I'll keep plugging away under > the name pg_basebackup based on Fujii-san's comments until such a time > as we have a reasonable consensus :-) pg_receive_wal is good for me. As for pg_receive_base_backup: in French, "base" is a shortcut for database, but here we back up the whole cluster, so I would suggest pg_receive_cluster(_backup?) > > -- > Magnus Hagander > Me: http://www.hagander.net/ > Work: http://www.redpill-linpro.com/ -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
On Tue, Jan 18, 2011 at 12:03 PM, Magnus Hagander <magnus@hagander.net> wrote: > So it'd be pg_receive_wal and pg_receive_base_backup then? Votes from > others? (it's easy to rename so far, so I'll keep plugging away under > the name pg_basebackup based on Fujii-sans comments until such a time > as we have a reasonable consensus :-) I like pg_receive_wal. pg_receive_base_backup I would be inclined to shorten to pg_basebackup or pg_streambackup, but I just work here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Magnus Hagander's message of mar ene 18 11:53:55 -0300 2011:
> On Tue, Jan 18, 2011 at 15:49, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
> > Excerpts from Magnus Hagander's message of mar ene 18 10:47:03 -0300 2011:
> >
> >> Ok, thanks for clarifying. I've updated to use strerror(). Guess it's
> >> time for another patch, PFA :-)
> >
> > Thanks ... Message nitpick:
> > + if (compresslevel > 0)
> > + {
> > +     fprintf(stderr,
> > +             _("%s: this build does not support compression\n"),
> > +             progname);
> > +     exit(1);
> > + }
> >
> > pg_dump uses the following wording:
> >
> > "WARNING: archive is compressed, but this installation does not support "
> > "compression -- no data will be available\n"
> >
> > So perhaps yours should s/build/installation/
>
> That shows up when the *archive* is compressed, though? There are a
> number of other cases that use build in the backend, such as:
> src/backend/utils/misc/guc.c: errmsg("assertion checking is not supported by this build")));
> src/backend/utils/misc/guc.c: errmsg("Bonjour is not supported by this build")));
> src/backend/utils/misc/guc.c: errmsg("SSL is not supported by this build")));

Hmm. I think I'd s/build/installation/ on all those messages for consistency.

> > Also, in messages of this kind,
> > + if (gzsetparams(ztarfile, compresslevel, Z_DEFAULT_STRATEGY) != Z_OK)
> > + {
> > +     fprintf(stderr, _("%s: could not set compression level %i\n"),
> > +             progname, compresslevel);
> >
> > Shouldn't you also be emitting the gzerror()? ... oh I see you're
> > already doing it for most gz calls.
>
> It's not clear from the zlib documentation I have that gzerror() works
> after a gzsetparams(). Do you have any docs that say differently by
> any chance?

Ah, no. I was reading zlib.h, which is ambiguous as you say:

ZEXTERN int ZEXPORT gzsetparams OF((gzFile file, int level, int strategy));
/*
    Dynamically update the compression level or strategy.
    See the description of deflateInit2 for the meaning of these parameters.
    gzsetparams returns Z_OK if success, or Z_STREAM_ERROR if the file was not
    opened for writing.
*/

ZEXTERN const char * ZEXPORT gzerror OF((gzFile file, int *errnum));
/*
    Returns the error message for the last error which occurred on the
    given compressed file. errnum is set to zlib error number. If an
    error occurred in the file system and not in the compression library,
    errnum is set to Z_ERRNO and the application may consult errno
    to get the exact error code.

...

but a quick look at the code says that it sets gz_stream->z_err
which is what gzerror returns:

int ZEXPORT gzsetparams (file, level, strategy)
    gzFile file;
    int level;
    int strategy;
{
    gz_stream *s = (gz_stream*)file;

    if (s == NULL || s->mode != 'w') return Z_STREAM_ERROR;

    /* Make room to allow flushing */
    if (s->stream.avail_out == 0) {

        s->stream.next_out = s->outbuf;
        if (fwrite(s->outbuf, 1, Z_BUFSIZE, s->file) != Z_BUFSIZE) {
            s->z_err = Z_ERRNO;
        }
        s->stream.avail_out = Z_BUFSIZE;
    }

    return deflateParams (&(s->stream), level, strategy);
}

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, 2011-01-18 at 18:03 +0100, Magnus Hagander wrote: > So it'd be pg_receive_wal and pg_receive_base_backup then? OK for me. Maybe even pg_receive_wal_stream Don't see any reason why command names can't be long. We have many function names already that long. -- Simon Riggs http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services
On Tue, Jan 18, 2011 at 19:20, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Magnus Hagander's message of mar ene 18 11:53:55 -0300 2011: >> On Tue, Jan 18, 2011 at 15:49, Alvaro Herrera >> <alvherre@commandprompt.com> wrote: >> > Also, in messages of this kind, >> > + if (gzsetparams(ztarfile, compresslevel, Z_DEFAULT_STRATEGY) != Z_OK) >> > + { >> > + fprintf(stderr, _("%s: could not set compression level %i\n"), >> > + progname, compresslevel); >> > >> > Shouldn't you also be emitting the gzerror()? ... oh I see you're >> > already doing it for most gz calls. >> >> It's not clear from the zlib documentation I have that gzerror() works >> after a gzsetparams(). Do you have any docs that say differently by >> any chance? > > Ah, no. I was reading zlib.h, which is ambiguous as you say: > > ZEXTERN int ZEXPORT gzsetparams OF((gzFile file, int level, int strategy)); > /* > Dynamically update the compression level or strategy. See the description > of deflateInit2 for the meaning of these parameters. > gzsetparams returns Z_OK if success, or Z_STREAM_ERROR if the file was not > opened for writing. > */ > > ZEXTERN const char * ZEXPORT gzerror OF((gzFile file, int *errnum)); > /* > Returns the error message for the last error which occurred on the > given compressed file. errnum is set to zlib error number. If an > error occurred in the file system and not in the compression library, > errnum is set to Z_ERRNO and the application may consult errno > to get the exact error code. > > > ... 
> but a quick look at the code says that it sets gz_stream->z_err
> which is what gzerror returns:
>
> int ZEXPORT gzsetparams (file, level, strategy)
>     gzFile file;
>     int level;
>     int strategy;
> {
>     gz_stream *s = (gz_stream*)file;
>
>     if (s == NULL || s->mode != 'w') return Z_STREAM_ERROR;
>
>     /* Make room to allow flushing */
>     if (s->stream.avail_out == 0) {
>
>         s->stream.next_out = s->outbuf;
>         if (fwrite(s->outbuf, 1, Z_BUFSIZE, s->file) != Z_BUFSIZE) {
>             s->z_err = Z_ERRNO;
>         }
>         s->stream.avail_out = Z_BUFSIZE;
>     }
>
>     return deflateParams (&(s->stream), level, strategy);
> }

Ah, ok. I've added the error code now, PFA. I also fixed an error in the change for result codes I broke in the last patch. github branch updated as usual.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Tue, Jan 18, 2011 at 8:40 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> When I untar the tar file taken by pg_basebackup, I got the following
>> messages:
>>
>> $ tar xf base.tar
>> tar: Skipping to next header
>> tar: Archive contains obsolescent base-64 headers
>> tar: Error exit delayed from previous errors
>>
>> Is this a bug? This happens only when I create $PGDATA by using
>> initdb -X (i.e., I relocated the pg_xlog directory elsewhere than
>> $PGDATA).
>
> Interesting. What version of tar and what platform? I can't reproduce
> that here...

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
$ uname -a
Linux hermes 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
$ tar --version
tar (GNU tar) 1.15.1

>> + r = PQgetCopyData(conn, &copybuf, 0);
>> + if (r == -1)
>>
>> Since -1 of PQgetCopyData might indicate an error, in this case,
>> we would need to call PQgetResult?
>
> Uh, -1 means end of data, no? -2 means error?

The comment in pqGetCopyData3 says

/*
 * On end-of-copy, exit COPY_OUT or COPY_BOTH mode and let caller
 * read status with PQgetResult(). The normal case is that it's
 * Copy Done, but we let parseInput read that. If error, we expect
 * the state was already changed.
 */

Also the comment in getCopyDataMessage says

/*
 * If it's a legitimate async message type, process it. (NOTIFY
 * messages are not currently possible here, but we handle them for
 * completeness.) Otherwise, if it's anything except Copy Data,
 * report end-of-copy.
 */

So I thought that. BTW, walreceiver has already done that.

>> ReceiveTarFile seems refactorable by using GZWRITE and GZCLOSE
>> macros.
>
> You mean the ones from pg_dump? I don't think so. We can't use
> gzwrite() with compression level 0 on the tar output, because it will
> still write a gz header. With pg_dump, that is ok because it's our
> format, but with a .tar (without .gz) I don't think it is.

Right. I withdraw the comment.
>> + /*
>> + * Make sure we're unpacking into an empty directory
>> + */
>> + verify_dir_is_empty_or_create(current_path);
>>
>> Can pg_basebackup take a backup of $PGDATA including a tablespace
>> directory, without an error? The above code seems to prevent that....
>
> Uh, how do you mean it would prevent that? It requires that the
> directory you're writing the tablespace to is empty or nonexistent,
> but that shouldn't prevent a backup, no? It will prevent you from
> overwriting things with your backup, but that's intentional - if you
> don't need the old dir, just remove it.

What I'm worried about is the case where a tablespace is created under the $PGDATA directory. In this case, ISTM that pg_basebackup takes the backup of $PGDATA including the tablespace directory first, and then takes the backup of the tablespace directory again. But, since the tablespace directory is then no longer empty, the backup of the tablespace seems to fail.

> Was easy, done with "-c <fast|slow>".

Thanks a lot!

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Wed, Jan 19, 2011 at 4:12 AM, Magnus Hagander <magnus@hagander.net> wrote:
> Ah, ok. I've added the errorcode now, PFA. I also fixed an error in
> the change for result codes I broke in the last patch. github branch
> updated as usual.

Great. Thanks for the quick update!

Here are some more comments:

+ * IDENTIFICATION
+ * src/bin/pg_basebackup.c

Typo: s/"src/bin/pg_basebackup.c"/"src/bin/pg_basebackup/pg_basebackup.c"

+ printf(_(" -c, --checkpoint=fast|slow\n"
+ "         set fast or slow checkpoinging\n"));

Typo: s/checkpoinging/checkpointing

The "fast or slow" seems to lead users to always choose "fast". Instead, what about "fast or smooth", "fast or spread" or "immediate or delayed"?

You seem to have forgotten to support the "--checkpoint" long option. The struct long_options needs to be updated.

What if pg_basebackup receives a signal while doing a backup? For example, users might do Ctrl-C to cancel the long-running backup. Should we define a signal handler and send a Terminate message to the server to cancel the backup?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Wed, Jan 19, 2011 at 06:14, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Jan 19, 2011 at 4:12 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> Ah, ok. I've added the errorcode now, PFA. I also fixed an error in
>> the change for result codes I broke in the last patch. github branch
>> updated as usual.
>
> Great. Thanks for the quick update!
>
> Here are another comments:
>
> + * IDENTIFICATION
> + * src/bin/pg_basebackup.c
>
> Typo: s/"src/bin/pg_basebackup.c"/"src/bin/pg_basebackup/pg_basebackup.c"

Oops.

> + printf(_(" -c, --checkpoint=fast|slow\n"
> + "         set fast or slow checkpoinging\n"));
>
> Typo: s/checkpoinging/checkpointing
>
> The "fast or slow" seems to lead users to always choose "fast". Instead,
> what about "fast or smooth", "fast or spread" or "immediate or delayed"?

Hmm. "fast or spread" seems reasonable to me. And I want to use "fast" for the fast version, because that's what we call it when you use pg_start_backup(). I'll go change it to spread for now - it's the one I can find used in the docs.

> You seem to have forgotten to support "--checkpoint" long option.
> The struct long_options needs to be updated.

Wow, that clearly went too fast. Fixed as well.

> What if pg_basebackup receives a signal while doing a backup?
> For example, users might do Ctrl-C to cancel the long-running backup.
> We should define a signal handler and send a Terminate message
> to the server to cancel the backup?

Nah, we'll just disconnect and it'll deal with things that way. Just like we do with e.g. pg_dump. I don't see the need to complicate it with that.

(new patch on github in 5 minutes)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Fujii Masao <masao.fujii@gmail.com> writes: > What I'm worried about is the case where a tablespace is created > under the $PGDATA directory. What would be the sense of that? If you're concerned about whether the code handles it correctly, maybe the right solution is to add code to CREATE TABLESPACE to disallow it. regards, tom lane
On Wed, Jan 19, 2011 at 9:37 PM, Magnus Hagander <magnus@hagander.net> wrote: >> The "fast or slow" seems to lead users to always choose "fast". Instead, >> what about "fast or smooth", "fast or spread" or "immediate or delayed"? > > Hmm. "fast or spread" seems reasonable to me. And I want to use "fast" > for the fast version, because that's what we call it when you use > pg_start_backup(). I'll go change it to spread for now - it's the one > I can find used in the docs. Fair enough. >> What if pg_basebackup receives a signal while doing a backup? >> For example, users might do Ctrl-C to cancel the long-running backup. >> We should define a signal handler and send a Terminate message >> to the server to cancel the backup? > > Nah, we'll just disconnect and it'll deal with things that way. Just > like we do with e.g. pg_dump. I don't see the need to complicate it > with that. Okay. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Jan 20, 2011 at 2:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Fujii Masao <masao.fujii@gmail.com> writes:
>> What I'm worried about is the case where a tablespace is created
>> under the $PGDATA directory.
>
> What would be the sense of that? If you're concerned about whether the
> code handles it correctly, maybe the right solution is to add code to
> CREATE TABLESPACE to disallow it.

I'm not sure why that's the right solution. Why do you think that we should not create the tablespace under the $PGDATA directory? I'm not surprised that people mount the filesystem on $PGDATA/mnt and create the tablespace on it.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Fujii Masao <masao.fujii@gmail.com> writes:
> On Thu, Jan 20, 2011 at 2:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Fujii Masao <masao.fujii@gmail.com> writes:
>>> What I'm worried about is the case where a tablespace is created
>>> under the $PGDATA directory.

>> What would be the sense of that? If you're concerned about whether the
>> code handles it correctly, maybe the right solution is to add code to
>> CREATE TABLESPACE to disallow it.

> I'm not sure why that's the right solution. Why do you think that we should
> not create the tablespace under the $PGDATA directory? I'm not surprised
> that people mounts the filesystem on $PGDATA/mnt and creates the
> tablespace on it.

No? Usually, having a mount point in a non-root-owned directory is considered a Bad Thing.

regards, tom lane
On Thu, Jan 20, 2011 at 10:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm not sure why that's the right solution. Why do you think that we should >> not create the tablespace under the $PGDATA directory? I'm not surprised >> that people mounts the filesystem on $PGDATA/mnt and creates the >> tablespace on it. > > No? Usually, having a mount point in a non-root-owned directory is > considered a Bad Thing. Hmm.. but ISTM we can have a root-owned mount point in $PGDATA and create a tablespace there. $ su - # mkdir $PGDATA/mnt # mount -t tmpfs tmpfs $PGDATA/mnt # exit $ mkdir $PGDATA/mnt/tblspcdir $ psql =# CREATE TABLESPACE tblspc LOCATION '$PGDATA/mnt/tblspcdir'; CREATE TABLESPACE Am I missing something? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Wed, Jan 19, 2011 at 9:37 PM, Magnus Hagander <magnus@hagander.net> wrote: >> Great. Thanks for the quick update! >> >> Here are another comments: Here are comments against the documents. The other code looks good. It's helpful to document what to set to allow pg_basebackup connection. That is not only the REPLICATION privilege but also max_wal_senders and pg_hba.conf. + <refsect1> + <title>Options</title> Can we list the descriptions of option in the same order as "pg_basebackup --help" does? It's helpful to document that the target directory must be specified and it must be empty. + <para> + The backup will include all files in the data directory and tablespaces, + including the configuration files and any additional files placed in the + directory by third parties. Only regular files and directories are allowed + in the data directory, no symbolic links or special device files. The latter sentence means that the backup of the database cluster created by initdb -X is not supported? Because the symlink to the actual WAL directory is included in it. OTOH, I found the following source code comments: + * Receive a tar format stream from the connection to the server, and unpack + * the contents of it into a directory. Only files, directories and + * symlinks are supported, no other kinds of special files. This says that symlinks are supported. Which is true? Is the symlink supported only in tar format? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Jan 20, 2011 at 05:23, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Jan 19, 2011 at 9:37 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>> Great. Thanks for the quick update!
>>>
>>> Here are another comments:
>
> Here are comments against the documents. The other code looks good.

Thanks!

> It's helpful to document what to set to allow pg_basebackup connection.
> That is not only the REPLICATION privilege but also max_wal_senders and
> pg_hba.conf.

Hmm. Yeah, I guess that wouldn't hurt. Will add that.

> + <refsect1>
> + <title>Options</title>
>
> Can we list the descriptions of option in the same order as
> "pg_basebackup --help" does?
>
> It's helpful to document that the target directory must be specified and
> it must be empty.

Yeah, that's on the list - I just wanted to make any other changes first before I did that. Based on (no further) feedback and a few extra questions, I'm going to change it per your suggestion to use -D <dir> -F <format>, instead of -D/-T, which will change that stuff anyway. So I'll reorder them at that time.

> + <para>
> + The backup will include all files in the data directory and tablespaces,
> + including the configuration files and any additional files placed in the
> + directory by third parties. Only regular files and directories are allowed
> + in the data directory, no symbolic links or special device files.
>
> The latter sentence means that the backup of the database cluster
> created by initdb -X is not supported? Because the symlink to the
> actual WAL directory is included in it.

No, it's not. pg_xlog is specifically excluded, and sent as an empty directory, so upon restore you will have an empty pg_xlog directory.

> OTOH, I found the following source code comments:
>
> + * Receive a tar format stream from the connection to the server, and unpack
> + * the contents of it into a directory.
> > This says that symlinks are supported. Which is true? Is the symlink > supported only in tar format? That's actually a *backend* side restriction. If there is a symlink anywhere other than pg_tblspc in the data directory, we simply won't send it across (with a warning). The frontend code supports creating symlinks, both in directory format and in tar format (actually, in tar format it doesn't do anything, of course, it just lets it through) It wouldn't actually be hard to allow the inclusion of symlinks in the backend side. But it would make verification a lot harder - for example, if someone symlinked out pg_clog (as an example), we'd back up the symlink but not the actual files since they're not actually registered as a tablespace. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Thu, Jan 20, 2011 at 12:42, Magnus Hagander <magnus@hagander.net> wrote: > On Thu, Jan 20, 2011 at 05:23, Fujii Masao <masao.fujii@gmail.com> wrote: >> It's helpful to document what to set to allow pg_basebackup connection. >> That is not only the REPLICATION privilege but also max_wal_senders and >> pg_hba.conf. > > Hmm. Yeha, i guess that wouldn't hurt. Will add that. Added, see github branch. >> + <refsect1> >> + <title>Options</title> >> >> Can we list the descriptions of option in the same order as >> "pg_basebackup --help" does? >> >> It's helpful to document that the target directory must be specified and >> it must be empty. > > Yeah, that's on the list - I just wanted to make any other changes > first before I did that. I based on (no further) feedback and a few > extra questions, I'm going to change it per your suggestion to use -D > <dir> -F <format>, instead of -D/-T, which will change that stuff > anyway. So I'll reorder them at that time. Updated on github. >> + <para> >> + The backup will include all files in the data directory and tablespaces, >> + including the configuration files and any additional files placed in the >> + directory by third parties. Only regular files and directories are allowed >> + in the data directory, no symbolic links or special device files. >> >> The latter sentence means that the backup of the database cluster >> created by initdb -X is not supported? Because the symlink to the >> actual WAL directory is included in it. > > No, it's not. pg_xlog is specifically excluded, and sent as an empty > directory, so upon restore you will have an empty pg_xlog directory. Actually, when I verified that statement, I found a bug where we sent the wrong thing if pg_xlog was a symlink, leading to a corrupt tarfile! Patch is in the github branch. >> OTOH, I found the following source code comments: >> >> + * Receive a tar format stream from the connection to the server, and unpack >> + * the contents of it into a directory. 
Only files, directories and >> + * symlinks are supported, no other kinds of special files. >> >> This says that symlinks are supported. Which is true? Is the symlink >> supported only in tar format? > > That's actually a *backend* side restriction. If there is a symlink > anywhere other than pg_tblspc in the data directory, we simply won't > send it across (with a warning). > > The frontend code supports creating symlinks, both in directory format > and in tar format (actually, in tar format it doesn't do anything, of > course, it just lets it through) > > It wouldn't actually be hard to allow the inclusion of symlinks in the > backend side. But it would make verification a lot harder - for > example, if someone symlinked out pg_clog (as an example), we'd back > up the symlink but not the actual files since they're not actually > registered as a tablespace. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander wrote:
> On Mon, Jan 17, 2011 at 16:27, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote:
> >> On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote:
> >> > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote:
> >> >> Hmm. I don't like those names at all :(
> >> >
> >> > I agree. I don't think your original names are bad, as long as
> >> > they're well-documented. I sympathize with Simon's desire to make it
> >> > clear that these use the replication framework, but I really don't
> >> > want the command names to be that long.
> >>
> >> Actually, after some IM chats, I think pg_streamrecv should be
> >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream
> >> is a lot more specific than that)
> >
> > pg_stream_log
> > pg_stream_backup
>
> Those seem better.
>
> Tom, would those solve your concerns about it being clear which side
> they are on? Or do you think you'd still risk reading them as the
> sending side?

It seems pg_create_backup would be the most natural because we already have pg_start_backup and pg_stop_backup.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Thu, Jan 20, 2011 at 10:01 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Magnus Hagander wrote:
>> On Mon, Jan 17, 2011 at 16:27, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote:
>> >> On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote:
>> >> > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> >> >> Hmm. I don't like those names at all :(
>> >> >
>> >> > I agree. I don't think your original names are bad, as long as
>> >> > they're well-documented. I sympathize with Simon's desire to make it
>> >> > clear that these use the replication framework, but I really don't
>> >> > want the command names to be that long.
>> >>
>> >> Actually, after some IM chats, I think pg_streamrecv should be
>> >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream
>> >> is a lot more specific than that)
>> >
>> > pg_stream_log
>> > pg_stream_backup
>>
>> Those seem better.
>>
>> Tom, would those solve your concerns about it being clear which side
>> they are on? Or do you think you'd still risk reading them as the
>> sending side?
>
> It seems pg_create_backup would be the most natural because we already
> have pg_start_backup and pg_stop_backup.

Uh, wow. That's really mixing apples and oranges.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:
> On Thu, Jan 20, 2011 at 10:01 AM, Bruce Momjian <bruce@momjian.us> wrote:
> > Magnus Hagander wrote:
> >> On Mon, Jan 17, 2011 at 16:27, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> > On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote:
> >> >> On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote:
> >> >> > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote:
> >> >> >> Hmm. I don't like those names at all :(
> >> >> >
> >> >> > I agree. I don't think your original names are bad, as long as
> >> >> > they're well-documented. I sympathize with Simon's desire to make it
> >> >> > clear that these use the replication framework, but I really don't
> >> >> > want the command names to be that long.
> >> >>
> >> >> Actually, after some IM chats, I think pg_streamrecv should be
> >> >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream
> >> >> is a lot more specific than that)
> >> >
> >> > pg_stream_log
> >> > pg_stream_backup
> >>
> >> Those seem better.
> >>
> >> Tom, would those solve your concerns about it being clear which side
> >> they are on? Or do you think you'd still risk reading them as the
> >> sending side?
> >
> > It seems pg_create_backup would be the most natural because we already
> > have pg_start_backup and pg_stop_backup.
>
> Uh, wow. That's really mixing apples and oranges.

I read the description as:

+ You can also use the <xref linkend="app-pgbasebackup"> tool to take
+ the backup, instead of manually copying the files. This tool will take
+ care of the <function>pg_start_backup()</>, copy and
+ <function>pg_stop_backup()</> steps automatically, and transfers the
+ backup over a regular <productname>PostgreSQL</productname> connection
+ using the replication protocol, instead of requiring filesystem level
+ access.

so I thought, well it does pg_start_backup and pg_stop_backup, and also creates the data directory.
-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Thu, Jan 20, 2011 at 10:15 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Robert Haas wrote:
>> On Thu, Jan 20, 2011 at 10:01 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> > Magnus Hagander wrote:
>> >> On Mon, Jan 17, 2011 at 16:27, Simon Riggs <simon@2ndquadrant.com> wrote:
>> >> > On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote:
>> >> >> On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote:
>> >> >> > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> >> >> >> Hmm. I don't like those names at all :(
>> >> >> >
>> >> >> > I agree. I don't think your original names are bad, as long as
>> >> >> > they're well-documented. I sympathize with Simon's desire to make it
>> >> >> > clear that these use the replication framework, but I really don't
>> >> >> > want the command names to be that long.
>> >> >>
>> >> >> Actually, after some IM chats, I think pg_streamrecv should be
>> >> >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream
>> >> >> is a lot more specific than that)
>> >> >
>> >> > pg_stream_log
>> >> > pg_stream_backup
>> >>
>> >> Those seem better.
>> >>
>> >> Tom, would those solve your concerns about it being clear which side
>> >> they are on? Or do you think you'd still risk reading them as the
>> >> sending side?
>> >
>> > It seems pg_create_backup would be the most natural because we already
>> > have pg_start_backup and pg_stop_backup.
>>
>> Uh, wow. That's really mixing apples and oranges.
>
> I read the description as:
>
> + You can also use the <xref linkend="app-pgbasebackup"> tool to take
> + the backup, instead of manually copying the files. This tool will take
> + care of the <function>pg_start_backup()</>, copy and
> + <function>pg_stop_backup()</> steps automatically, and transfers the
> + backup over a regular <productname>PostgreSQL</productname> connection
> + using the replication protocol, instead of requiring filesystem level
> + access.
> > so I thought, well it does pg_start_backup and pg_stop_backup, and also > creates the data directory. Yeah, but pg_start_backup() and pg_stop_backup() are server functions, and this is an application. Also, it won't actually work unless the server has replication configured (wal_level!=minimal, max_wal_senders>0, and possibly some setting for wal_keep_segments), which has been the main point of the naming discussion thus far. Now, you know what would be REALLY cool? Making this work without any special advance configuration. Like if we somehow figured out a way to make max_wal_senders unnecessary, and a way to change wal_level without bouncing the server, so that we could temporarily boost the WAL level from minimal to archive if someone's running a backup. That, however, is not going to happen for 9.1... but it would be *really* cool. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
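For reference, the advance configuration Robert describes would look roughly like the following (values are illustrative only; the parameter names are the ones mentioned in this thread):

```
# postgresql.conf (illustrative values)
wal_level = archive          # anything above 'minimal'
max_wal_senders = 2          # one free slot per concurrent base backup
wal_keep_segments = 32       # optional extra headroom

# pg_hba.conf: allow a replication connection for the backup user
host  replication  postgres  192.168.0.0/24  md5
```

Both postgresql.conf settings require a server restart at this point, which is exactly the friction Robert would like to see removed.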
Robert Haas wrote:
> > I read the description as:
> >
> > +    You can also use the <xref linkend="app-pgbasebackup"> tool to take
> > +    the backup, instead of manually copying the files. This tool will take
> > +    care of the <function>pg_start_backup()</>, copy and
> > +    <function>pg_stop_backup()</> steps automatically, and transfers the
> > +    backup over a regular <productname>PostgreSQL</productname> connection
> > +    using the replication protocol, instead of requiring filesystem level
> > +    access.
> >
> > so I thought, well it does pg_start_backup and pg_stop_backup, and also
> > creates the data directory.
>
> Yeah, but pg_start_backup() and pg_stop_backup() are server functions,
> and this is an application.
>
> Also, it won't actually work unless the server has replication
> configured (wal_level!=minimal, max_wal_senders>0, and possibly some
> setting for wal_keep_segments), which has been the main point of the
> naming discussion thus far. Now, you know what would be REALLY cool?
> Making this work without any special advance configuration. Like if
> we somehow figured out a way to make max_wal_senders unnecessary, and
> a way to change wal_level without bouncing the server, so that we
> could temporarily boost the WAL level from minimal to archive if
> someone's running a backup.
>
> That, however, is not going to happen for 9.1... but it would be
> *really* cool.

Well, when we originally implemented PITR, we could have found a way to avoid using SQL commands to start/stop backup, but we envisioned that we would want to hook things on to those commands so we created a stable API that we could improve, and we have.

Do we envision pg_basebackup as something we will enhance, and if so, should we consider a generic name?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Fujii Masao <masao.fujii@gmail.com> writes: > On Thu, Jan 20, 2011 at 10:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I'm not sure why that's the right solution. Why do you think that we should >>> not create the tablespace under the $PGDATA directory? I'm not surprised >>> that people mount the filesystem on $PGDATA/mnt and create the >>> tablespace on it. >> No?  Usually, having a mount point in a non-root-owned directory is >> considered a Bad Thing. > Hmm.. but ISTM we can have a root-owned mount point in $PGDATA > and create a tablespace there. Nonsense. The more general statement is that it's a security hole unless the mount point *and everything above it* is root owned. In the case you sketch, there would be nothing to stop the (non root) postgres user from renaming $PGDATA/mnt to something else and then inserting his own trojan-horse directories. Given that nobody except postgres and root could get to the mount point, maybe there wouldn't be any really serious problems caused that way --- but I still say that it's bad practice that no competent sysadmin would accept. Moreover, I see no positive *good* reason to do it. There isn't anyplace under $PGDATA that users should be randomly creating directories, much less mount points. regards, tom lane
On Thu, Jan 20, 2011 at 16:45, Bruce Momjian <bruce@momjian.us> wrote: > Robert Haas wrote: >> > I read the description as: >> > >> > +    You can also use the <xref linkend="app-pgbasebackup"> tool to take >> > +    the backup, instead of manually copying the files. This tool will take >> > +    care of the <function>pg_start_backup()</>, copy and >> > +    <function>pg_stop_backup()</> steps automatically, and transfers the >> > +    backup over a regular <productname>PostgreSQL</productname> connection >> > +    using the replication protocol, instead of requiring filesystem level >> > +    access. >> > >> > so I thought, well it does pg_start_backup and pg_stop_backup, and also >> > creates the data directory. >> >> Yeah, but pg_start_backup() and pg_stop_backup() are server functions, >> and this is an application. >> >> Also, it won't actually work unless the server has replication >> configured (wal_level!=minimal, max_wal_senders>0, and possibly some >> setting for wal_keep_segments), which has been the main point of the >> naming discussion thus far. Now, you know what would be REALLY cool? >> Making this work without any special advance configuration. Like if >> we somehow figured out a way to make max_wal_senders unnecessary, and >> a way to change wal_level without bouncing the server, so that we >> could temporarily boost the WAL level from minimal to archive if >> someone's running a backup. >> >> That, however, is not going to happen for 9.1... but it would be >> *really* cool. > > Well, when we originally implemented PITR, we could have found a way to > avoid using SQL commands to start/stop backup, but we envisioned that we > would want to hook things on to those commands so we created a stable > API that we could improve, and we have. Yeah, we're certainly not taking those *away*. > Do we envision pg_basebackup as something we will enhance, and if so, > should we consider a generic name? Well, it's certainly going to be enhanced. 
I think there are two main uses for it - backups, and setting up replication slaves. I can't see it expanding beyond those, really. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Robert Haas <robertmhaas@gmail.com> writes: > Also, it won't actually work unless the server has replication > configured (wal_level!=minimal, max_wal_senders>0, and possibly some > setting for wal_keep_segments), which has been the main point of the > naming discussion thus far. Now, you know what would be REALLY cool? > Making this work without any special advance configuration. Like if > we somehow figured out a way to make max_wal_senders unnecessary, and > a way to change wal_level without bouncing the server, so that we > could temporarily boost the WAL level from minimal to archive if > someone's running a backup. On avoiding max_wal_senders we're on our way: you "just" have to use the external walreceiver that Magnus has already written the code for. As for the WAL level, I don't know that we have that already, but a big part of what this base backup tool is useful for is preparing a standby… so you certainly want to change that setting there. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Jan 20, 2011 at 11:59 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Also, it won't actually work unless the server has replication >> configured (wal_level!=minimal, max_wal_senders>0, and possibly some >> setting for wal_keep_segments), which has been the main point of the >> naming discussion thus far. Now, you know what would be REALLY cool? >> Making this work without any special advance configuration. Like if >> we somehow figured out a way to make max_wal_senders unnecessary, and >> a way to change wal_level without bouncing the server, so that we >> could temporarily boost the WAL level from minimal to archive if >> someone's running a backup. > > On avoiding max_wal_senders we're on our way: you "just" have to use the > external walreceiver that Magnus has already written the code for. As for > the WAL level, I don't know that we have that already, but a big part of > what this base backup tool is useful for is preparing a standby… so you > certainly want to change that setting there. Well, yeah, but it would be nice to also use it just to take a regular old backup on a system that doesn't otherwise need replication. I think that the basic problem with wal_level is that to increase it you need to somehow ensure that all the backends have the new setting, and then checkpoint. Right now, the backends get the value through the GUC machinery, and so there's no particular bound on how long it could take for them to pick up the new value. I think if we could find some way of making sure that the backends got the new value in a reasonably timely fashion, we'd be pretty close to being able to do this. But it's hard to see how to do that. I had some vague idea of creating a mechanism for broadcasting critical parameter changes. You'd make a structure in shared memory containing the "canonical" values of wal_level and all other critical variables, and the structure would also contain a 64-bit counter. 
Whenever you want to make a parameter change, you lock the structure, make your change, bump the counter, and release the lock. Then, there's a second structure, also in shared memory, where backends report the value that the counter had the last time they updated their local copies of the structure from the shared structure. You can watch that to find out when everyone's guaranteed to have the new value. If someone doesn't respond quickly enough, you could send them a signal to get them moving. What would really be ideal is if you could actually make this safe enough that the interrupt service routine could do all the work, rather than just setting a flag. Or maybe CHECK_FOR_INTERRUPTS(). If you can't make it safe enough to put it in someplace pretty low-level like that, the whole idea might fall apart, because it wouldn't be useful to have a way of doing this that mostly works except sometimes it just sits there and hangs for a really long time. All pie in the sky at this point... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > I think that the basic problem with wal_level is that to increase it > you need to somehow ensure that all the backends have the new setting, > and then checkpoint. Right now, the backends get the value through > the GUC machinery, and so there's no particular bound on how long it > could take for them to pick up the new value. I think if we could > find some way of making sure that the backends got the new value in a > reasonably timely fashion, we'd be pretty close to being able to do > this. But it's hard to see how to do that. Well, you just said when to force the "reload" to take effect: at checkpoint time. IIRC we already multiplex SIGUSR1, is that possible to add that behavior here? And signal every backend at checkpoint time when wal_level has changed? > I had some vague idea of creating a mechanism for broadcasting > critical parameter changes. You'd make a structure in shared memory > containing the "canonical" values of wal_level and all other critical > variables, and the structure would also contain a 64-bit counter. > Whenever you want to make a parameter change, you lock the structure, > make your change, bump the counter, and release the lock. Then, > there's a second structure, also in shared memory, where backends > report the value that the counter had the last time they updated their > local copies of the structure from the shared structure. You can > watch that to find out when everyone's guaranteed to have the new > value. If someone doesn't respond quickly enough, you could send them > a signal to get them moving. What would really be ideal is if you > could actually make this safe enough that the interrupt service > routine could do all the work, rather than just setting a flag. Or > maybe CHECK_FOR_INTERRUPTS(). 
If you can't make it safe enough to put > it in someplace pretty low-level like that, the whole idea might fall > apart, because it wouldn't be useful to have a way of doing this that > mostly works except sometimes it just sits there and hangs for a > really long time. > > All pie in the sky at this point... Unless we manage to simplify the idea enough to make wal_level changeable on SIGHUP. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Jan 20, 2011 at 2:10 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I think that the basic problem with wal_level is that to increase it >> you need to somehow ensure that all the backends have the new setting, >> and then checkpoint. Right now, the backends get the value through >> the GUC machinery, and so there's no particular bound on how long it >> could take for them to pick up the new value. I think if we could >> find some way of making sure that the backends got the new value in a >> reasonably timely fashion, we'd be pretty close to being able to do >> this. But it's hard to see how to do that. > > Well, you just said when to force the "reload" to take effect: at > checkpoint time. IIRC we already multiplex SIGUSR1, is that possible to > add that behavior here? And signal every backend at checkpoint time > when wal_level has changed? Sending them a signal seems like a promising approach, but the trick is guaranteeing that they've actually acted on it before you start the checkpoint. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Sending them a signal seems like a promising approach, but the trick > is guaranteeing that they've actually acted on it before you start the > checkpoint. How much would using a latch help here? Or would it be overkill? -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On 20.01.2011 22:15, Dimitri Fontaine wrote: > Robert Haas<robertmhaas@gmail.com> writes: >> Sending them a signal seems like a promising approach, but the trick >> is guaranteeing that they've actually acted on it before you start the >> checkpoint. > > How much using a latch here would help? Or be overkill? A latch doesn't give you an acknowledgment from the backends that they've received and acted on the guc change. You could use it as a building block to construct that, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Jan 20, 2011 at 2:10 PM, Dimitri Fontaine > <dimitri@2ndquadrant.fr> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> I think that the basic problem with wal_level is that to increase it >>> you need to somehow ensure that all the backends have the new setting, >>> and then checkpoint. >> >> Well, you just said when to force the "reload" to take effect: at >> checkpoint time.  IIRC we already multiplex SIGUSR1, is that possible to >> add that behavior here?  And signal every backend at checkpoint time >> when wal_level has changed? > Sending them a signal seems like a promising approach, but the trick > is guaranteeing that they've actually acted on it before you start the > checkpoint. Have the backends show their current wal_level in their PGPROC entries. Sleep till they're all reporting the right thing, then fire checkpoint. regards, tom lane
On Fri, Jan 21, 2011 at 1:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> On Thu, Jan 20, 2011 at 10:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> I'm not sure why that's the right solution. Why do you think that we should >>>> not create the tablespace under the $PGDATA directory? I'm not surprised >>>> that people mount the filesystem on $PGDATA/mnt and create the >>>> tablespace on it. > >>> No? Usually, having a mount point in a non-root-owned directory is >>> considered a Bad Thing. > >> Hmm.. but ISTM we can have a root-owned mount point in $PGDATA >> and create a tablespace there. > > Nonsense. The more general statement is that it's a security hole > unless the mount point *and everything above it* is root owned. Probably true. But we cannot create a tablespace in a root-owned directory. The directory must be owned by the PostgreSQL system user. So ISTM you're saying that creating a tablespace on a mount point is itself a security hole. > In the case you sketch, there would be nothing to stop the (non root) > postgres user from renaming $PGDATA/mnt to something else and then > inserting his own trojan-horse directories. Hmm.. can a non-root postgres user really rename the root-owned directory while it's mounted? > Moreover, I see no positive *good* reason to do it. There isn't > anyplace under $PGDATA that users should be randomly creating > directories, much less mount points. When taking a base backup, you don't need to take a backup of tablespaces separately from that of $PGDATA. You only have to take a backup of $PGDATA. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Wed, Jan 19, 2011 at 1:12 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >>> + r = PQgetCopyData(conn, &copybuf, 0); >>> + if (r == -1) >>> >>> Since -1 of PQgetCopyData might indicate an error, in this case, >>> we would need to call PQgetResult?. >> >> Uh, -1 means end of data, no? -2 means error? > > The comment in pqGetCopyData3 says > > /* > * On end-of-copy, exit COPY_OUT or COPY_BOTH mode and let caller > * read status with PQgetResult(). The normal case is that it's > * Copy Done, but we let parseInput read that. If error, we expect > * the state was already changed. > */ > > Also the comment in getCopyDataMessage says > > /* > * If it's a legitimate async message type, process it. (NOTIFY > * messages are not currently possible here, but we handle them for > * completeness.) Otherwise, if it's anything except Copy Data, > * report end-of-copy. > */ > > So I thought that. BTW, walreceiver has already done that. When PQgetCopyData returns -1, PQgetResult should be called. This is true. But when I read the patch again, I found that Magnus has already done that. So my comment missed the point :( Sorry for the noise. + res = PQgetResult(conn); + if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) + { + fprintf(stderr, _("%s: final receive failed: %s\n"), + progname, PQerrorMessage(conn)); + exit(1); + } Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Fri, Jan 21, 2011 at 07:02, Fujii Masao <masao.fujii@gmail.com> wrote: > On Fri, Jan 21, 2011 at 1:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Fujii Masao <masao.fujii@gmail.com> writes: >>> On Thu, Jan 20, 2011 at 10:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> In the case you sketch, there would be nothing to stop the (non root) >> postgres user from renaming $PGDATA/mnt to something else and then >> inserting his own trojan-horse directories. > > Hmm.. can non-root postgres user really rename the root-owned directory > while it's being mounted? No, but you can rename the parent directory of it, and then create another directory inside it with the same name as the root owned directory had. >> Moreover, I see no positive *good* reason to do it. There isn't >> anyplace under $PGDATA that users should be randomly creating >> directories, much less mount points. > > When taking a base backup, you don't need to take a backup of tablespaces > separately from that of $PGDATA. You have only to take a backup of $PGDATA. But why are you creating tablespaces in the first place, if you're sticking them in $PGDATA? I'd put myself in the +1 camp for "throw an error when someone tries to create a tablespace inside $PGDATA". -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Fujii Masao <masao.fujii@gmail.com> writes: > Probably true. But we cannot create a tablespace for root-owned directory. > The directory must be owned by the PostgreSQL system user. So ISTM that > you says that creating a tablespace on a mount point itself is a security hole. Generally, the root user would have to mount the filesystem and then create a Postgres-owned directory under it, yes. This is a feature not a bug. >> In the case you sketch, there would be nothing to stop the (non root) >> postgres user from renaming $PGDATA/mnt to something else and then >> inserting his own trojan-horse directories. > Hmm.. can non-root postgres user really rename the root-owned directory > while it's being mounted? If you have write privilege on the parent directory, you can rename any filesystem entry. >> Moreover, I see no positive *good* reason to do it. There isn't >> anyplace under $PGDATA that users should be randomly creating >> directories, much less mount points. > When taking a base backup, you don't need to take a backup of tablespaces > separately from that of $PGDATA. You have only to take a backup of $PGDATA. Doesn't work, and doesn't tell you it didn't work, if the mount point isn't mounted. I believe "what happens if the secondary filesystem isn't mounted" is exactly one of the basic reasons for the mount-points-must-be-owned-by-root rule. Otherwise, applications may scribble directly on the / drive, which results in serious problems when the mount eventually comes back. There's an example in our archives (from Joe Conway if memory serves) about someone destroying their database that way. regards, tom lane
On Thu, Jan 20, 2011 at 17:17, Magnus Hagander <magnus@hagander.net> wrote: > On Thu, Jan 20, 2011 at 16:45, Bruce Momjian <bruce@momjian.us> wrote: >> Do we envision pg_basebackup as something we will enhance, and if so, >> should we consider a generic name? > > Well, it's certainly going to be enhanced. I think there are two main > uses for it - backups, and setting up replication slaves. I can't see > it expanding beyond those, really. I've committed this with the current name, pg_basebackup, before the bikeshed hits all the colors of the rainbow. If there's too much uproar, we can always rename it - it's a lot easier now that we have git :P Base backups is something we discuss regularly, so it's not a new term. And I don't see why people would be confused that this is a tool that you run on the client (which can be the same machine) - after all, we don't call those tools pg_receive_dump, pg_receive_dumpall, or pg_send_restore. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Sun, Jan 23, 2011 at 8:33 PM, Magnus Hagander <magnus@hagander.net> wrote: > I've committed this with the current name, pg_basebackup Great! But, per the subsequent commit logs, I should have reviewed the portability issues more carefully :( Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center