Thread: pg_basebackup for streaming base backups
This patch creates pg_basebackup in bin/, being a client program for the streaming base backup feature. I think it's more or less done now. I've again split it out of pg_streamrecv, because it had very little shared code with that (basically just the PQconnectdb() wrapper). One thing I'm thinking about - right now the tool just takes -c <conninfo> to connect to the database. Should it instead be taught to take the connection parameters that for example pg_dump does - one for each of host, port, user, password? (shouldn't be hard to do..) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Attachment
Hi, I have an unexpected 5-minute window to do a first reading of the patch, so here goes the quick doc and comments proof reading of it. :) Magnus Hagander <magnus@hagander.net> writes: > This patch creates pg_basebackup in bin/, being a client program for > the streaming base backup feature. Great! We have pg_ctl init[db], I think we want pg_ctl clone or some other command here to call the binary for us. What do you think? > One thing I'm thinking about - right now the tool just takes -c > <conninfo> to connect to the database. Should it instead be taught to > take the connection parameters that for example pg_dump does - one for > each of host, port, user, password? (shouldn't be hard to do..) Consistency is good. Now, basic first patch reading level review: I think doc/src/sgml/backup.sgml should include some notes about how libpq base backup streaming compares to rsync and the like in terms of efficiency or typical performance, when to prefer which, etc. I'll see about doing some tests next week. + <term><option>--basedir=<replaceable class="parameter">directory</replaceable></option></term> That should be -D --pgdata, for consistency with pg_dump. On a quick reading it's unclear from the docs alone how -d and -t live together. It seems like the options are exclusive but I'd have to ask… + * The file will be named base.tar[.gz] if it's for the main data directory + * or <tablespaceoid>.tar[.gz] if it's for another tablespace. Well we have UNIQUE, btree (spcname), so maybe we can use that here? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Sat, Jan 15, 2011 at 21:16, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Hi, > > I have an unexpected 5-minute window to do a first reading of the patch, > so here goes the quick doc and comments proof reading of it. :) :-) > Magnus Hagander <magnus@hagander.net> writes: >> This patch creates pg_basebackup in bin/, being a client program for >> the streaming base backup feature. > > Great! We have pg_ctl init[db], I think we want pg_ctl clone or some > other command here to call the binary for us. What do you think? That might be useful, but I think we need to settle on the pg_basebackup contents itself first. Not sure pg_ctl clone would be the proper name, since it's not actually a clone at this point (it might be with the second patch I just posted that includes the WAL files). >> One thing I'm thinking about - right now the tool just takes -c >> <conninfo> to connect to the database. Should it instead be taught to >> take the connection parameters that for example pg_dump does - one for >> each of host, port, user, password? (shouldn't be hard to do..) > > Consistency is good. > > Now, basic first patch reading level review: > > I think doc/src/sgml/backup.sgml should include some notes about how > libpq base backup streaming compares to rsync and the like in terms of > efficiency or typical performance, when to prefer which, etc. I'll see > about doing some tests next week. Yeah, the whole backup chapter may well need some more work after this. > + <term><option>--basedir=<replaceable class="parameter">directory</replaceable></option></term> > > That should be -D --pgdata, for consistency with pg_dump. pg_dump doesn't have a -D. I assume you mean pg_ctl / initdb? > On a quick reading it's unclear from the docs alone how -d and -t live > together. It seems like the options are exclusive but I'd have to ask… They are. 
The docs clearly say "Only one of <literal>-d</> and <literal>-t</> can be specified" > + * The file will be named base.tar[.gz] if it's for the main data directory > + * or <tablespaceoid>.tar[.gz] if it's for another tablespace. > > Well we have UNIQUE, btree (spcname), so maybe we can use that here? We could, but that would make it more likely to run into encoding issues and such - do we restrict what can be in a tablespace name? Also with a tar named by the oid, you *can* untar it into a directory in pg_tblspc to recover from if you have to. Another option, I think Heikki mentioned this on IM at some point, is to do something like name it <oid>-<name>.tar. That would give us the best of both worlds? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Em 15-01-2011 15:10, Magnus Hagander escreveu: > One thing I'm thinking about - right now the tool just takes -c > <conninfo> to connect to the database. Should it instead be taught to > take the connection parameters that for example pg_dump does - one for > each of host, port, user, password? (shouldn't be hard to do..) > +1. -- Euler Taveira de Oliveira http://www.timbira.com/
Magnus Hagander <magnus@hagander.net> writes: > Not sure pg_ctl clone would be the proper name, since it's not > actually a clone at this point (it might be with the second patch I > ust posted that includes the WAL files) Let's keep the clone name for the client that makes it all then :) >> That should be -D --pgdata, for consistency with pg_dump. > > pg_dump doesn't have a -D. I assume you mean pg_ctl / initdb? Yes, sorry, been too fast. > They are. The docs clearly say "Only one of <literal>-d</> and > <literal>-t</> can be specified" Really too fast… > Another option, I think Heikki mentioned this on IM at some point, is > to do something like name it <oid>-<name>.tar. That would give us best > of both worlds? Well I'd think we know the pg_tablespace columns encoding, so the problem might be the filesystem encodings, right? Well there's also the option of creating <oid>.tar and have a symlink to it called <name>.tar but that's pushing it. I don't think naming after OIDs is a good service for users, but if that's all we can reasonably do… Will continue reviewing and post something more polished and comprehensive next week — mainly wanted to see if you wanted to include pg_ctl <command> in the patch already. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Sat, Jan 15, 2011 at 23:10, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >>> That should be -D --pgdata, for consistency with pg_dump. >> >> pg_dump doesn't have a -D. I assume you mean pg_ctl / initdb? > > Yes, sorry, been too fast. Ok. Updated patch that includes this change attached. I also changed the tar directory from -t to -T, for consistency. It also includes the change to take -h host, -U user, -w/-W for password, and -p port instead of a conninfo string. >> Another option, I think Heikki mentioned this on IM at some point, is >> to do something like name it <oid>-<name>.tar. That would give us best >> of both worlds? > > Well I'd think we know the pg_tablespace columns encoding, so the > problem might be the filesystem encodings, right? Do we really? That's one of the global catalogs that don't really have an encoding, isn't it? > Well there's also the > option of creating <oid>.tar and have a symlink to it called <name>.tar > but that's pushing it. I don't think naming after OIDs is a good > service for users, but if that's all we can reasonably do… Yeah, symlink seems to be making things way too complex. <oid>-<name> is perhaps a reasonable compromise? > Will continue reviewing and post something more polished and > comprehensive next week — mainly wanted to see if you wanted to include > pg_ctl <command> in the patch already. Ok, thanks. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Attachment
Magnus Hagander <magnus@hagander.net> writes: >> + * The file will be named base.tar[.gz] if it's for the main data directory >> + * or <tablespaceoid>.tar[.gz] if it's for another tablespace. >> >> Well we have UNIQUE, btree (spcname), so maybe we can use that here? > We could, but that would make it more likely to run into encoding > issues and such - do we restrict what can be in a tablespace name? No. Don't even think of going there --- we got rid of user-accessible names in the filesystem years ago and we're not going back. Consider: CREATE TABLESPACE "/foo/bar" LOCATION '/foo/bar'; regards, tom lane
On Sun, Jan 16, 2011 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >>> + * The file will be named base.tar[.gz] if it's for the main data directory >>> + * or <tablespaceoid>.tar[.gz] if it's for another tablespace. >>> >>> Well we have UNIQUE, btree (spcname), so maybe we can use that here? > >> We could, but that would make it more likely to run into encoding >> issues and such - do we restrict what can be in a tablespace name? > > No. Don't even think of going there --- we got rid of user-accessible > names in the filesystem years ago and we're not going back. Consider > CREATE TABLESPACE "/foo/bar" LOCATION '/foo/bar'; Well, we'd try to name the file for that "<oid>-/foo/bar.tar", which I guess would break badly, yes. I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, which would still be useful for the majority of cases, I think? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > On Sun, Jan 16, 2011 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> No. Don't even think of going there --- we got rid of user-accessible >> names in the filesystem years ago and we're not going back. Consider >> CREATE TABLESPACE "/foo/bar" LOCATION '/foo/bar'; > > Well, we'd try to name the file for that "<oid>-/foo/bar.tar", which I > guess would break badly, yes. > > I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, > which would still be useful for the majority of cases, I think? Well if we're not using user names, there's no good choice except for system name, and the one you're making up here isn't the "true" one… Now I think the unfriendliness is around the fact that you need to prepare (untar, unzip) and start a cluster from the backup to be able to know what file contains what. Is it possible to offer a tool that lists the logical objects contained into each tar file? Maybe adding a special section at the beginning of each. That would be logically like pg_dump "catalog", but implemented as a simple "noise" file that you simply `cat` with some command. Once more, I'm still unclear how important that is, but it's scratching. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Magnus Hagander <magnus@hagander.net> writes: > Well, we'd try to name the file for that "<oid>-/foo/bar.tar", which I > guess would break badly, yes. > I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, > which would still be useful for the majority of cases, I think? Just stick with the OID. There's no reason that I can see to have "friendly" names for these tarfiles --- in most cases, the DBA will never even deal with them, no? regards, tom lane
On Sun, Jan 16, 2011 at 18:59, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> Well, we'd try to name the file for that "<oid>-/foo/bar.tar", which I >> guess would break badly, yes. > >> I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, >> which would still be useful for the majority of cases, I think? > > Just stick with the OID. There's no reason that I can see to have > "friendly" names for these tarfiles --- in most cases, the DBA will > never even deal with them, no? No, this is the output mode where the DBA chooses to get the output in the form of tarfiles. So if chosen, he will definitely deal with it. When we unpack the tars right away to a directory, they have no name, so that doesn't apply here. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > On Sun, Jan 16, 2011 at 18:59, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Just stick with the OID. There's no reason that I can see to have >> "friendly" names for these tarfiles --- in most cases, the DBA will >> never even deal with them, no? > No, this is the output mode where the DBA chooses to get the output in > the form of tarfiles. So if chosen, he will definitely deal with it. Mph. How big a use-case has that got? Offhand I can't see a reason to use it at all, ever. If you're trying to set up a clone you want the files unpacked. regards, tom lane
On Sun, Jan 16, 2011 at 19:03, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> On Sun, Jan 16, 2011 at 18:59, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Just stick with the OID. There's no reason that I can see to have >>> "friendly" names for these tarfiles --- in most cases, the DBA will >>> never even deal with them, no? > >> No, this is the output mode where the DBA chooses to get the output in >> the form of tarfiles. So if chosen, he will definitely deal with it. > > Mph. How big a use-case has that got? Offhand I can't see a reason to > use it at all, ever. If you're trying to set up a clone you want the > files unpacked. Yes, but the tool isn't just for setting up a clone. If you're doing a regular base backup that's *not* for replication, you might want them in files. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > If you're doing a regular base backup, that's *not* for replication, > you might want them in files. +1 So, is that pg_restore -l idea feasible with your current tar format? I guess that would translate to pg_basebackup -l <directory>|<oid>.tar. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Sun, Jan 16, 2011 at 19:21, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> If you're doing a regular base backup, that's *not* for replication, >> you might want them in files. > > +1 > > So, is that pg_restore -l idea feasible with your current tar format? I > guess that would translate to pg_basebackup -l <directory>|<oid>.tar. Um, not easily if you want to translate it to names. Just like you don't have access to the oid->name mapping without the server started. The walsender can't read pg_class for example, so it can't generate that mapping file. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Sun, Jan 16, 2011 at 11:31 PM, Magnus Hagander <magnus@hagander.net> wrote: > Ok. Updated patch that includes this change attached. I could not apply the patch cleanly against the git master. Do you know what the cause is? $ patch -p1 -d. < /hoge/pg_basebackup.patch patching file doc/src/sgml/backup.sgml patching file doc/src/sgml/ref/allfiles.sgml patching file doc/src/sgml/ref/pg_basebackup.sgml patching file doc/src/sgml/reference.sgml patching file src/bin/Makefile patching file src/bin/pg_basebackup/Makefile patching file src/bin/pg_basebackup/nls.mk patching file src/bin/pg_basebackup/pg_basebackup.c patch: **** malformed patch at line 1428: diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Jan 17, 2011 9:16 AM, "Fujii Masao" <masao.fujii@gmail.com> wrote: > On Sun, Jan 16, 2011 at 11:31 PM, Magnus Hagander <magnus@hagander.net> wrote: > > Ok. Updated patch that includes this change attached. > I could not apply the patch cleanly against the git master. > Do you know what the cause is? > $ patch -p1 -d. < /hoge/pg_basebackup.patch > patching file doc/src/sgml/backup.sgml > patching file doc/src/sgml/ref/allfiles.sgml > patching file doc/src/sgml/ref/pg_basebackup.sgml > patching file doc/src/sgml/reference.sgml > patching file src/bin/Makefile > patching file src/bin/pg_basebackup/Makefile > patching file src/bin/pg_basebackup/nls.mk > patching file src/bin/pg_basebackup/pg_basebackup.c > patch: **** malformed patch at line 1428: diff --git > a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm Weird, no idea. Will have to look into that later - meanwhile you can grab the branch tip from my github repo if you want to review it. /Magnus
On Mon, Jan 17, 2011 at 5:44 PM, Magnus Hagander <magnus@hagander.net> wrote: > Weird, no idea. Will have to look into that later - meanwhile you can grab > the branch tip from my github repo if you want to review it. Which repo should I grab? You seem to have many repos :) http://git.postgresql.org/gitweb Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Jan 17, 2011 at 09:50, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 17, 2011 at 5:44 PM, Magnus Hagander <magnus@hagander.net> wrote: >> Weird, no idea. Will have to look into that later - meanwhile you can grab >> the branch tip from my github repo if you want to review it. > > Which repo should I grab? You seem to have many repos :) > http://git.postgresql.org/gitweb Oh, sorry about that. There is only one that contains postgresql though :P http://github.com/mhagander/postgres, branch streaming_base. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 6:53 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Mon, Jan 17, 2011 at 09:50, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Mon, Jan 17, 2011 at 5:44 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> Weird, no idea. Will have to look into that later - meanwhile you can grab >>> the branch tip from my github repo if you want to review it. >> >> Which repo should I grab? You seem to have many repos :) >> http://git.postgresql.org/gitweb > > Oh, sorry about that. There is only one that contains postgresql though :P > > http://github.com/mhagander/postgres, branch streaming_base. Thanks! Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On lör, 2011-01-15 at 19:10 +0100, Magnus Hagander wrote: > This patch creates pg_basebackup in bin/, being a client program for > the streaming base backup feature. > > I think it's more or less done now. I've again split it out of > pg_streamrecv, because it had very little shared code with that > (basically just the PQconnectdb() wrapper). > > One thing I'm thinking about - right now the tool just takes -c > <conninfo> to connect to the database. Should it instead be taught to > take the connection parameters that for example pg_dump does - one for > each of host, port, user, password? (shouldn't be hard to do..) Probably yes, for consistency. I have been thinking for a while, however, that it would be very good if the tools also supported a conninfo string, so you don't have to invent a new option for every new connection option. psql already supports that. Some other comments: I had trouble at first interpreting the documentation. In particular, where does the data come from, and where does it go to? -d speaks of restoring, but I was just looking for making a backup, not restoring it. Needs some clarification, and some complete examples. Also what happens if -c, or -d and -t are omitted. Downthread you say that this tool is also useful for making base backups independent of replication functionality. Sounds good. But then the documentation says that the connection must be with a user that has the replication permission. Something is conceptually wrong here: why would I have to grant replication permission just to take a base backup for the purpose of making a backup?
On Mon, Jan 17, 2011 at 13:38, Peter Eisentraut <peter_e@gmx.net> wrote: > On lör, 2011-01-15 at 19:10 +0100, Magnus Hagander wrote: >> This patch creates pg_basebackup in bin/, being a client program for >> the streaming base backup feature. >> >> I think it's more or less done now. I've again split it out of >> pg_streamrecv, because it had very little shared code with that >> (basically just the PQconnectdb() wrapper). >> >> One thing I'm thinking about - right now the tool just takes -c >> <conninfo> to connect to the database. Should it instead be taught to >> take the connection parameters that for example pg_dump does - one for >> each of host, port, user, password? (shouldn't be hard to do..) > > Probably yes, for consistency. I have been thinking for a while, > however, that it would be very good if the tools also supported a > conninfo string, so you don't have to invent a new option for every new > connection option. psql already supports that. libpq has an option to expand a connection string if it's passed in the database name, it seems. The problem is that this is done on the dbname parameter - won't work in pg_dump for example, without special treatment, since the db name is used in the client there. > Some other comments: > > I had trouble at first interpreting the documentation. In particular, > where does the data come from, and where does it go to? -d speaks of > restoring, but I was just looking for making a backup, not restoring it. > Needs some clarification, and some complete examples. Also what happens > if -c, or -d and -t are omitted. You get an error. (not with -c anymore) I'll look at adding some further clarifications to it. Concrete suggestions from you or others are of course also appreciated :-) > Downthread you say that this tool is also useful for making base backups > independent of replication functionality. Sounds good. 
> But then the documentation says that the connection must be with a user that has the replication permission. Something is conceptually wrong here: why would I have to grant replication permission just to take a base backup for the purpose of making a backup? It uses the replication features for it. You also have to set max_wal_senders > 0, which is in the replication section of the postgresql.conf file. The point I wanted to make downthread was that it's useful without having a replication *slave*. But yes, you need the master. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 7:14 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> Oh, sorry about that. There is only one that contains postgresql though :P >> >> http://github.com/mhagander/postgres, branch streaming_base. > > Thanks! Though I haven't seen the core part of the patch (i.e., ReceiveTarFile, etc..) yet, here are my comments on the rest. + if (strcmp(argv[1], "-h") == 0 || strcmp(argv[1], "--help") == 0 || + strcmp(argv[1], "-?") == 0) strcmp(argv[1], "-h") should be removed + printf(_(" -p, --progress show progress information\n")); -p needs to be changed to -P + printf(_(" -D, --pgdata=directory receive base backup into directory\n")); + printf(_(" -T, --tardir=directory receive base backup into tar files\n" + " stored in specified directory\n")); + printf(_(" -Z, --compress=0-9 compress tar output\n")); + printf(_(" -l, --label=label set backup label\n")); + printf(_(" -p, --progress show progress information\n")); + printf(_(" -v, --verbose output verbose messages\n")); Can we list those options in alphabetical order as other tools do? Like the -f and -F options of pg_dump, it's more intuitive to provide one option for the output directory and another for the format. Something like -d directory --dir=directory -F format --format=format p plain t tar + case 'p': + if (atoi(optarg) == 0) + { + fprintf(stderr, _("%s: invalid port number \"%s\""), + progname, optarg); + exit(1); + } Shouldn't we emit an error message when the result of atoi is *less than* or equal to 0? \n should be in the tail of the error message. Is this error check really required here? IIRC libpq does. If it's required, atoi for compresslevel should also be error-checked. + case 'v': + verbose++; Why is verbose defined as an integer? + if (optind < argc) + { + fprintf(stderr, + _("%s: too many command-line arguments (first is \"%s\")\n"), + progname, argv[optind + 1]); You need to reference argv[optind] instead. 
What about using the PGDATA environment variable when no target directory is specified? + * Verify that the given directory exists and is empty. If it does not + * exist, it is created. If it exists but is not empty, an error will + * be give and the process ended. + */ +static void +verify_dir_is_empty_or_create(char *dirname) Since verify_dir_is_empty_or_create can be called after the connection has been established, it should call PQfinish before exit(1). + keywords[2] = "fallback_application_name"; + values[2] = "pg_basebackup"; Using the progname variable seems better than the fixed word "pg_basebackup". + if (dbgetpassword == 1) + { + /* Prompt for a password */ + password = simple_prompt(_("Password: "), 100, false); PQfinish should be called for the password retry case. + if (PQstatus(conn) != CONNECTION_OK) + { + fprintf(stderr, _("%s: could not connect to server: %s\n"), + progname, PQerrorMessage(conn)); + exit(1); + } PQfinish seems required before exit(1). + if (PQsendQuery(conn, current_path) == 0) + { + fprintf(stderr, _("%s: coult not start base backup: %s\n"), Typo: s/coult/could + /* + * Get the header + */ + res = PQgetResult(conn); After this, PQclear seems required before each exit(1) call. + if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) + { + fprintf(stderr, _("%s: final receive failed: %s\n"), + progname, PQerrorMessage(conn)); + exit(1); + } PQfinish seems required before exit(1). Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Jan 17, 2011 at 9:43 PM, Magnus Hagander <magnus@hagander.net> wrote: >> Probably yes, for consistency. I have been thinking for a while, >> however, that it would be very good if the tools also supported a >> conninfo string, so you don't have to invent a new option for every new >> connection option. psql already supports that. > > libpq has an option to expand a connection string if it's passed in > the database name, it seems. The problem is that this is done on the > dbname parameter - won't work in pg_dump for example, without special > treatment, since the db name is used in the client there. If conninfo is specified, you can just append "dbname=replication" to it and pass it to libpq as dbname. I don't think that supporting conninfo is difficult. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, 2011-01-17 at 13:43 +0100, Magnus Hagander wrote: >> Downthread you say that this tool is also useful for making base backups independent of replication functionality. Sounds good. But then the documentation says that the connection must be with a user that has the replication permission. Something is conceptually wrong here: why would I have to grant replication permission just to take a base backup for the purpose of making a backup? > It uses the replication features for it. You also have to set max_wal_senders > 0, which is in the replication section of the postgresql.conf file. > The point I wanted to make downthread was that it's useful without having a replication *slave*. But yes, you need the master. Suggest calling this utility pg_replication_backup and the other utility pg_replication_stream. It will be easier to explain the connection with replication. -- Simon Riggs http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services
On Mon, Jan 17, 2011 at 14:30, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 17, 2011 at 7:14 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >>> Oh, sorry about that. There is only one that contains postgresql though :P >>> >>> http://github.com/mhagander/postgres, branch streaming_base. >> >> Thanks! > > Though I haven't seen the core part of the patch (i.e., > ReceiveTarFile, etc..) yet, > here is the comments against others. > > > + if (strcmp(argv[1], "-h") == 0 || strcmp(argv[1], "--help") == 0 || > + strcmp(argv[1], "-?") == 0) > > strcmp(argv[1], "-h") should be removed Oh, crap. From the addition of -h for host. oopsie. > + printf(_(" -p, --progress show progress information\n")); > > -p needs to be changed to -P Indeed. > + printf(_(" -D, --pgdata=directory receive base backup into directory\n")); > + printf(_(" -T, --tardir=directory receive base backup into tar files\n" > + " stored in specified directory\n")); > + printf(_(" -Z, --compress=0-9 compress tar output\n")); > + printf(_(" -l, --label=label set backup label\n")); > + printf(_(" -p, --progress show progress information\n")); > + printf(_(" -v, --verbose output verbose messages\n")); > > Can we list those options in alphabetical order as other tools do? Certainly. But it makes more sense to have -D and -T next to each other, I think - they'd end up way elsewhere. Perhaps we need a group that says "target"? > Like -f and -F option of pg_dump, it's more intuitive to provide one option for > output directory and that for format. Something like > > -d directory > --dir=directory > > -F format > --format=format > > p > plain > > t > tar That's another option. It would certainly make for more consistency - probably a better idea. > + case 'p': > + if (atoi(optarg) == 0) > + { > + fprintf(stderr, _("%s: invalid port number \"%s\""), > + progname, optarg); > + exit(1); > + } > > Shouldn't we emit an error message when the result of atoi is *less than* or equal to 0? 
> \n should be in the tail of the error message. Is this error check really required here? IIRC libpq does. If it's required, atoi for compresslevel should also be error-checked. Yes on all. > + case 'v': > + verbose++; > > Why is the verbose defined as integer? I envisioned multiple levels of verbosity (which I had in pg_streamrecv), where multiple -v's would add things. > + if (optind < argc) > + { > + fprintf(stderr, > + _("%s: too many command-line arguments (first is \"%s\")\n"), > + progname, argv[optind + 1]); > > You need to reference to argv[optind] instead. Check. Copy/paste :o. > What about using PGDATA environment variable when no target directory is > specified? Hmm. I don't really like that. I prefer requiring it to be specified. > + * Verify that the given directory exists and is empty. If it does not > + * exist, it is created. If it exists but is not empty, an error will > + * be give and the process ended. > + */ > +static void > +verify_dir_is_empty_or_create(char *dirname) > > Since verify_dir_is_empty_or_create can be called after the connection has > been established, it should call PQfinish before exit(1). Yeah, that's a general thing - do we need to actually bother doing that? At most exit() places I haven't bothered freeing memory or closing the connection. It's not just the PQclear() that you refer to further down - it's also all the xstrdup()ed strings for example. Do we really need to care about those before we do exit(1)? I think not. Requiring PQfinish() might be more reasonable since it will give you a log on the server if you don't, but I'm not convinced that's necessary either? <snip similar requirements> > + keywords[2] = "fallback_application_name"; > + values[2] = "pg_basebackup"; > > Using the progname variable seems better rather than the fixed word > "pg_basebackup". 
I don't think so - that turns into argv[0], which means that if you use the full path of the program (/usr/local/pgsql/bin/pg_basebackup for example), the entire path goes into fallback_application_name - not just the program name. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 14:43, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 17, 2011 at 9:43 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> Probably yes, for consistency. I have been thinking for a while, >>> however, that it would be very good if the tools also supported a >>> conninfo string, so you don't have to invent a new option for every new >>> connection option. psql already supports that. >> >> libpq has an option to expand a connection string if it's passed in >> the database name, it seems. The problem is that this is done on the >> dbname parameter - won't work in pg_dump for example, without special >> treatment, since the db name is used in the client there. > > If conninfo is specified, you can just append the "dbname=replication" > into it and pass it to libpq as dbname. I don't think that supporting > conninfo is difficult. Yeah, it's easy enough for pg_basebackup. But not for example for pg_dump. (I meant for being able to use it more or less with zero modification to the current code - it can certainly be adapted to be able to deal with it) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 14:49, Simon Riggs <simon@2ndquadrant.com> wrote: > On Mon, 2011-01-17 at 13:43 +0100, Magnus Hagander wrote: >> Downthread you say that this tool is also useful for making base >> backups >> > independent of replication functionality. Sounds good. But then >> the >> > documentation says that the connection must be with a user that has >> the >> > replication permission. Something is conceptually wrong here: why >> would >> > I have to grant replication permission just to take a base backup >> for >> > the purpose of making a backup? >> >> It uses the replication features for it. You also have to set >> max_walsenders > 0, which is in the replication section of the >> postgresql.conf file. >> >> The point I wanted to make downthread was that it's useful without >> having a replication *slave*. But yes, you need the master. > > Suggest calling this utility pg_replication_backup > and the other utility pg_replication_stream > > It will be easier to explain the connection with replication. Hmm. I don't like those names at all :( But that's just me - and I don't totally hate them. So I'm opening the floor for other peoples votes :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, 2011-01-17 at 14:55 +0100, Magnus Hagander wrote: > >> > >> It uses the replication features for it. You also have to set > >> max_walsenders > 0, which is in the replication section of the > >> postgresql.conf file. > >> > >> The point I wanted to make downthread was that it's useful without > >> having a replication *slave*. But yes, you need the master. > > > > Suggest calling this utility pg_replication_backup > > and the other utility pg_replication_stream > > > > It will be easier to explain the connection with replication. > > Hmm. I don't like those names at all :( > > But that's just me - and I don't totally hate them. So I'm opening the > floor for other peoples votes :-) No problem. My point is that we should look for a name that illustrates the function more clearly. If Peter was confused, others will be also. -- Simon Riggs http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services
On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote: > Hmm. I don't like those names at all :( I agree. I don't think your original names are bad, as long as they're well-documented. I sympathize with Simon's desire to make it clear that these use the replication framework, but I really don't want the command names to be that long. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote: >> Hmm. I don't like those names at all :( > > I agree. I don't think your original names are bad, as long as > they're well-documented. I sympathize with Simon's desire to make it > clear that these use the replication framework, but I really don't > want the command names to be that long. Actually, after some IM chats, I think pg_streamrecv should be renamed, probably to pg_walstream (or pg_logstream, but pg_walstream is a lot more specific than that) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote: > On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote: > >> Hmm. I don't like those names at all :( > > > > I agree. I don't think your original names are bad, as long as > > they're well-documented. I sympathize with Simon's desire to make it > > clear that these use the replication framework, but I really don't > > want the command names to be that long. > > Actually, after some IM chats, I think pg_streamrecv should be > renamed, probably to pg_walstream (or pg_logstream, but pg_walstream > is a lot more specific than that) pg_stream_log pg_stream_backup ? -- Simon Riggs http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services
Magnus Hagander <magnus@hagander.net> writes: > The walsender can't read pg_class for example, so it can't generate > that mapping file. I don't see any way out here. So let's call <oid>.tar good enough for now… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Magnus Hagander <magnus@hagander.net> writes: > Actually, after some IM chats, I think pg_streamrecv should be > renamed, probably to pg_walstream (or pg_logstream, but pg_walstream > is a lot more specific than that) What I like about streamrecv is it's fairly clear which end of the connection it's supposed to be used on. I find "pg_basebackup" quite lacking from that perspective, and the same for the names above. Or are you proposing to merge the send and receive sides into one executable? regards, tom lane
On Mon, Jan 17, 2011 at 20:31, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> Actually, after some IM chats, I think pg_streamrecv should be >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream >> is a lot more specific than that) > > What I like about streamrecv is it's fairly clear which end of the > connection it's supposed to be used on. I find "pg_basebackup" > quite lacking from that perspective, and the same for the names > above. Or are you proposing to merge the send and receive sides > into one executable? No, the sending side is in walsender. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 10:50 PM, Magnus Hagander <magnus@hagander.net> wrote: >> + printf(_(" -D, --pgdata=directory receive base backup into directory\n")); >> + printf(_(" -T, --tardir=directory receive base backup into tar files\n" >> + " stored in specified directory\n")); >> + printf(_(" -Z, --compress=0-9 compress tar output\n")); >> + printf(_(" -l, --label=label set backup label\n")); >> + printf(_(" -p, --progress show progress information\n")); >> + printf(_(" -v, --verbose output verbose messages\n")); >> >> Can we list those options in alphabetical order as other tools do? > > Certainly. But it makes more sense to have -D and -T next to each > other, I think - they'd end up way elsewhere. Perhaps we need a group > taht says "target"? I agree with you if we end up choosing -D and -T for a target rather than pg_dump-like options I proposed. >> + * Verify that the given directory exists and is empty. If it does not >> + * exist, it is created. If it exists but is not empty, an error will >> + * be give and the process ended. >> + */ >> +static void >> +verify_dir_is_empty_or_create(char *dirname) >> >> Since verify_dir_is_empty_or_create can be called after the connection has >> been established, it should call PQfinish before exit(1). > > Yeah, that's a general thing - do we need to actually bother doing > that? At most exit() places I haven't bothered free:ing memory or > closing the connection. > > It's not just the PQclear() that you refer to further down - it's also > all the xstrdup()ed strings for example. Do we really need to care > about those before we do exit(1)? I think not. Probably true. The allocated memory would be free'd right after exit. > Requiring PQfinish() might be more reasonable since it will give you a > log on the server if you don't, but I'm not convinced that's necessary > either? At least it's required for each password-retry. Otherwise, previous connections remain during backup. 
You can see this problem easily by repeating the password input in pg_basebackup. LOG: could not send data to client: Connection reset by peer LOG: could not send data to client: Broken pipe FATAL: base backup could not send data, aborting backup As you said, if PQfinish is not called at exit(1), the above messages would be output. Those messages look ugly and should be suppressed whenever we *can*. Also I believe other tools would do that. >> + keywords[2] = "fallback_application_name"; >> + values[2] = "pg_basebackup"; >> >> Using the progname variable seems better rather than the fixed word >> "pg_basebackup". > > I don't think so - that turns into argv[0], which means that if you > use the full path of the program (/usr/local/pgsql/bin/pg_basebackup > for example), the entire path goes into fallback_application_name - > not just the program name. No. get_progname extracts the actual name. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Mon, Jan 17, 2011 at 10:30 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > Though I haven't seen the core part of the patch (i.e., > ReceiveTarFile, etc..) yet, > here is the comments against others. Here are more comments: When I untar the tar file taken by pg_basebackup, I got the following messages: $ tar xf base.tar tar: Skipping to next header tar: Archive contains obsolescent base-64 headers tar: Error exit delayed from previous errors Is this a bug? This happens only when I create $PGDATA by using initdb -X (i.e., I relocated the pg_xlog directory elsewhere than $PGDATA). + if (compresslevel > 0) + { + snprintf(fn, sizeof(fn), "%s/%s.tar.gz", tardir, PQgetvalue(res, rownum, 0)); + ztarfile = gzopen(fn, "wb"); Though I'm not familiar with zlib, isn't gzsetparams() required here? +#ifdef HAVE_LIBZ + if (!tarfile && !ztarfile) +#else + if (!tarfile) +#endif + { + fprintf(stderr, _("%s: could not create file \"%s\": %s\n"), + progname, fn, strerror(errno)); Instead of strerror, get_gz_error seems required when using zlib. + if (!res || PQresultStatus(res) != PGRES_COPY_OUT) The check for "!res" is not needed since PQresultStatus checks that. + r = PQgetCopyData(conn, &copybuf, 0); + if (r == -1) Since -1 of PQgetCopyData might indicate an error, in this case, we would need to call PQgetResult?. ReceiveTarFile seems refactorable by using GZWRITE and GZCLOSE macros. + fprintf(stderr, _("%s: could not write to file '%s': %m\n"), %m in fprintf is portable? Can't you change '%s' to \"%s\" for consistency? + /* + * Make sure we're unpacking into an empty directory + */ + verify_dir_is_empty_or_create(current_path); Can pg_basebackup take a backup of $PGDATA including a tablespace directory, without an error? The above code seems to prevent that.... + if (compresslevel <= 0) + { + fprintf(stderr, _("%s: invalid compression level \"%s\"\n"), It's better to check "compresslevel > 9" here. +/* + * Print a progress report based on the global variables. 
If verbose output + * is disabled, also print the current file name. Typo: s/disabled/enabled I request a new option which specifies whether pg_start_backup executes an immediate checkpoint or not. Currently it always executes an immediate one, but I'd like to run a smoothed one on a busy system. What's your opinion? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
2011/1/18 Fujii Masao <masao.fujii@gmail.com>: > On Mon, Jan 17, 2011 at 10:30 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> Though I haven't seen the core part of the patch (i.e., >> ReceiveTarFile, etc..) yet, >> here is the comments against others. > > Here are more comments: > > > When I untar the tar file taken by pg_basebackup, I got the following > messages: > > $ tar xf base.tar > tar: Skipping to next header > tar: Archive contains obsolescent base-64 headers > tar: Error exit delayed from previous errors > > Is this a bug? This happens only when I create $PGDATA by using > initdb -X (i.e., I relocated the pg_xlog directory elsewhere than > $PGDATA). > > > + if (compresslevel > 0) > + { > + snprintf(fn, sizeof(fn), "%s/%s.tar.gz", tardir, PQgetvalue(res, > rownum, 0)); > + ztarfile = gzopen(fn, "wb"); > > Though I'm not familiar with zlib, isn't gzsetparams() required > here? > > > +#ifdef HAVE_LIBZ > + if (!tarfile && !ztarfile) > +#else > + if (!tarfile) > +#endif > + { > + fprintf(stderr, _("%s: could not create file \"%s\": %s\n"), > + progname, fn, strerror(errno)); > > Instead of strerror, get_gz_error seems required when using zlib. > > > + if (!res || PQresultStatus(res) != PGRES_COPY_OUT) > > The check for "!res" is not needed since PQresultStatus checks that. > > > + r = PQgetCopyData(conn, &copybuf, 0); > + if (r == -1) > > Since -1 of PQgetCopyData might indicate an error, in this case, > we would need to call PQgetResult?. > > > ReceiveTarFile seems refactorable by using GZWRITE and GZCLOSE > macros. > > > + fprintf(stderr, _("%s: could not write to file '%s': %m\n"), > > %m in fprintf is portable? > > > Can't you change '%s' to \"%s\" for consistency? > > > + /* > + * Make sure we're unpacking into an empty directory > + */ > + verify_dir_is_empty_or_create(current_path); > > Can pg_basebackup take a backup of $PGDATA including a tablespace > directory, without an error? The above code seems to prevent that.... 
> > > + if (compresslevel <= 0) > + { > + fprintf(stderr, _("%s: invalid compression level \"%s\"\n"), > > It's better to check "compresslevel > 9" here. > > > +/* > + * Print a progress report based on the global variables. If verbose output > + * is disabled, also print the current file name. > > Typo: s/disabled/enabled > > > I request a new option which specifies whether pg_start_backup > executes an immediate checkpoint or not. Currently it always executes > an immediate one, but I'd like to run a smoothed one on a busy system. > What's your opinion? > *If* it is possible, this is welcome: the checkpoint hit due to pg_start_backup is visible even outside pg_basebackup (it syncs everything and then blasts the cache memory). -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
On Tue, Jan 18, 2011 at 03:14, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 17, 2011 at 10:50 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> + printf(_(" -D, --pgdata=directory receive base backup into directory\n")); >>> + printf(_(" -T, --tardir=directory receive base backup into tar files\n" >>> + " stored in specified directory\n")); >>> + printf(_(" -Z, --compress=0-9 compress tar output\n")); >>> + printf(_(" -l, --label=label set backup label\n")); >>> + printf(_(" -p, --progress show progress information\n")); >>> + printf(_(" -v, --verbose output verbose messages\n")); >>> >>> Can we list those options in alphabetical order as other tools do? >> >> Certainly. But it makes more sense to have -D and -T next to each >> other, I think - they'd end up way elsewhere. Perhaps we need a group >> that says "target"? > > I agree with you if we end up choosing -D and -T for a target rather > than pg_dump-like options I proposed. Yeah. What do others think between those two options? -D/-T followed by directory, or -D <dir> and -F<format>? >> Requiring PQfinish() might be more reasonable since it will give you a >> log on the server if you don't, but I'm not convinced that's necessary >> either? > > At least it's required for each password-retry. Otherwise, previous > connections remain during backup. You can see this problem easily Oh yeah, I've put that one in my git branch already. > by repeating the password input in pg_basebackup. > > LOG: could not send data to client: Connection reset by peer > LOG: could not send data to client: Broken pipe > FATAL: base backup could not send data, aborting backup > > As you said, if PQfinish is not called at exit(1), the above messages > would be output. Those messages look ugly and should be > suppressed whenever we *can*. Also I believe other tools would > do that. Yeah, agreed. I'll add that, shouldn't be too hard. 
>>> + keywords[2] = "fallback_application_name"; >>> + values[2] = "pg_basebackup"; >>> >>> Using the progname variable seems better rather than the fixed word >>> "pg_basebackup". >> >> I don't think so - that turns into argv[0], which means that if you >> use the full path of the program (/usr/local/pgsql/bin/pg_basebackup >> for example), the entire path goes into fallback_application_name - >> not just the program name. > > No. get_progname extracts the actual name. Hmm. I see it does. I wonder what I did to make that not work. Then I agree with the change :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Tue, Jan 18, 2011 at 10:49, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jan 17, 2011 at 10:30 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> Though I haven't seen the core part of the patch (i.e., >> ReceiveTarFile, etc..) yet, >> here is the comments against others. > > Here are more comments: Thanks! These are all good and useful comments! > When I untar the tar file taken by pg_basebackup, I got the following > messages: > > $ tar xf base.tar > tar: Skipping to next header > tar: Archive contains obsolescent base-64 headers > tar: Error exit delayed from previous errors > > Is this a bug? This happens only when I create $PGDATA by using > initdb -X (i.e., I relocated the pg_xlog directory elsewhere than > $PGDATA). Interesting. What version of tar and what platform? I can't reproduce that here... It certainly is a bug, that should not happen. > + if (compresslevel > 0) > + { > + snprintf(fn, sizeof(fn), "%s/%s.tar.gz", tardir, PQgetvalue(res, > rownum, 0)); > + ztarfile = gzopen(fn, "wb"); > > Though I'm not familiar with zlib, isn't gzsetparams() required > here? Uh. It certainly is! I clearly forgot it there... > +#ifdef HAVE_LIBZ > + if (!tarfile && !ztarfile) > +#else > + if (!tarfile) > +#endif > + { > + fprintf(stderr, _("%s: could not create file \"%s\": %s\n"), > + progname, fn, strerror(errno)); > > Instead of strerror, get_gz_error seems required when using zlib. Indeed it is. I think it needs to be this:

#ifdef HAVE_LIBZ
	if (compresslevel > 0 && !ztarfile)
	{
		/* Compression is in use */
		fprintf(stderr, _("%s: could not create compressed file \"%s\": %s\n"),
				progname, fn, get_gz_error(ztarfile));
		exit(1);
	}
	else
#endif
	{
		/* Either no zlib support, or zlib support but compresslevel = 0 */
		if (!tarfile)
		{
			fprintf(stderr, _("%s: could not create file \"%s\": %s\n"),
					progname, fn, strerror(errno));
			exit(1);
		}
	}

> + if (!res || PQresultStatus(res) != PGRES_COPY_OUT) > > The check for "!res" is not needed since PQresultStatus checks that. Hah. 
I still keep doing that from old habit. I know you've pointed that out before, with libpqwalreceiver :-) > + r = PQgetCopyData(conn, &copybuf, 0); > + if (r == -1) > > Since -1 of PQgetCopyData might indicate an error, in this case, > we would need to call PQgetResult?. Uh, -1 means end of data, no? -2 means error? > ReceiveTarFile seems refactorable by using GZWRITE and GZCLOSE > macros. You mean the ones from pg_dump? I don't think so. We can't use gzwrite() with compression level 0 on the tar output, because it will still write a gz header. With pg_dump, that is ok because it's our format, but with a .tar (without .gz) I don't think it is. At least that's how I interpreted the function. > + fprintf(stderr, _("%s: could not write to file '%s': %m\n"), > > %m in fprintf is portable? Hmm. I just assumed it was because we use it elsewhere, but I now see we only really use it for ereport() stuff. Bottom line is, I don't know - perhaps it needs to be changed to use strerror()? > Can't you change '%s' to \"%s\" for consistency? Yeah, absolutely. Clearly I was way inconsistent, and it got worse with some copy/paste :-( > + /* > + * Make sure we're unpacking into an empty directory > + */ > + verify_dir_is_empty_or_create(current_path); > > Can pg_basebackup take a backup of $PGDATA including a tablespace > directory, without an error? The above code seems to prevent that.... Uh, how do you mean it would prevent that? It requires that the directory you're writing the tablespace to is empty or nonexistent, but that shouldn't prevent a backup, no? It will prevent you from overwriting things with your backup, but that's intentional - if you don't need the old dir, just remove it. > + if (compresslevel <= 0) > + { > + fprintf(stderr, _("%s: invalid compression level \"%s\"\n"), > > It's better to check "compresslevel > 9" here. Agreed (well, check for both of them of course) > +/* > + * Print a progress report based on the global variables. 
If verbose output > + * is disabled, also print the current file name. > > Typo: s/disabled/enabled Indeed. > I request a new option which specifies whether pg_start_backup > executes an immediate checkpoint or not. Currently it always executes > an immediate one, but I'd like to run a smoothed one on a busy system. > What's your opinion? Yeah, that sounds like a good idea. Shouldn't be too hard to do (will require a backend patch as well, of course). Should we use "-f" for fast? Though that may be an unfortunate overload of the usual use case for -f, so maybe -c <fast|slow> for "checkpoint fast/slow"? I've updated my git branch with the simple fixes, will get the bigger ones in there as soon as I've done them. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, Jan 17, 2011 at 16:27, Simon Riggs <simon@2ndquadrant.com> wrote: > On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote: >> On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote: >> > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote: >> >> Hmm. I don't like those names at all :( >> > >> > I agree. I don't think your original names are bad, as long as >> > they're well-documented. I sympathize with Simon's desire to make it >> > clear that these use the replication framework, but I really don't >> > want the command names to be that long. >> >> Actually, after some IM chats, I think pg_streamrecv should be >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream >> is a lot more specific than that) > > pg_stream_log > pg_stream_backup Those seem better. Tom, would those solve your concerns about it being clear which side they are on? Or do you think you'd still risk reading them as the sending side? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Tue, Jan 18, 2011 at 12:40, Magnus Hagander <magnus@hagander.net> wrote: > On Tue, Jan 18, 2011 at 10:49, Fujii Masao <masao.fujii@gmail.com> wrote: > Yeah that sounds like a good idea. Shouldn't be too hard to do (will > reuqire a backend patch as well, of course). Should we use "-f" for > fast? Though that may be an unfortunate overload of the usual usecase > for -f, so maybe -c <fast|slow> for "checkpoint fast/slow"? Was easy, done with "-c <fast|slow>". -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Excerpts from Magnus Hagander's message of mar ene 18 08:40:50 -0300 2011: > On Tue, Jan 18, 2011 at 10:49, Fujii Masao <masao.fujii@gmail.com> wrote: > > + fprintf(stderr, _("%s: could not write to file '%s': %m\n"), > > > > %m in fprintf is portable? > > Hmm. I just assumed it was because we use it elsewhere, but I now see > we only really use it for ereport() stuff. Bottom line is, I don't > know - perhaps it needs to be changed to use strerror()? Some libc's (such as glibc) know about %m, others presumably don't (it's a GNU extension, according to my manpage). ereport does the expansion by itself, see expand_fmt_string(). Probably just using strerror() is the easiest. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Jan 18, 2011 at 14:26, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Magnus Hagander's message of mar ene 18 08:40:50 -0300 2011: >> On Tue, Jan 18, 2011 at 10:49, Fujii Masao <masao.fujii@gmail.com> wrote: > >> > + fprintf(stderr, _("%s: could not write to file '%s': %m\n"), >> > >> > %m in fprintf is portable? >> >> Hmm. I just assumed it was because we use it elsewhere, but I now see >> we only really use it for ereport() stuff. Bottom line is, I don't >> know - perhaps it needs to be changed to use strerror()? > > Some libc's (such as glibc) know about %m, others presumably don't (it's > a GNU extension, according to my manpage). ereport does the expansion > by itself, see expand_fmt_string(). Probably just using strerror() is > the easiest. Ok, thanks for clarifying. I've updated to use strerror(). Guess it's time for another patch, PFA :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Excerpts from Magnus Hagander's message of mar ene 18 10:47:03 -0300 2011: > Ok, thanks for clarifying. I've updated to use strerror(). Guess it's > time for another patch, PFA :-) Thanks ... Message nitpick: + if (compresslevel > 0) + { + fprintf(stderr, + _("%s: this build does not support compression\n"), + progname); + exit(1); + } pg_dump uses the following wording: "WARNING: archive is compressed, but this installation does not support " "compression -- no data will be available\n" So perhaps yours should s/build/installation/ Also, in messages of this kind, + if (gzsetparams(ztarfile, compresslevel, Z_DEFAULT_STRATEGY) != Z_OK) + { + fprintf(stderr, _("%s: could not set compression level %i\n"), + progname, compresslevel); Shouldn't you also be emitting the gzerror()? ... oh I see you're already doing it for most gz calls. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Jan 18, 2011 at 15:49, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Magnus Hagander's message of mar ene 18 10:47:03 -0300 2011: > >> Ok, thanks for clarifying. I've updated to use strerror(). Guess it's >> time for another patch, PFA :-) > > Thanks ... Message nitpick: > + if (compresslevel > 0) > + { > + fprintf(stderr, > + _("%s: this build does not support compression\n"), > + progname); > + exit(1); > + } > > pg_dump uses the following wording: > > "WARNING: archive is compressed, but this installation does not support " > "compression -- no data will be available\n" > > So perhaps yours should s/build/installation/ That shows up when a the *archive* is compressed, though? There are a number of other cases that use build in the backend, such as: src/backend/utils/misc/guc.c: errmsg("assertion checking is not supported by this build"))); src/backend/utils/misc/guc.c: errmsg("Bonjour is not supported by this build"))); src/backend/utils/misc/guc.c: errmsg("SSL is not supported by this build"))); > Also, in messages of this kind, > + if (gzsetparams(ztarfile, compresslevel, Z_DEFAULT_STRATEGY) != Z_OK) > + { > + fprintf(stderr, _("%s: could not set compression level %i\n"), > + progname, compresslevel); > > Shouldn't you also be emitting the gzerror()? ... oh I see you're > already doing it for most gz calls. It's not clear from the zlib documentation I have that gzerror() works after a gzsetparams(). Do you have any docs that say differently by any chance? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: >>> Actually, after some IM chats, I think pg_streamrecv should be >>> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream >>> is a lot more specific than that) >> pg_stream_log >> pg_stream_backup > Those seem better. > Tom, would those solve your concerns about it being clear which side > they are on? Or do you think you'd still risk reading them as the > sending side? It's still totally unclear what they do. How about "pg_receive_log" etc? regards, tom lane
On Tue, Jan 18, 2011 at 17:31, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >>>> Actually, after some IM chats, I think pg_streamrecv should be >>>> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream >>>> is a lot more specific than that) > >>> pg_stream_log >>> pg_stream_backup > >> Those seem better. > >> Tom, would those solve your concerns about it being clear which side >> they are on? Or do you think you'd still risk reading them as the >> sending side? > > It's still totally unclear what they do. How about "pg_receive_log" > etc? I agree with whoever said using "wal" is better than "log" to be unambiguous. So it'd be pg_receive_wal and pg_receive_base_backup then? Votes from others? (it's easy to rename so far, so I'll keep plugging away under the name pg_basebackup based on Fujii-san's comments until such a time as we have a reasonable consensus :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
2011/1/18 Magnus Hagander <magnus@hagander.net>: > On Tue, Jan 18, 2011 at 17:31, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Magnus Hagander <magnus@hagander.net> writes: >>>>> Actually, after some IM chats, I think pg_streamrecv should be >>>>> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream >>>>> is a lot more specific than that) >> >>>> pg_stream_log >>>> pg_stream_backup >> >>> Those seem better. >> >>> Tom, would those solve your concerns about it being clear which side >>> they are on? Or do you think you'd still risk reading them as the >>> sending side? >> >> It's still totally unclear what they do. How about "pg_receive_log" >> etc? > > I agree with whoever said using "wal" is better than "log" to be unambiguous. > > So it'd be pg_receive_wal and pg_receive_base_backup then? Votes from > others? (it's easy to rename so far, so I'll keep plugging away under > the name pg_basebackup based on Fujii-san's comments until such a time > as we have a reasonable consensus :-) pg_receive_wal is good for me. As for pg_receive_base_backup: in French, "base" is a shortcut for database, but here we back up the whole cluster, so I would suggest pg_receive_cluster(_backup?) > > -- > Magnus Hagander > Me: http://www.hagander.net/ > Work: http://www.redpill-linpro.com/ -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
On Tue, Jan 18, 2011 at 12:03 PM, Magnus Hagander <magnus@hagander.net> wrote: > So it'd be pg_receive_wal and pg_receive_base_backup then? Votes from > others? (it's easy to rename so far, so I'll keep plugging away under > the name pg_basebackup based on Fujii-sans comments until such a time > as we have a reasonable consensus :-) I like pg_receive_wal. pg_receive_base_backup I would be inclined to shorten to pg_basebackup or pg_streambackup, but I just work here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Magnus Hagander's message of mar ene 18 11:53:55 -0300 2011:
> On Tue, Jan 18, 2011 at 15:49, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
> > Excerpts from Magnus Hagander's message of mar ene 18 10:47:03 -0300 2011:
> >
> >> Ok, thanks for clarifying. I've updated to use strerror(). Guess it's
> >> time for another patch, PFA :-)
> >
> > Thanks ... Message nitpick:
> > + if (compresslevel > 0)
> > + {
> > +     fprintf(stderr,
> > +             _("%s: this build does not support compression\n"),
> > +             progname);
> > +     exit(1);
> > + }
> >
> > pg_dump uses the following wording:
> >
> > "WARNING: archive is compressed, but this installation does not support "
> > "compression -- no data will be available\n"
> >
> > So perhaps yours should s/build/installation/
>
> That shows up when the *archive* is compressed, though? There are a
> number of other cases that use build in the backend, such as:
> src/backend/utils/misc/guc.c: errmsg("assertion checking is not supported by this build")));
> src/backend/utils/misc/guc.c: errmsg("Bonjour is not supported by this build")));
> src/backend/utils/misc/guc.c: errmsg("SSL is not supported by this build")));

Hmm. I think I'd s/build/installation/ on all those messages for consistency.

> > Also, in messages of this kind,
> > + if (gzsetparams(ztarfile, compresslevel, Z_DEFAULT_STRATEGY) != Z_OK)
> > + {
> > +     fprintf(stderr, _("%s: could not set compression level %i\n"),
> > +             progname, compresslevel);
> >
> > Shouldn't you also be emitting the gzerror()? ... oh I see you're
> > already doing it for most gz calls.
>
> It's not clear from the zlib documentation I have that gzerror() works
> after a gzsetparams(). Do you have any docs that say differently by
> any chance?

Ah, no. I was reading zlib.h, which is ambiguous as you say:

ZEXTERN int ZEXPORT gzsetparams OF((gzFile file, int level, int strategy));
/*
    Dynamically update the compression level or strategy.
    See the description of deflateInit2 for the meaning of these parameters.
    gzsetparams returns Z_OK if success, or Z_STREAM_ERROR if the file was not
    opened for writing.
*/

ZEXTERN const char * ZEXPORT gzerror OF((gzFile file, int *errnum));
/*
    Returns the error message for the last error which occurred on the
    given compressed file. errnum is set to zlib error number. If an
    error occurred in the file system and not in the compression library,
    errnum is set to Z_ERRNO and the application may consult errno
    to get the exact error code.

...

but a quick look at the code says that it sets gz_stream->z_err
which is what gzerror returns:

int ZEXPORT gzsetparams (file, level, strategy)
    gzFile file;
    int level;
    int strategy;
{
    gz_stream *s = (gz_stream*)file;

    if (s == NULL || s->mode != 'w') return Z_STREAM_ERROR;

    /* Make room to allow flushing */
    if (s->stream.avail_out == 0) {

        s->stream.next_out = s->outbuf;
        if (fwrite(s->outbuf, 1, Z_BUFSIZE, s->file) != Z_BUFSIZE) {
            s->z_err = Z_ERRNO;
        }
        s->stream.avail_out = Z_BUFSIZE;
    }

    return deflateParams (&(s->stream), level, strategy);
}

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, 2011-01-18 at 18:03 +0100, Magnus Hagander wrote: > So it'd be pg_receive_wal and pg_receive_base_backup then? OK for me. Maybe even pg_receive_wal_stream Don't see any reason why command names can't be long. We have many function names already that long. -- Simon Riggs http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services
On Tue, Jan 18, 2011 at 19:20, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Magnus Hagander's message of mar ene 18 11:53:55 -0300 2011: >> On Tue, Jan 18, 2011 at 15:49, Alvaro Herrera >> <alvherre@commandprompt.com> wrote: >> > Also, in messages of this kind, >> > + if (gzsetparams(ztarfile, compresslevel, Z_DEFAULT_STRATEGY) != Z_OK) >> > + { >> > + fprintf(stderr, _("%s: could not set compression level %i\n"), >> > + progname, compresslevel); >> > >> > Shouldn't you also be emitting the gzerror()? ... oh I see you're >> > already doing it for most gz calls. >> >> It's not clear from the zlib documentation I have that gzerror() works >> after a gzsetparams(). Do you have any docs that say differently by >> any chance? > > Ah, no. I was reading zlib.h, which is ambiguous as you say: > > ZEXTERN int ZEXPORT gzsetparams OF((gzFile file, int level, int strategy)); > /* > Dynamically update the compression level or strategy. See the description > of deflateInit2 for the meaning of these parameters. > gzsetparams returns Z_OK if success, or Z_STREAM_ERROR if the file was not > opened for writing. > */ > > ZEXTERN const char * ZEXPORT gzerror OF((gzFile file, int *errnum)); > /* > Returns the error message for the last error which occurred on the > given compressed file. errnum is set to zlib error number. If an > error occurred in the file system and not in the compression library, > errnum is set to Z_ERRNO and the application may consult errno > to get the exact error code. > > > ... 
> but a quick look at the code says that it sets gz_stream->z_err
> which is what gzerror returns:
>
> int ZEXPORT gzsetparams (file, level, strategy)
>     gzFile file;
>     int level;
>     int strategy;
> {
>     gz_stream *s = (gz_stream*)file;
>
>     if (s == NULL || s->mode != 'w') return Z_STREAM_ERROR;
>
>     /* Make room to allow flushing */
>     if (s->stream.avail_out == 0) {
>
>         s->stream.next_out = s->outbuf;
>         if (fwrite(s->outbuf, 1, Z_BUFSIZE, s->file) != Z_BUFSIZE) {
>             s->z_err = Z_ERRNO;
>         }
>         s->stream.avail_out = Z_BUFSIZE;
>     }
>
>     return deflateParams (&(s->stream), level, strategy);
> }

Ah, ok. I've added the error code now, PFA. I also fixed an error in the change for result codes I broke in the last patch. github branch updated as usual.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Tue, Jan 18, 2011 at 8:40 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> When I untar the tar file taken by pg_basebackup, I got the following
>> messages:
>>
>> $ tar xf base.tar
>> tar: Skipping to next header
>> tar: Archive contains obsolescent base-64 headers
>> tar: Error exit delayed from previous errors
>>
>> Is this a bug? This happens only when I create $PGDATA by using
>> initdb -X (i.e., I relocated the pg_xlog directory elsewhere than
>> $PGDATA).
>
> Interesting. What version of tar and what platform? I can't reproduce
> that here...

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
$ uname -a
Linux hermes 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
$ tar --version
tar (GNU tar) 1.15.1

>> + r = PQgetCopyData(conn, &copybuf, 0);
>> + if (r == -1)
>>
>> Since -1 of PQgetCopyData might indicate an error, in this case,
>> we would need to call PQgetResult?
>
> Uh, -1 means end of data, no? -2 means error?

The comment in pqGetCopyData3 says

/*
 * On end-of-copy, exit COPY_OUT or COPY_BOTH mode and let caller
 * read status with PQgetResult(). The normal case is that it's
 * Copy Done, but we let parseInput read that. If error, we expect
 * the state was already changed.
 */

Also the comment in getCopyDataMessage says

/*
 * If it's a legitimate async message type, process it. (NOTIFY
 * messages are not currently possible here, but we handle them for
 * completeness.) Otherwise, if it's anything except Copy Data,
 * report end-of-copy.
 */

So I thought that. BTW, walreceiver has already done that.

>> ReceiveTarFile seems refactorable by using GZWRITE and GZCLOSE
>> macros.
>
> You mean the ones from pg_dump? I don't think so. We can't use
> gzwrite() with compression level 0 on the tar output, because it will
> still write a gz header. With pg_dump, that is ok because it's our
> format, but with a .tar (without .gz) I don't think it is.

Right. I withdraw the comment.
>> + /*
>> + * Make sure we're unpacking into an empty directory
>> + */
>> + verify_dir_is_empty_or_create(current_path);
>>
>> Can pg_basebackup take a backup of $PGDATA including a tablespace
>> directory, without an error? The above code seems to prevent that....
>
> Uh, how do you mean it would prevent that? It requires that the
> directory you're writing the tablespace to is empty or nonexistent,
> but that shouldn't prevent a backup, no? It will prevent you from
> overwriting things with your backup, but that's intentional - if you
> don't need the old dir, just remove it.

What I'm worried about is the case where a tablespace is created under the $PGDATA directory. In this case, ISTM that pg_basebackup takes the backup of $PGDATA including the tablespace directory first, and then takes the backup of the tablespace directory again. But, since the tablespace directory is then no longer empty, the backup of the tablespace seems to fail.

> Was easy, done with "-c <fast|slow>".

Thanks a lot!

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Wed, Jan 19, 2011 at 4:12 AM, Magnus Hagander <magnus@hagander.net> wrote:
> Ah, ok. I've added the errorcode now, PFA. I also fixed an error in
> the change for result codes I broke in the last patch. github branch
> updated as usual.

Great. Thanks for the quick update!

Here are some more comments:

+ * IDENTIFICATION
+ * src/bin/pg_basebackup.c

Typo: s/"src/bin/pg_basebackup.c"/"src/bin/pg_basebackup/pg_basebackup.c"

+ printf(_(" -c, --checkpoint=fast|slow\n"
+ "         set fast or slow checkpoinging\n"));

Typo: s/checkpoinging/checkpointing

The "fast or slow" seems to lead users to always choose "fast". Instead, what about "fast or smooth", "fast or spread" or "immediate or delayed"?

You seem to have forgotten to support the "--checkpoint" long option. The struct long_options needs to be updated.

What if pg_basebackup receives a signal while doing a backup? For example, users might do Ctrl-C to cancel the long-running backup. Should we define a signal handler and send a Terminate message to the server to cancel the backup?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Wed, Jan 19, 2011 at 06:14, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Jan 19, 2011 at 4:12 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> Ah, ok. I've added the errorcode now, PFA. I also fixed an error in
>> the change for result codes I broke in the last patch. github branch
>> updated as usual.
>
> Great. Thanks for the quick update!
>
> Here are another comments:
>
> + * IDENTIFICATION
> + * src/bin/pg_basebackup.c
>
> Typo: s/"src/bin/pg_basebackup.c"/"src/bin/pg_basebackup/pg_basebackup.c"

Oops.

> + printf(_(" -c, --checkpoint=fast|slow\n"
> + "         set fast or slow checkpoinging\n"));
>
> Typo: s/checkpoinging/checkpointing
>
> The "fast or slow" seems to lead users to always choose "fast". Instead,
> what about "fast or smooth", "fast or spread" or "immediate or delayed"?

Hmm. "fast or spread" seems reasonable to me. And I want to use "fast" for the fast version, because that's what we call it when you use pg_start_backup(). I'll go change it to spread for now - it's the one I can find used in the docs.

> You seem to have forgotten to support "--checkpoint" long option.
> The struct long_options needs to be updated.

Wow, that clearly went too fast. Fixed as well.

> What if pg_basebackup receives a signal while doing a backup?
> For example, users might do Ctrl-C to cancel the long-running backup.
> We should define a signal handler and send a Terminate message
> to the server to cancel the backup?

Nah, we'll just disconnect and it'll deal with things that way. Just like we do with e.g. pg_dump. I don't see the need to complicate it with that.

(new patch on github in 5 minutes)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Fujii Masao <masao.fujii@gmail.com> writes: > What I'm worried about is the case where a tablespace is created > under the $PGDATA directory. What would be the sense of that? If you're concerned about whether the code handles it correctly, maybe the right solution is to add code to CREATE TABLESPACE to disallow it. regards, tom lane
On Wed, Jan 19, 2011 at 9:37 PM, Magnus Hagander <magnus@hagander.net> wrote: >> The "fast or slow" seems to lead users to always choose "fast". Instead, >> what about "fast or smooth", "fast or spread" or "immediate or delayed"? > > Hmm. "fast or spread" seems reasonable to me. And I want to use "fast" > for the fast version, because that's what we call it when you use > pg_start_backup(). I'll go change it to spread for now - it's the one > I can find used in the docs. Fair enough. >> What if pg_basebackup receives a signal while doing a backup? >> For example, users might do Ctrl-C to cancel the long-running backup. >> We should define a signal handler and send a Terminate message >> to the server to cancel the backup? > > Nah, we'll just disconnect and it'll deal with things that way. Just > like we do with e.g. pg_dump. I don't see the need to complicate it > with that. Okay. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Jan 20, 2011 at 2:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Fujii Masao <masao.fujii@gmail.com> writes:
>> What I'm worried about is the case where a tablespace is created
>> under the $PGDATA directory.
>
> What would be the sense of that? If you're concerned about whether the
> code handles it correctly, maybe the right solution is to add code to
> CREATE TABLESPACE to disallow it.

I'm not sure why that's the right solution. Why do you think that we should not create the tablespace under the $PGDATA directory? I'm not surprised that people mount the filesystem on $PGDATA/mnt and create the tablespace on it.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Fujii Masao <masao.fujii@gmail.com> writes:
> On Thu, Jan 20, 2011 at 2:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Fujii Masao <masao.fujii@gmail.com> writes:
>>> What I'm worried about is the case where a tablespace is created
>>> under the $PGDATA directory.

>> What would be the sense of that? If you're concerned about whether the
>> code handles it correctly, maybe the right solution is to add code to
>> CREATE TABLESPACE to disallow it.

> I'm not sure why that's the right solution. Why do you think that we should
> not create the tablespace under the $PGDATA directory? I'm not surprised
> that people mounts the filesystem on $PGDATA/mnt and creates the
> tablespace on it.

No? Usually, having a mount point in a non-root-owned directory is considered a Bad Thing.

regards, tom lane
On Thu, Jan 20, 2011 at 10:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm not sure why that's the right solution. Why do you think that we should >> not create the tablespace under the $PGDATA directory? I'm not surprised >> that people mounts the filesystem on $PGDATA/mnt and creates the >> tablespace on it. > > No? Usually, having a mount point in a non-root-owned directory is > considered a Bad Thing. Hmm.. but ISTM we can have a root-owned mount point in $PGDATA and create a tablespace there. $ su - # mkdir $PGDATA/mnt # mount -t tmpfs tmpfs $PGDATA/mnt # exit $ mkdir $PGDATA/mnt/tblspcdir $ psql =# CREATE TABLESPACE tblspc LOCATION '$PGDATA/mnt/tblspcdir'; CREATE TABLESPACE Am I missing something? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Wed, Jan 19, 2011 at 9:37 PM, Magnus Hagander <magnus@hagander.net> wrote: >> Great. Thanks for the quick update! >> >> Here are another comments: Here are comments against the documents. The other code looks good. It's helpful to document what to set to allow pg_basebackup connection. That is not only the REPLICATION privilege but also max_wal_senders and pg_hba.conf. + <refsect1> + <title>Options</title> Can we list the descriptions of option in the same order as "pg_basebackup --help" does? It's helpful to document that the target directory must be specified and it must be empty. + <para> + The backup will include all files in the data directory and tablespaces, + including the configuration files and any additional files placed in the + directory by third parties. Only regular files and directories are allowed + in the data directory, no symbolic links or special device files. The latter sentence means that the backup of the database cluster created by initdb -X is not supported? Because the symlink to the actual WAL directory is included in it. OTOH, I found the following source code comments: + * Receive a tar format stream from the connection to the server, and unpack + * the contents of it into a directory. Only files, directories and + * symlinks are supported, no other kinds of special files. This says that symlinks are supported. Which is true? Is the symlink supported only in tar format? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Jan 20, 2011 at 05:23, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Jan 19, 2011 at 9:37 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>> Great. Thanks for the quick update!
>>>
>>> Here are another comments:
>
> Here are comments against the documents. The other code looks good.

Thanks!

> It's helpful to document what to set to allow pg_basebackup connection.
> That is not only the REPLICATION privilege but also max_wal_senders and
> pg_hba.conf.

Hmm. Yeah, I guess that wouldn't hurt. Will add that.

> + <refsect1>
> + <title>Options</title>
>
> Can we list the descriptions of option in the same order as
> "pg_basebackup --help" does?
>
> It's helpful to document that the target directory must be specified and
> it must be empty.

Yeah, that's on the list - I just wanted to make any other changes first before I did that. Based on (no further) feedback and a few extra questions, I'm going to change it per your suggestion to use -D <dir> -F <format>, instead of -D/-T, which will change that stuff anyway. So I'll reorder them at that time.

> + <para>
> + The backup will include all files in the data directory and tablespaces,
> + including the configuration files and any additional files placed in the
> + directory by third parties. Only regular files and directories are allowed
> + in the data directory, no symbolic links or special device files.
>
> The latter sentence means that the backup of the database cluster
> created by initdb -X is not supported? Because the symlink to the
> actual WAL directory is included in it.

No, it's not. pg_xlog is specifically excluded, and sent as an empty directory, so upon restore you will have an empty pg_xlog directory.

> OTOH, I found the following source code comments:
>
> + * Receive a tar format stream from the connection to the server, and unpack
> + * the contents of it into a directory.
> > This says that symlinks are supported. Which is true? Is the symlink > supported only in tar format? That's actually a *backend* side restriction. If there is a symlink anywhere other than pg_tblspc in the data directory, we simply won't send it across (with a warning). The frontend code supports creating symlinks, both in directory format and in tar format (actually, in tar format it doesn't do anything, of course, it just lets it through) It wouldn't actually be hard to allow the inclusion of symlinks in the backend side. But it would make verification a lot harder - for example, if someone symlinked out pg_clog (as an example), we'd back up the symlink but not the actual files since they're not actually registered as a tablespace. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Thu, Jan 20, 2011 at 12:42, Magnus Hagander <magnus@hagander.net> wrote: > On Thu, Jan 20, 2011 at 05:23, Fujii Masao <masao.fujii@gmail.com> wrote: >> It's helpful to document what to set to allow pg_basebackup connection. >> That is not only the REPLICATION privilege but also max_wal_senders and >> pg_hba.conf. > > Hmm. Yeha, i guess that wouldn't hurt. Will add that. Added, see github branch. >> + <refsect1> >> + <title>Options</title> >> >> Can we list the descriptions of option in the same order as >> "pg_basebackup --help" does? >> >> It's helpful to document that the target directory must be specified and >> it must be empty. > > Yeah, that's on the list - I just wanted to make any other changes > first before I did that. I based on (no further) feedback and a few > extra questions, I'm going to change it per your suggestion to use -D > <dir> -F <format>, instead of -D/-T, which will change that stuff > anyway. So I'll reorder them at that time. Updated on github. >> + <para> >> + The backup will include all files in the data directory and tablespaces, >> + including the configuration files and any additional files placed in the >> + directory by third parties. Only regular files and directories are allowed >> + in the data directory, no symbolic links or special device files. >> >> The latter sentence means that the backup of the database cluster >> created by initdb -X is not supported? Because the symlink to the >> actual WAL directory is included in it. > > No, it's not. pg_xlog is specifically excluded, and sent as an empty > directory, so upon restore you will have an empty pg_xlog directory. Actually, when I verified that statement, I found a bug where we sent the wrong thing if pg_xlog was a symlink, leading to a corrupt tarfile! Patch is in the github branch. >> OTOH, I found the following source code comments: >> >> + * Receive a tar format stream from the connection to the server, and unpack >> + * the contents of it into a directory. 
Only files, directories and >> + * symlinks are supported, no other kinds of special files. >> >> This says that symlinks are supported. Which is true? Is the symlink >> supported only in tar format? > > That's actually a *backend* side restriction. If there is a symlink > anywhere other than pg_tblspc in the data directory, we simply won't > send it across (with a warning). > > The frontend code supports creating symlinks, both in directory format > and in tar format (actually, in tar format it doesn't do anything, of > course, it just lets it through) > > It wouldn't actually be hard to allow the inclusion of symlinks in the > backend side. But it would make verification a lot harder - for > example, if someone symlinked out pg_clog (as an example), we'd back > up the symlink but not the actual files since they're not actually > registered as a tablespace. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander wrote:
> On Mon, Jan 17, 2011 at 16:27, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote:
> >> On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote:
> >> > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote:
> >> >> Hmm. I don't like those names at all :(
> >> >
> >> > I agree. I don't think your original names are bad, as long as
> >> > they're well-documented. I sympathize with Simon's desire to make it
> >> > clear that these use the replication framework, but I really don't
> >> > want the command names to be that long.
> >>
> >> Actually, after some IM chats, I think pg_streamrecv should be
> >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream
> >> is a lot more specific than that)
> >
> > pg_stream_log
> > pg_stream_backup
>
> Those seem better.
>
> Tom, would those solve your concerns about it being clear which side
> they are on? Or do you think you'd still risk reading them as the
> sending side?

It seems pg_create_backup would be the most natural because we already have pg_start_backup and pg_stop_backup.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Thu, Jan 20, 2011 at 10:01 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Magnus Hagander wrote:
>> On Mon, Jan 17, 2011 at 16:27, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote:
>> >> On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote:
>> >> > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> >> >> Hmm. I don't like those names at all :(
>> >> >
>> >> > I agree. I don't think your original names are bad, as long as
>> >> > they're well-documented. I sympathize with Simon's desire to make it
>> >> > clear that these use the replication framework, but I really don't
>> >> > want the command names to be that long.
>> >>
>> >> Actually, after some IM chats, I think pg_streamrecv should be
>> >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream
>> >> is a lot more specific than that)
>> >
>> > pg_stream_log
>> > pg_stream_backup
>>
>> Those seem better.
>>
>> Tom, would those solve your concerns about it being clear which side
>> they are on? Or do you think you'd still risk reading them as the
>> sending side?
>
> It seems pg_create_backup would be the most natural because we already
> have pg_start_backup and pg_stop_backup.

Uh, wow. That's really mixing apples and oranges.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:
> On Thu, Jan 20, 2011 at 10:01 AM, Bruce Momjian <bruce@momjian.us> wrote:
> > Magnus Hagander wrote:
> >> On Mon, Jan 17, 2011 at 16:27, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> > On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote:
> >> >> On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote:
> >> >> > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote:
> >> >> >> Hmm. I don't like those names at all :(
> >> >> >
> >> >> > I agree. I don't think your original names are bad, as long as
> >> >> > they're well-documented. I sympathize with Simon's desire to make it
> >> >> > clear that these use the replication framework, but I really don't
> >> >> > want the command names to be that long.
> >> >>
> >> >> Actually, after some IM chats, I think pg_streamrecv should be
> >> >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream
> >> >> is a lot more specific than that)
> >> >
> >> > pg_stream_log
> >> > pg_stream_backup
> >>
> >> Those seem better.
> >>
> >> Tom, would those solve your concerns about it being clear which side
> >> they are on? Or do you think you'd still risk reading them as the
> >> sending side?
> >
> > It seems pg_create_backup would be the most natural because we already
> > have pg_start_backup and pg_stop_backup.
>
> Uh, wow. That's really mixing apples and oranges.

I read the description as:

+ You can also use the <xref linkend="app-pgbasebackup"> tool to take
+ the backup, instead of manually copying the files. This tool will take
+ care of the <function>pg_start_backup()</>, copy and
+ <function>pg_stop_backup()</> steps automatically, and transfers the
+ backup over a regular <productname>PostgreSQL</productname> connection
+ using the replication protocol, instead of requiring filesystem level
+ access.

so I thought, well it does pg_start_backup and pg_stop_backup, and also creates the data directory.
-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Thu, Jan 20, 2011 at 10:15 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Robert Haas wrote:
>> On Thu, Jan 20, 2011 at 10:01 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> > Magnus Hagander wrote:
>> >> On Mon, Jan 17, 2011 at 16:27, Simon Riggs <simon@2ndquadrant.com> wrote:
>> >> > On Mon, 2011-01-17 at 16:20 +0100, Magnus Hagander wrote:
>> >> >> On Mon, Jan 17, 2011 at 16:18, Robert Haas <robertmhaas@gmail.com> wrote:
>> >> >> > On Mon, Jan 17, 2011 at 8:55 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> >> >> >> Hmm. I don't like those names at all :(
>> >> >> >
>> >> >> > I agree. I don't think your original names are bad, as long as
>> >> >> > they're well-documented. I sympathize with Simon's desire to make it
>> >> >> > clear that these use the replication framework, but I really don't
>> >> >> > want the command names to be that long.
>> >> >>
>> >> >> Actually, after some IM chats, I think pg_streamrecv should be
>> >> >> renamed, probably to pg_walstream (or pg_logstream, but pg_walstream
>> >> >> is a lot more specific than that)
>> >> >
>> >> > pg_stream_log
>> >> > pg_stream_backup
>> >>
>> >> Those seem better.
>> >>
>> >> Tom, would those solve your concerns about it being clear which side
>> >> they are on? Or do you think you'd still risk reading them as the
>> >> sending side?
>> >
>> > It seems pg_create_backup would be the most natural because we already
>> > have pg_start_backup and pg_stop_backup.
>>
>> Uh, wow. That's really mixing apples and oranges.
>
> I read the description as:
>
> + You can also use the <xref linkend="app-pgbasebackup"> tool to take
> + the backup, instead of manually copying the files. This tool will take
> + care of the <function>pg_start_backup()</>, copy and
> + <function>pg_stop_backup()</> steps automatically, and transfers the
> + backup over a regular <productname>PostgreSQL</productname> connection
> + using the replication protocol, instead of requiring filesystem level
> + access.
> > so I thought, well it does pg_start_backup and pg_stop_backup, and also > creates the data directory. Yeah, but pg_start_backup() and pg_stop_backup() are server functions, and this is an application. Also, it won't actually work unless the server has replication configured (wal_level!=minimal, max_wal_senders>0, and possibly some setting for wal_keep_segments), which has been the main point of the naming discussion thus far. Now, you know what would be REALLY cool? Making this work without any special advance configuration. Like if we somehow figured out a way to make max_wal_senders unnecessary, and a way to change wal_level without bouncing the server, so that we could temporarily boost the WAL level from minimal to archive if someone's running a backup. That, however, is not going to happen for 9.1... but it would be *really* cool. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
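For reference, the advance configuration Robert describes would look roughly like the following (values are illustrative only; the parameter names are the ones mentioned in this thread):

```
# postgresql.conf (illustrative values)
wal_level = archive          # anything above 'minimal'
max_wal_senders = 2          # one free slot per concurrent base backup
wal_keep_segments = 32       # optional extra headroom

# pg_hba.conf: allow a replication connection for the backup user
host  replication  postgres  192.168.0.0/24  md5
```

Both postgresql.conf settings require a server restart at this point, which is exactly the friction Robert would like to see removed.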
Robert Haas wrote:
> > I read the description as:
> >
> > +    You can also use the <xref linkend="app-pgbasebackup"> tool to take
> > +    the backup, instead of manually copying the files. This tool will take
> > +    care of the <function>pg_start_backup()</>, copy and
> > +    <function>pg_stop_backup()</> steps automatically, and transfers the
> > +    backup over a regular <productname>PostgreSQL</productname> connection
> > +    using the replication protocol, instead of requiring filesystem level
> > +    access.
> >
> > so I thought, well it does pg_start_backup and pg_stop_backup, and also
> > creates the data directory.
>
> Yeah, but pg_start_backup() and pg_stop_backup() are server functions,
> and this is an application.
>
> Also, it won't actually work unless the server has replication
> configured (wal_level!=minimal, max_wal_senders>0, and possibly some
> setting for wal_keep_segments), which has been the main point of the
> naming discussion thus far. Now, you know what would be REALLY cool?
> Making this work without any special advance configuration. Like if
> we somehow figured out a way to make max_wal_senders unnecessary, and
> a way to change wal_level without bouncing the server, so that we
> could temporarily boost the WAL level from minimal to archive if
> someone's running a backup.
>
> That, however, is not going to happen for 9.1... but it would be
> *really* cool.

Well, when we originally implemented PITR, we could have found a way to avoid using SQL commands to start/stop backup, but we envisioned that we would want to hook things on to those commands so we created a stable API that we could improve, and we have.

Do we envision pg_basebackup as something we will enhance, and if so, should we consider a generic name?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Fujii Masao <masao.fujii@gmail.com> writes: > On Thu, Jan 20, 2011 at 10:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I'm not sure why that's the right solution. Why do you think that we should >>> not create the tablespace under the $PGDATA directory? I'm not surprised >>> that people mount the filesystem on $PGDATA/mnt and create the >>> tablespace on it. >> No?  Usually, having a mount point in a non-root-owned directory is >> considered a Bad Thing. > Hmm.. but ISTM we can have a root-owned mount point in $PGDATA > and create a tablespace there. Nonsense. The more general statement is that it's a security hole unless the mount point *and everything above it* is root owned. In the case you sketch, there would be nothing to stop the (non root) postgres user from renaming $PGDATA/mnt to something else and then inserting his own trojan-horse directories. Given that nobody except postgres and root could get to the mount point, maybe there wouldn't be any really serious problems caused that way --- but I still say that it's bad practice that no competent sysadmin would accept. Moreover, I see no positive *good* reason to do it. There isn't anyplace under $PGDATA that users should be randomly creating directories, much less mount points. regards, tom lane
On Thu, Jan 20, 2011 at 16:45, Bruce Momjian <bruce@momjian.us> wrote: > Robert Haas wrote: >> > I read the description as: >> > >> > +    You can also use the <xref linkend="app-pgbasebackup"> tool to take >> > +    the backup, instead of manually copying the files. This tool will take >> > +    care of the <function>pg_start_backup()</>, copy and >> > +    <function>pg_stop_backup()</> steps automatically, and transfers the >> > +    backup over a regular <productname>PostgreSQL</productname> connection >> > +    using the replication protocol, instead of requiring filesystem level >> > +    access. >> > >> > so I thought, well it does pg_start_backup and pg_stop_backup, and also >> > creates the data directory. >> >> Yeah, but pg_start_backup() and pg_stop_backup() are server functions, >> and this is an application. >> >> Also, it won't actually work unless the server has replication >> configured (wal_level!=minimal, max_wal_senders>0, and possibly some >> setting for wal_keep_segments), which has been the main point of the >> naming discussion thus far. Now, you know what would be REALLY cool? >> Making this work without any special advance configuration. Like if >> we somehow figured out a way to make max_wal_senders unnecessary, and >> a way to change wal_level without bouncing the server, so that we >> could temporarily boost the WAL level from minimal to archive if >> someone's running a backup. >> >> That, however, is not going to happen for 9.1... but it would be >> *really* cool. > > Well, when we originally implemented PITR, we could have found a way to > avoid using SQL commands to start/stop backup, but we envisioned that we > would want to hook things on to those commands so we created a stable > API that we could improve, and we have. Yeah, we're certainly not taking those *away*. > Do we envision pg_basebackup as something we will enhance, and if so, > should we consider a generic name? Well, it's certainly going to be enhanced. 
I think there are two main uses for it - backups, and setting up replication slaves. I can't see it expanding beyond those, really. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Robert Haas <robertmhaas@gmail.com> writes: > Also, it won't actually work unless the server has replication > configured (wal_level!=minimal, max_wal_senders>0, and possibly some > setting for wal_keep_segments), which has been the main point of the > naming discussion thus far. Now, you know what would be REALLY cool? > Making this work without any special advance configuration. Like if > we somehow figured out a way to make max_wal_senders unnecessary, and > a way to change wal_level without bouncing the server, so that we > could temporarily boost the WAL level from minimal to archive if > someone's running a backup. On avoiding max_wal_senders we're on our way: you "just" have to use the external walreceiver that Magnus has already written the code for. As for the WAL level, I don't know that we have that already, but a big part of what this base backup tool is useful for is preparing a standby… so you certainly want to change that setting there. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Jan 20, 2011 at 11:59 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Also, it won't actually work unless the server has replication >> configured (wal_level!=minimal, max_wal_senders>0, and possibly some >> setting for wal_keep_segments), which has been the main point of the >> naming discussion thus far. Now, you know what would be REALLY cool? >> Making this work without any special advance configuration. Like if >> we somehow figured out a way to make max_wal_senders unnecessary, and >> a way to change wal_level without bouncing the server, so that we >> could temporarily boost the WAL level from minimal to archive if >> someone's running a backup. > > On avoiding max_wal_senders we're on our way: you "just" have to use the > external walreceiver that Magnus has already written the code for. As for > the WAL level, I don't know that we have that already, but a big part of > what this base backup tool is useful for is preparing a standby… so you > certainly want to change that setting there. Well, yeah, but it would be nice to also use it just to take a regular old backup on a system that doesn't otherwise need replication. I think that the basic problem with wal_level is that to increase it you need to somehow ensure that all the backends have the new setting, and then checkpoint. Right now, the backends get the value through the GUC machinery, and so there's no particular bound on how long it could take for them to pick up the new value. I think if we could find some way of making sure that the backends got the new value in a reasonably timely fashion, we'd be pretty close to being able to do this. But it's hard to see how to do that. I had some vague idea of creating a mechanism for broadcasting critical parameter changes. You'd make a structure in shared memory containing the "canonical" values of wal_level and all other critical variables, and the structure would also contain a 64-bit counter. 
Whenever you want to make a parameter change, you lock the structure, make your change, bump the counter, and release the lock. Then, there's a second structure, also in shared memory, where backends report the value that the counter had the last time they updated their local copies of the structure from the shared structure. You can watch that to find out when everyone's guaranteed to have the new value. If someone doesn't respond quickly enough, you could send them a signal to get them moving. What would really be ideal is if you could actually make this safe enough that the interrupt service routine could do all the work, rather than just setting a flag. Or maybe CHECK_FOR_INTERRUPTS(). If you can't make it safe enough to put it in someplace pretty low-level like that, the whole idea might fall apart, because it wouldn't be useful to have a way of doing this that mostly works except sometimes it just sits there and hangs for a really long time. All pie in the sky at this point... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > I think that the basic problem with wal_level is that to increase it > you need to somehow ensure that all the backends have the new setting, > and then checkpoint. Right now, the backends get the value through > the GUC machinery, and so there's no particular bound on how long it > could take for them to pick up the new value. I think if we could > find some way of making sure that the backends got the new value in a > reasonably timely fashion, we'd be pretty close to being able to do > this. But it's hard to see how to do that. Well, you just said when to force the "reload" to take effect: at checkpoint time. IIRC we already multiplex SIGUSR1, is that possible to add that behavior here? And signal every backend at checkpoint time when wal_level has changed? > I had some vague idea of creating a mechanism for broadcasting > critical parameter changes. You'd make a structure in shared memory > containing the "canonical" values of wal_level and all other critical > variables, and the structure would also contain a 64-bit counter. > Whenever you want to make a parameter change, you lock the structure, > make your change, bump the counter, and release the lock. Then, > there's a second structure, also in shared memory, where backends > report the value that the counter had the last time they updated their > local copies of the structure from the shared structure. You can > watch that to find out when everyone's guaranteed to have the new > value. If someone doesn't respond quickly enough, you could send them > a signal to get them moving. What would really be ideal is if you > could actually make this safe enough that the interrupt service > routine could do all the work, rather than just setting a flag. Or > maybe CHECK_FOR_INTERRUPTS(). 
If you can't make it safe enough to put > it in someplace pretty low-level like that, the whole idea might fall > apart, because it wouldn't be useful to have a way of doing this that > mostly works except sometimes it just sits there and hangs for a > really long time. > > All pie in the sky at this point... Unless we manage to simplify the idea enough to make wal_level changeable on SIGHUP. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Jan 20, 2011 at 2:10 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I think that the basic problem with wal_level is that to increase it >> you need to somehow ensure that all the backends have the new setting, >> and then checkpoint. Right now, the backends get the value through >> the GUC machinery, and so there's no particular bound on how long it >> could take for them to pick up the new value. I think if we could >> find some way of making sure that the backends got the new value in a >> reasonably timely fashion, we'd be pretty close to being able to do >> this. But it's hard to see how to do that. > > Well, you just said when to force the "reload" to take effect: at > checkpoint time. IIRC we already multiplex SIGUSR1, is that possible to > add that behavior here? And signal every backend at checkpoint time > when wal_level has changed? Sending them a signal seems like a promising approach, but the trick is guaranteeing that they've actually acted on it before you start the checkpoint. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Sending them a signal seems like a promising approach, but the trick > is guaranteeing that they've actually acted on it before you start the > checkpoint. How much would using a latch help here? Or would it be overkill? -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On 20.01.2011 22:15, Dimitri Fontaine wrote: > Robert Haas<robertmhaas@gmail.com> writes: >> Sending them a signal seems like a promising approach, but the trick >> is guaranteeing that they've actually acted on it before you start the >> checkpoint. > > How much using a latch here would help? Or be overkill? A latch doesn't give you an acknowledgment from the backends that they've received and acted on the guc change. You could use it as a building block to construct that, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Jan 20, 2011 at 2:10 PM, Dimitri Fontaine > <dimitri@2ndquadrant.fr> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> I think that the basic problem with wal_level is that to increase it >>> you need to somehow ensure that all the backends have the new setting, >>> and then checkpoint. >> >> Well, you just said when to force the "reload" to take effect: at >> checkpoint time.  IIRC we already multiplex SIGUSR1, is that possible to >> add that behavior here?  And signal every backend at checkpoint time >> when wal_level has changed? > Sending them a signal seems like a promising approach, but the trick > is guaranteeing that they've actually acted on it before you start the > checkpoint. Have the backends show their current wal_level in their PGPROC entries. Sleep till they're all reporting the right thing, then fire checkpoint. regards, tom lane
On Fri, Jan 21, 2011 at 1:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> On Thu, Jan 20, 2011 at 10:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> I'm not sure why that's the right solution. Why do you think that we should >>>> not create the tablespace under the $PGDATA directory? I'm not surprised >>>> that people mount the filesystem on $PGDATA/mnt and create the >>>> tablespace on it. > >>> No? Usually, having a mount point in a non-root-owned directory is >>> considered a Bad Thing. > >> Hmm.. but ISTM we can have a root-owned mount point in $PGDATA >> and create a tablespace there. > > Nonsense. The more general statement is that it's a security hole > unless the mount point *and everything above it* is root owned. Probably true. But we cannot create a tablespace in a root-owned directory. The directory must be owned by the PostgreSQL system user. So ISTM you're saying that creating a tablespace on a mount point is itself a security hole. > In the case you sketch, there would be nothing to stop the (non root) > postgres user from renaming $PGDATA/mnt to something else and then > inserting his own trojan-horse directories. Hmm.. can a non-root postgres user really rename the root-owned directory while it's mounted? > Moreover, I see no positive *good* reason to do it. There isn't > anyplace under $PGDATA that users should be randomly creating > directories, much less mount points. When taking a base backup, you don't need to take a backup of tablespaces separately from that of $PGDATA. You only have to take a backup of $PGDATA. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Wed, Jan 19, 2011 at 1:12 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >>> + r = PQgetCopyData(conn, &copybuf, 0); >>> + if (r == -1) >>> >>> Since -1 of PQgetCopyData might indicate an error, in this case, >>> we would need to call PQgetResult?. >> >> Uh, -1 means end of data, no? -2 means error? > > The comment in pqGetCopyData3 says > > /* > * On end-of-copy, exit COPY_OUT or COPY_BOTH mode and let caller > * read status with PQgetResult(). The normal case is that it's > * Copy Done, but we let parseInput read that. If error, we expect > * the state was already changed. > */ > > Also the comment in getCopyDataMessage says > > /* > * If it's a legitimate async message type, process it. (NOTIFY > * messages are not currently possible here, but we handle them for > * completeness.) Otherwise, if it's anything except Copy Data, > * report end-of-copy. > */ > > So I thought that. BTW, walreceiver has already done that. When PQgetCopyData returns -1, PQgetResult should be called. This is true. But when I read the patch again, I found that Magnus has already done that. So my comment missed the point :( Sorry for the noise. + res = PQgetResult(conn); + if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) + { + fprintf(stderr, _("%s: final receive failed: %s\n"), + progname, PQerrorMessage(conn)); + exit(1); + } Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Fri, Jan 21, 2011 at 07:02, Fujii Masao <masao.fujii@gmail.com> wrote: > On Fri, Jan 21, 2011 at 1:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Fujii Masao <masao.fujii@gmail.com> writes: >>> On Thu, Jan 20, 2011 at 10:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> In the case you sketch, there would be nothing to stop the (non root) >> postgres user from renaming $PGDATA/mnt to something else and then >> inserting his own trojan-horse directories. > > Hmm.. can non-root postgres user really rename the root-owned directory > while it's being mounted? No, but you can rename the parent directory of it, and then create another directory inside it with the same name as the root owned directory had. >> Moreover, I see no positive *good* reason to do it. There isn't >> anyplace under $PGDATA that users should be randomly creating >> directories, much less mount points. > > When taking a base backup, you don't need to take a backup of tablespaces > separately from that of $PGDATA. You have only to take a backup of $PGDATA. But why are you creating tablespaces in the first place, if you're sticking them in $PGDATA? I'd put myself in the +1 camp for "throw an error when someone tries to create a tablespace inside $PGDATA". -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Fujii Masao <masao.fujii@gmail.com> writes: > Probably true. But we cannot create a tablespace for root-owned directory. > The directory must be owned by the PostgreSQL system user. So ISTM that > you says that creating a tablespace on a mount point itself is a security hole. Generally, the root user would have to mount the filesystem and then create a Postgres-owned directory under it, yes. This is a feature not a bug. >> In the case you sketch, there would be nothing to stop the (non root) >> postgres user from renaming $PGDATA/mnt to something else and then >> inserting his own trojan-horse directories. > Hmm.. can non-root postgres user really rename the root-owned directory > while it's being mounted? If you have write privilege on the parent directory, you can rename any filesystem entry. >> Moreover, I see no positive *good* reason to do it. There isn't >> anyplace under $PGDATA that users should be randomly creating >> directories, much less mount points. > When taking a base backup, you don't need to take a backup of tablespaces > separately from that of $PGDATA. You have only to take a backup of $PGDATA. Doesn't work, and doesn't tell you it didn't work, if the mount point isn't mounted. I believe "what happens if the secondary filesystem isn't mounted" is exactly one of the basic reasons for the mount-points-must-be-owned-by-root rule. Otherwise, applications may scribble directly on the / drive, which results in serious problems when the mount eventually comes back. There's an example in our archives (from Joe Conway if memory serves) about someone destroying their database that way. regards, tom lane
On Thu, Jan 20, 2011 at 17:17, Magnus Hagander <magnus@hagander.net> wrote: > On Thu, Jan 20, 2011 at 16:45, Bruce Momjian <bruce@momjian.us> wrote: >> Do we envision pg_basebackup as something we will enhance, and if so, >> should we consider a generic name? > > Well, it's certainly going to be enhanced. I think there are two main > uses for it - backups, and setting up replication slaves. I can't see > it expanding beyond those, really. I've committed this with the current name, pg_basebackup, before the bikeshed hits all the colors of the rainbow. If there's too much uproar, we can always rename it - it's a lot easier now that we have git :P Base backups is something we discuss regularly, so it's not a new term. And I don't see why people would be confused that this is a tool that you run on the client (which can be the same machine) - after all, we don't call those tools pg_receive_dump, pg_receive_dumpall, or pg_send_restore. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Sun, Jan 23, 2011 at 8:33 PM, Magnus Hagander <magnus@hagander.net> wrote: > I've committed this with the current name, pg_basebackup Great! But, per the subsequent commit logs, I should have reviewed the portability issues more carefully :( Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center