Thread: Add pg_basebackup single tar output format

Add pg_basebackup single tar output format

From
Josh Elsasser
Date:
Hi. I have a need to pipe the output from pg_basebackup for a
multi-tablespace cluster into another program without spooling to
disk. Seeing as the current -F tar output format can't do that, I've
made an attempt at implementing that myself.

As a side effect I've refactored the some of the pg_basebackup code
for readability and reusability, as well as implemented support for
longer filenames in tar output (not used by default for
compatibility). There is also a fix for a bug where pg_basebackup will
drop a zero-length file at the end of a tablespace's tar stream.

I've put my changes up as a series of relatively small commits on this
branch of a github fork:

https://github.com/jre/postgres/commits/single-tar

Comments and suggestions are welcome.



Re: Add pg_basebackup single tar output format

From
Andres Freund
Date:
Hi Josh,

On 2015-09-29 14:38:11 -0700, Josh Elsasser wrote:
> As a side effect I've refactored the some of the pg_basebackup code
> for readability and reusability

Cool, that's desperately needed. I've been trying to bug Magnus into
doing that for a bunch of conferences now ;)

> I've put my changes up as a series of relatively small commits on this
> branch of a github fork:
> 
> https://github.com/jre/postgres/commits/single-tar
> 
> Comments and suggestions are welcome.

Please post them to the list, we want patches to be in our archives.

Regards

Andres



[PATCH 1/6] Add support for longer filenames in tar headers (up to 254 bytes).

From
Joshua Elsasser
Date:
New functions tarHeaderRename() and tarHeaderGetName() are exposed to
store and retrieve the longer filenames.

tarCreateHeader() continues to limit filenames to 99 bytes to preserve
compatability with existing consumers.
---src/include/pgtar.h |   2 +src/port/tar.c      | 134 ++++++++++++++++++++++++++++++++++++++++++++++------2 files
changed,121 insertions(+), 15 deletions(-)
 

diff --git a/src/include/pgtar.h b/src/include/pgtar.h
index 906db7c..b1c68fc 100644
--- a/src/include/pgtar.h
+++ b/src/include/pgtar.h
@@ -20,4 +20,6 @@ enum tarError};extern enum tarError tarCreateHeader(char *h, const char *filename, const char
*linktarget,size_t size, mode_t mode, uid_t uid, gid_t gid, time_t mtime);
 
+extern enum tarError tarHeaderRename(char *h, const char *filename);extern int    tarChecksum(char *header);
+extern size_t tarHeaderGetName(const char *h, char *buf, size_t buflen);
diff --git a/src/port/tar.c b/src/port/tar.c
index 72fd4e1..23f0201 100644
--- a/src/port/tar.c
+++ b/src/port/tar.c
@@ -45,6 +45,122 @@ tarChecksum(char *header)/*
+ * Split a file path for use in a tar header. The return value is the second
+ * half of the path if a split is required and possible, NULL otherwise.
+ */
+static const char *
+tarSplitName(const char *filename, size_t len)
+{
+    const char *sep;
+
+    if (len <= 99)
+        sep = NULL;
+    else if ((sep = strchr(&filename[len-99], '/')) != NULL)
+        sep++;
+
+    return NULL;
+}
+
+
+/*
+ * Fill in the name and prefix fields of the tar format header pointed to by
+ * h. This should only be used when initially filling out the header.
+ */
+static enum tarError
+tarHeaderSetName(char *h, const char *filename, bool isdir, bool longname)
+{
+    const char *prefix, *base;
+    size_t len, baselen, prelen;
+
+    len = strlen(filename);
+    if (longname)
+        base = tarSplitName(filename, len);
+    else
+        base = NULL;
+    if (base == NULL)
+    {
+        prefix = "";
+        prelen = 0;
+        base = filename;
+        baselen = len;
+    }
+    else
+    {
+        prefix = filename;
+        prelen = (base - filename) - 1;
+        baselen = len - (base - filename);
+    }
+
+    /*
+     * Save room for a trailing slash if this is a directory or symlink to a
+     * directory
+     */
+    if (isdir && base[baselen-1] != '/')
+        baselen++;
+
+    if (baselen > 99 || prelen > 154)
+        return TAR_NAME_TOO_LONG;
+
+    memcpy(&h[345], prefix, prelen);
+    memset(&h[345+prelen], 0, 155 - prelen);
+    memcpy(&h[0], base, baselen);
+    memset(&h[0+baselen], 0, 100 - baselen);
+
+    /*
+     * We only support symbolic links to directories, and this is
+     * indicated in the tar format by adding a slash at the end of the
+     * name, the same as for regular directories.
+     */
+    if (isdir && base[baselen-1] != '/')
+        h[0+baselen-1] = '/';
+
+    return TAR_OK;
+}
+
+
+/*
+ * Wrapper around tarHeaderSetName() to be used when changing the name in an
+ * existing header.
+ */
+enum tarError
+tarHeaderRename(char *h, const char *filename)
+{
+    size_t            len;
+    bool            isdir;
+    enum tarError    err;
+
+    /* If the existing name ends with a slash then this must be preserved */
+    len = strnlen(h, 100);
+    isdir = (len > 0 && h[len-1] == '/');
+
+    err = tarHeaderSetName(h, filename, isdir, true);
+
+    if (err == TAR_OK)
+        /* Recalculate checksum as per tarCreateHeader() */
+        sprintf(&h[148], "%06o ", tarChecksum(h));
+
+    return err;
+}
+
+
+/*
+ * Copy the full pathname from the tar header pointed to by h into
+ * buf. Returns the total length of the path, if this is >= len then the path
+ * has been truncated.
+ */
+size_t
+tarHeaderGetName(const char *h, char *buf, size_t buflen)
+{
+    strlcpy(buf, &h[345], buflen);
+    if (buflen && buf[0] != '\0')
+        strlcat(buf, "/", buflen);
+    strlcat(buf, &h[0], buflen);
+
+    return (strlen(&h[345]) + 1 + strlen(&h[0]));
+}
+
+
+/* * Fill in the buffer pointed to by h with a tar format header. This buffer * must always have space for 512
characters,which is a requirement by * the tar format.
 
@@ -68,20 +184,8 @@ tarCreateHeader(char *h, const char *filename, const char *linktarget,    memset(h, 0, 512);
   /* assume tar header size */    /* Name 100 */
 
-    strlcpy(&h[0], filename, 100);
-    if (linktarget != NULL || S_ISDIR(mode))
-    {
-        /*
-         * We only support symbolic links to directories, and this is
-         * indicated in the tar format by adding a slash at the end of the
-         * name, the same as for regular directories.
-         */
-        int            flen = strlen(filename);
-
-        flen = Min(flen, 99);
-        h[flen] = '/';
-        h[flen + 1] = '\0';
-    }
+    tarHeaderSetName(h, filename,
+      (linktarget != NULL || S_ISDIR(mode)), false);    /* Mode 8 - this doesn't include the file type bits (S_IFMT)
*/   sprintf(&h[100], "%07o ", (int) (mode & 07777));
 
@@ -139,7 +243,7 @@ tarCreateHeader(char *h, const char *filename, const char *linktarget,    /* Minor Dev 8 */
sprintf(&h[337],"%07o ", 0);
 
-    /* Prefix 155 - not used, leave as nulls */
+    /* Prefix 155 - not currently used, leave as nulls */    /*     * We mustn't overwrite the next field while
insertingthe checksum.
 
-- 
2.3.0




Re: Add pg_basebackup single tar output format

From
Joshua Elsasser
Date:
On Tue, Sep 29, 2015 at 11:50:37PM +0200, Andres Freund wrote:
> On 2015-09-29 14:38:11 -0700, Josh Elsasser wrote:
> > I've put my changes up as a series of relatively small commits on this
> > branch of a github fork:
> > 
> > https://github.com/jre/postgres/commits/single-tar
> > 
> > Comments and suggestions are welcome.
> 
> Please post them to the list, we want patches to be in our archives.

Sure, let's see if I can figure out git send-email




[PATCH 3/6] pg_basebackup: factor out code to add a recovery.conf file to the tar file.

From
Joshua Elsasser
Date:
This is just a simple refactor for readability and reusability.
---src/bin/pg_basebackup/pg_basebackup.c | 46 ++++++++++++++++++++---------------1 file changed, 26 insertions(+), 20
deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa942ab..f73dd38 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -111,6 +111,7 @@ static void progress_report(int tablespacenum, const char *filename, bool force)static void
OpenTarFile(TarStream*tarfile, const char *path);static void CloseTarFile(TarStream *tarfile);
 
+static void TarInsertRecoveryConf(TarStream *stream);static void ReceiveTarFile(PGconn *conn, PGresult *res, int
rownum);staticvoid ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);static void
GenerateRecoveryConf(PGconn*conn);
 
@@ -874,6 +875,30 @@ CloseTarFile(TarStream *tarfile)/*
+ * Write a recovery.conf file into the tar stream.
+ */
+static void
+TarInsertRecoveryConf(TarStream *stream)
+{
+    static char    zerobuf[512];
+    char        header[512];
+    int            padding;
+
+    tarCreateHeader(header, "recovery.conf", NULL,
+                    recoveryconfcontents->len,
+                    0600, 04000, 02000,
+                    time(NULL));
+
+    padding = ((recoveryconfcontents->len + 511) & ~511) - recoveryconfcontents->len;
+
+    writeTarData(stream, header, sizeof(header));
+    writeTarData(stream, recoveryconfcontents->data, recoveryconfcontents->len);
+    if (padding)
+        writeTarData(stream, zerobuf, padding);
+}
+
+
+/* * Open a (possibly zlib-compressed) tar file for writing. The filebase * argument should be the desired filename
relativeto basedir, without a .tar * or .tar.gz file extension. If the user specified a basedir of - then stdout
 
@@ -957,27 +982,8 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)             * Also, write two completely
emptyblocks at the end of the tar             * file, as required by some tar programs.             */
 
-            char        zerobuf[1024];
-
-            MemSet(zerobuf, 0, sizeof(zerobuf));
-            if (basetablespace && writerecoveryconf)
-            {
-                char        header[512];
-                int            padding;
-
-                tarCreateHeader(header, "recovery.conf", NULL,
-                                recoveryconfcontents->len,
-                                0600, 04000, 02000,
-                                time(NULL));
-
-                padding = ((recoveryconfcontents->len + 511) & ~511) - recoveryconfcontents->len;
-
-                writeTarData(&stream, header, sizeof(header));
-                writeTarData(&stream, recoveryconfcontents->data, recoveryconfcontents->len);
-                if (padding)
-                    writeTarData(&stream, zerobuf, padding);
-            }
+                TarInsertRecoveryConf(&stream);            CloseTarFile(&stream);            break;
-- 
2.3.0




[PATCH 2/6] pg_basebackup: factor out tar open/close code for reusability.

From
Joshua Elsasser
Date:
This adds a simple struct and open and close functions to abstract and
isolate the zlib vs. stdio output logic and allow it to be reused.
---src/bin/pg_basebackup/pg_basebackup.c | 300 +++++++++++++++++-----------------1 file changed, 154 insertions(+), 146
deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 80de882..fa942ab 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -51,6 +51,15 @@ typedef struct TablespaceList    TablespaceListCell *tail;} TablespaceList;
+typedef struct TarStream {
+    char         path[MAXPGPATH];
+    FILE        *file;
+#ifdef HAVE_LIBZ
+    gzFile         zfile;
+#endif
+    bool         keepopen;
+} TarStream;
+/* Global options */static char *basedir = NULL;static TablespaceList tablespace_dirs = {NULL, NULL};
@@ -100,6 +109,8 @@ static void disconnect_and_exit(int code);static void verify_dir_is_empty_or_create(char
*dirname);staticvoid progress_report(int tablespacenum, const char *filename, bool force);
 
+static void OpenTarFile(TarStream *tarfile, const char *path);
+static void CloseTarFile(TarStream *tarfile);static void ReceiveTarFile(PGconn *conn, PGresult *res, int
rownum);staticvoid ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);static void
GenerateRecoveryConf(PGconn*conn);
 
@@ -725,169 +736,194 @@ parse_max_rate(char *src) * Write a piece of tar data */static void
-writeTarData(
-#ifdef HAVE_LIBZ
-             gzFile ztarfile,
-#endif
-             FILE *tarfile, char *buf, int r, char *current_file)
+writeTarData(TarStream *stream, char *buf, int r){#ifdef HAVE_LIBZ
-    if (ztarfile != NULL)
+    if (stream->zfile != NULL)    {
-        if (gzwrite(ztarfile, buf, r) != r)
+        if (gzwrite(stream->zfile, buf, r) != r)        {            fprintf(stderr,                    _("%s: could
notwrite to compressed file \"%s\": %s\n"),
 
-                    progname, current_file, get_gz_error(ztarfile));
+                    progname, stream->path, get_gz_error(stream->zfile));            disconnect_and_exit(1);        }
 }    else#endif    {
 
-        if (fwrite(buf, r, 1, tarfile) != 1)
+        if (fwrite(buf, r, 1, stream->file) != 1)        {            fprintf(stderr, _("%s: could not write to file
\"%s\":%s\n"),
 
-                    progname, current_file, strerror(errno));
+                    progname, stream->path, strerror(errno));            disconnect_and_exit(1);        }    }}
-#ifdef HAVE_LIBZ
-#define WRITE_TAR_DATA(buf, sz) writeTarData(ztarfile, tarfile, buf, sz, filename)
-#else
-#define WRITE_TAR_DATA(buf, sz) writeTarData(tarfile, buf, sz, filename)
-#endif/*
- * Receive a tar format file from the connection to the server, and write
- * the data from this file directly into a tar file. If compression is
- * enabled, the data will be compressed while written to the file.
- *
- * The file will be named base.tar[.gz] if it's for the main data directory
- * or <tablespaceoid>.tar[.gz] if it's for another tablespace.
- *
- * No attempt to inspect or validate the contents of the file is done.
+ * Open a (possibly zlib-compressed) tar file for writing.  A filename of -
+ * will cause stdout to be used. */static void
-ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)
+OpenTarFile(TarStream *tarfile, const char *path){
-    char        filename[MAXPGPATH];
-    char       *copybuf = NULL;
-    FILE       *tarfile = NULL;
-    char        tarhdr[512];
-    bool        basetablespace = PQgetisnull(res, rownum, 0);
-    bool        in_tarhdr = true;
-    bool        skip_file = false;
-    size_t        tarhdrsz = 0;
-    size_t        filesz = 0;
+    bool        use_stdout;
-#ifdef HAVE_LIBZ
-    gzFile        ztarfile = NULL;
-#endif
+    MemSet(tarfile, 0, sizeof(*tarfile));
+    use_stdout = (strcmp(path, "-") == 0);
+    strlcpy(tarfile->path, path, sizeof(tarfile->path));
-    if (basetablespace)
+#ifdef HAVE_LIBZ
+    if (compresslevel != 0)    {
-        /*
-         * Base tablespaces
-         */
-        if (strcmp(basedir, "-") == 0)
+        /* Compression is in use */
+        if (use_stdout)
+            tarfile->zfile = gzdopen(dup(fileno(stdout)), "wb");
+        else
+            tarfile->zfile = gzopen(tarfile->path, "wb");
+        if (!tarfile->zfile)        {
-#ifdef HAVE_LIBZ
-            if (compresslevel != 0)
-            {
-                ztarfile = gzdopen(dup(fileno(stdout)), "wb");
-                if (gzsetparams(ztarfile, compresslevel,
-                                Z_DEFAULT_STRATEGY) != Z_OK)
-                {
-                    fprintf(stderr,
-                            _("%s: could not set compression level %d: %s\n"),
-                            progname, compresslevel, get_gz_error(ztarfile));
-                    disconnect_and_exit(1);
-                }
-            }
-            else
-#endif
-                tarfile = stdout;
-            strcpy(filename, "-");
+            fprintf(stderr,
+                    _("%s: could not create compressed file \"%s\": %s\n"),
+                    progname, tarfile->path, strerror(errno));
+            disconnect_and_exit(1);        }
-        else
+        if (gzsetparams(tarfile->zfile, compresslevel,
+                        Z_DEFAULT_STRATEGY) != Z_OK)        {
-#ifdef HAVE_LIBZ
-            if (compresslevel != 0)
-            {
-                snprintf(filename, sizeof(filename), "%s/base.tar.gz", basedir);
-                ztarfile = gzopen(filename, "wb");
-                if (gzsetparams(ztarfile, compresslevel,
-                                Z_DEFAULT_STRATEGY) != Z_OK)
-                {
-                    fprintf(stderr,
-                            _("%s: could not set compression level %d: %s\n"),
-                            progname, compresslevel, get_gz_error(ztarfile));
-                    disconnect_and_exit(1);
-                }
-            }
-            else
-#endif
-            {
-                snprintf(filename, sizeof(filename), "%s/base.tar", basedir);
-                tarfile = fopen(filename, "wb");
-            }
+            fprintf(stderr,
+                    _("%s: could not set compression level %d: %s\n"),
+                    progname, compresslevel, get_gz_error(tarfile->zfile));
+            disconnect_and_exit(1);        }    }
+    else
+#endif    {
-        /*
-         * Specific tablespace
-         */
-#ifdef HAVE_LIBZ
-        if (compresslevel != 0)
-        {
-            snprintf(filename, sizeof(filename), "%s/%s.tar.gz", basedir,
-                     PQgetvalue(res, rownum, 0));
-            ztarfile = gzopen(filename, "wb");
-            if (gzsetparams(ztarfile, compresslevel,
-                            Z_DEFAULT_STRATEGY) != Z_OK)
-            {
-                fprintf(stderr,
-                        _("%s: could not set compression level %d: %s\n"),
-                        progname, compresslevel, get_gz_error(ztarfile));
-                disconnect_and_exit(1);
-            }
+        /* Either no zlib support, or zlib support but compresslevel = 0 */
+        if (use_stdout) {
+            tarfile->file = stdout;
+            tarfile->keepopen = true;        }        else
-#endif
+            tarfile->file = fopen(tarfile->path, "wb");
+        if (!tarfile->file)
+        {
+            fprintf(stderr, _("%s: could not create file \"%s\": %s\n"),
+                    progname, tarfile->path, strerror(errno));
+            disconnect_and_exit(1);
+        }
+    }
+}
+
+
+/*
+ * Close a tar file opened by OpenTarFile. The end-of-archive marker of two
+ * zero blocks will be written first.
+ */
+static void
+CloseTarFile(TarStream *tarfile)
+{
+    /* 2 * 512 bytes empty data */
+    static char        zerobuf[1024];
+
+    /*
+     * Write two completely empty blocks at the end of the tar file, as
+     * required by some tar programs.
+     */
+    writeTarData(tarfile, zerobuf, sizeof(zerobuf));
+
+    if (tarfile->keepopen)
+    {
+        if (fflush(tarfile->file) != 0)        {
-            snprintf(filename, sizeof(filename), "%s/%s.tar", basedir,
-                     PQgetvalue(res, rownum, 0));
-            tarfile = fopen(filename, "wb");
+            fprintf(stderr,
+                    _("%s: could not write to file \"%s\": %s\n"),
+                    progname, tarfile->path, strerror(errno));
+            disconnect_and_exit(1);        }
+        return;    }#ifdef HAVE_LIBZ
-    if (compresslevel != 0)
+    if (tarfile->zfile != NULL)    {
-        if (!ztarfile)
+        if (gzclose(tarfile->zfile) != 0)        {
-            /* Compression is in use */            fprintf(stderr,
-                    _("%s: could not create compressed file \"%s\": %s\n"),
-                    progname, filename, get_gz_error(ztarfile));
+                    _("%s: could not close compressed file \"%s\": %s\n"),
+                    progname, tarfile->path, get_gz_error(tarfile->zfile));            disconnect_and_exit(1);
}   }    else#endif    {
 
-        /* Either no zlib support, or zlib support but compresslevel = 0 */
-        if (!tarfile)
+        if (fclose(tarfile->file) != 0)        {
-            fprintf(stderr, _("%s: could not create file \"%s\": %s\n"),
-                    progname, filename, strerror(errno));
+            fprintf(stderr,
+                    _("%s: could not close file \"%s\": %s\n"),
+                    progname, tarfile->path, strerror(errno));            disconnect_and_exit(1);        }    }
+}
+
+
+/*
+ * Open a (possibly zlib-compressed) tar file for writing. The filebase
+ * argument should be the desired filename relative to basedir, without a .tar
+ * or .tar.gz file extension. If the user specified a basedir of - then stdout
+ * will be used.
+ */
+static void
+openRelativeTarFile(TarStream *tarfile, const char *filebase)
+{
+    char        path[MAXPGPATH];
+
+    if (strcmp(basedir, "-") == 0)
+        strlcpy(path, "-", sizeof(path));
+#ifdef HAVE_LIBZ
+    else if (compresslevel != 0)
+        snprintf(path, sizeof(path), "%s/%s.tar.gz", basedir, filebase);
+#endif
+    else
+        snprintf(path, sizeof(path), "%s/%s.tar", basedir, filebase);
+    OpenTarFile(tarfile, path);
+}
+
+
+/*
+ * Receive a tar format file from the connection to the server, and write
+ * the data from this file directly into a tar file. If compression is
+ * enabled, the data will be compressed while written to the file.
+ *
+ * The file will be named base.tar[.gz] if it's for the main data directory
+ * or <tablespaceoid>.tar[.gz] if it's for another tablespace.
+ *
+ * No attempt to inspect or validate the contents of the file is done.
+ */
+static void
+ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)
+{
+    TarStream    stream;
+    char       *copybuf = NULL;
+    char        tarhdr[512];
+    bool        basetablespace = PQgetisnull(res, rownum, 0);
+    bool        in_tarhdr = true;
+    bool        skip_file = false;
+    size_t        tarhdrsz = 0;
+    size_t        filesz = 0;
+
+    if (basetablespace)
+        /* Base tablespace */
+        openRelativeTarFile(&stream, "base");
+    else
+        /* Specific tablespace */
+        openRelativeTarFile(&stream, PQgetvalue(res, rownum, 0));    /*     * Get the COPY data stream
@@ -937,41 +973,13 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)                padding =
((recoveryconfcontents->len+ 511) & ~511) - recoveryconfcontents->len;
 
-                WRITE_TAR_DATA(header, sizeof(header));
-                WRITE_TAR_DATA(recoveryconfcontents->data, recoveryconfcontents->len);
+                writeTarData(&stream, header, sizeof(header));
+                writeTarData(&stream, recoveryconfcontents->data, recoveryconfcontents->len);                if
(padding)
-                    WRITE_TAR_DATA(zerobuf, padding);
-            }
-
-            /* 2 * 512 bytes empty data at end of file */
-            WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
-
-#ifdef HAVE_LIBZ
-            if (ztarfile != NULL)
-            {
-                if (gzclose(ztarfile) != 0)
-                {
-                    fprintf(stderr,
-                       _("%s: could not close compressed file \"%s\": %s\n"),
-                            progname, filename, get_gz_error(ztarfile));
-                    disconnect_and_exit(1);
-                }
-            }
-            else
-#endif
-            {
-                if (strcmp(basedir, "-") != 0)
-                {
-                    if (fclose(tarfile) != 0)
-                    {
-                        fprintf(stderr,
-                                _("%s: could not close file \"%s\": %s\n"),
-                                progname, filename, strerror(errno));
-                        disconnect_and_exit(1);
-                    }
-                }
+                    writeTarData(&stream, zerobuf, padding);            }
+            CloseTarFile(&stream);            break;        }        else if (r == -2)
@@ -988,7 +996,7 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)             * tablespace, we never have to
lookfor an existing recovery.conf             * file in the stream.             */
 
-            WRITE_TAR_DATA(copybuf, r);
+            writeTarData(&stream, copybuf, r);        }        else        {
@@ -1059,7 +1067,7 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)                         * header
unmodified.                        */                        if (!skip_file)
 
-                            WRITE_TAR_DATA(tarhdr, 512);
+                            writeTarData(&stream, tarhdr, 512);                    }                }
else
@@ -1077,7 +1085,7 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)                        bytes2write =
(filesz> rr ? rr : filesz);                        if (!skip_file)
 
-                            WRITE_TAR_DATA(copybuf + pos, bytes2write);
+                            writeTarData(&stream, copybuf + pos, bytes2write);                        rr -=
bytes2write;                       pos += bytes2write;
 
@@ -1098,9 +1106,9 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)            }        }        totaldone +=
r;
-        progress_report(rownum, filename, false);
+        progress_report(rownum, stream.path, false);    }                            /* while (1) */
-    progress_report(rownum, filename, true);
+    progress_report(rownum, stream.path, true);    if (copybuf != NULL)        PQfreemem(copybuf);
-- 
2.3.0




[PATCH 5/6] pg_basebackup: allow GetConnection() to make non-replication connections.

From
Joshua Elsasser
Date:
---src/bin/pg_basebackup/pg_basebackup.c  | 4 ++--src/bin/pg_basebackup/pg_receivexlog.c | 4
++--src/bin/pg_basebackup/pg_recvlogical.c| 4 ++--src/bin/pg_basebackup/streamutil.c     | 6
+++---src/bin/pg_basebackup/streamutil.h    | 2 +-5 files changed, 10 insertions(+), 10 deletions(-)
 

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index ccd0890..e29e466 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -454,7 +454,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)#endif    /* Get a second
connection*/
 
-    param->bgconn = GetConnection();
+    param->bgconn = GetConnection(true);    if (!param->bgconn)        /* Error message already written in
GetConnection()*/        exit(1);
 
@@ -1652,7 +1652,7 @@ BaseBackup(void)    /*     * Connect in replication mode to the server     */
-    conn = GetConnection();
+    conn = GetConnection(true);    if (!conn)        /* Error message already written in GetConnection() */
exit(1);
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index 0c322d1..3c61372 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -285,7 +285,7 @@ StreamLog(void)     * Connect in replication mode to the server     */    if (conn == NULL)
-        conn = GetConnection();
+        conn = GetConnection(true);    if (!conn)        /* Error message already written in GetConnection() */
return;
@@ -533,7 +533,7 @@ main(int argc, char **argv)    /*     * Obtain a connection before doing anything.     */
-    conn = GetConnection();
+    conn = GetConnection(true);    if (!conn)        /* error message already written in GetConnection() */
exit(1);
diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c
index 93f61c3..faf7cbf 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -216,7 +216,7 @@ StreamLogicalLog(void)     * Connect in replication mode to the server     */    if (!conn)
-        conn = GetConnection();
+        conn = GetConnection(true);    if (!conn)        /* Error message already written in GetConnection() */
return;
@@ -856,7 +856,7 @@ main(int argc, char **argv)     * helps to get more precise error messages about authentification,
  * required GUC parameters and such.     */
 
-    conn = GetConnection();
+    conn = GetConnection(true);    if (!conn)        /* Error message already written in GetConnection() */
exit(1);
diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
index 2c963b6..74cfb5b 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -50,7 +50,7 @@ PGconn       *conn = NULL; * call exit(1) directly. */PGconn *
-GetConnection(void)
+GetConnection(bool replication){    PGconn       *tmpconn;    int            argcount = 7;    /* dbname, replication,
fallback_app_name,
@@ -104,10 +104,10 @@ GetConnection(void)    }    keywords[i] = "dbname";
-    values[i] = dbname == NULL ? "replication" : dbname;
+    values[i] = dbname == NULL ? (replication ? "replication" : "postgres") : dbname;    i++;    keywords[i] =
"replication";
-    values[i] = dbname == NULL ? "true" : "database";
+    values[i] = replication ? (dbname == NULL ? "true" : "database") : "false";    i++;    keywords[i] =
"fallback_application_name";   values[i] = progname;
 
diff --git a/src/bin/pg_basebackup/streamutil.h b/src/bin/pg_basebackup/streamutil.h
index b95f83f..21a6331 100644
--- a/src/bin/pg_basebackup/streamutil.h
+++ b/src/bin/pg_basebackup/streamutil.h
@@ -28,7 +28,7 @@ extern char *replication_slot;/* Connection kept global so we can disconnect easily */extern PGconn
*conn;
-extern PGconn *GetConnection(void);
+extern PGconn *GetConnection(bool replication);/* Replication commands */extern bool CreateReplicationSlot(PGconn
*conn,const char *slot_name,
 
-- 
2.3.0




[PATCH 4/6] pg_basebackup: don't lose a zero-length file at the end of a tablespace.

From
Joshua Elsasser
Date:
After a complete tar header was buffered it would only be processed
during the next iteration of the read loop. A zero-length file such as
a directory had no data to read, so the loop would exit without ever
having processed the file.
---src/bin/pg_basebackup/pg_basebackup.c | 238 +++++++++++++++++++---------------1 file changed, 134 insertions(+), 104
deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index f73dd38..ccd0890 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -60,6 +60,15 @@ typedef struct TarStream {    bool         keepopen;} TarStream;
+typedef struct TarParser {
+    char        tarhdr[512];
+    TarStream    *stream;
+    bool        in_tarhdr;
+    bool        skip_file;
+    size_t        tarhdrsz;
+    size_t        filesz;
+} TarParser;
+/* Global options */static char *basedir = NULL;static TablespaceList tablespace_dirs = {NULL, NULL};
@@ -112,6 +121,10 @@ static void progress_report(int tablespacenum, const char *filename, bool force)static void
OpenTarFile(TarStream*tarfile, const char *path);static void CloseTarFile(TarStream *tarfile);static void
TarInsertRecoveryConf(TarStream*stream);
 
+static void IterateAndWriteTar(TarParser *tp, char *inbuf, int buflen,
+                               bool (*callback)(char *, void *), void *cbarg);
+static bool TarIterSkipRecoveryConf(char *h, void *arg);
+static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);static void ReceiveAndUnpackTarFile(PGconn *conn,
PGresult*res, int rownum);static void GenerateRecoveryConf(PGconn *conn);
 
@@ -899,6 +912,120 @@ TarInsertRecoveryConf(TarStream *stream)/*
+ * Process the individual files inside the TAR stream and pass their headers
+ * to a callback which can modify or chose to skip them. The stream consists
+ * of a header and zero or more chunks, all 512 bytes long. The stream from
+ * the server is broken up into smaller pieces, so we have to track the size
+ * of the files to find the next header structure.
+ */
+static void
+IterateAndWriteTar(TarParser *tp, char *inbuf, int buflen,
+                   bool (*callback)(char *, void *), void *cbarg)
+{
+    int            rr = buflen;
+    int            pos = 0;
+
+    while (rr > 0)
+    {
+        if (tp->in_tarhdr)
+        {
+            /*
+             * We're currently reading a header structure inside the
+             * TAR stream, i.e. the file metadata.
+             */
+            if (tp->tarhdrsz < 512)
+            {
+                /*
+                 * Copy the header structure into tarhdr in case the
+                 * header is not aligned to 512 bytes or it's not
+                 * returned in whole by the last PQgetCopyData call.
+                 */
+                int            hdrleft;
+                int            bytes2copy;
+
+                hdrleft = 512 - tp->tarhdrsz;
+                bytes2copy = (rr > hdrleft ? hdrleft : rr);
+
+                memcpy(&tp->tarhdr[tp->tarhdrsz], inbuf + pos, bytes2copy);
+
+                rr -= bytes2copy;
+                pos += bytes2copy;
+                tp->tarhdrsz += bytes2copy;
+            }
+
+            if (tp->tarhdrsz == 512)
+            {
+                /*
+                 * We have the complete header structure in tarhdr, let the
+                 * callback possibly modify it or chose to skip the file. Find
+                 * out the size of the file padded to the next multiple of 512
+                 */
+                int            padding;
+
+                tp->skip_file = callback(tp->tarhdr, cbarg);
+
+                sscanf(&tp->tarhdr[124], "%11o", (unsigned int *) &tp->filesz);
+
+                padding = ((tp->filesz + 511) & ~511) - tp->filesz;
+                tp->filesz += padding;
+
+                /* Next part is the file, not the header */
+                tp->in_tarhdr = false;
+
+                /*
+                 * If we're not skipping the file, write the tar
+                 * header unmodified.
+                 */
+                if (!tp->skip_file)
+                    writeTarData(tp->stream, tp->tarhdr, 512);
+            }
+        }
+        else
+        {
+            /*
+             * We're processing a file's contents.
+             */
+            if (tp->filesz > 0)
+            {
+                /*
+                 * We still have data to read (and possibly write).
+                 */
+                int            bytes2write;
+
+                bytes2write = (tp->filesz > rr ? rr : tp->filesz);
+
+                if (!tp->skip_file)
+                    writeTarData(tp->stream, inbuf + pos, bytes2write);
+
+                rr -= bytes2write;
+                pos += bytes2write;
+                tp->filesz -= bytes2write;
+            }
+            else
+            {
+                /*
+                 * No more data in the current file, the next piece of
+                 * data (if any) will be a new file header structure.
+                 */
+                tp->in_tarhdr = true;
+                tp->skip_file = false;
+                tp->tarhdrsz = 0;
+                tp->filesz = 0;
+            }
+        }
+    }
+}
+
+
+static bool
+TarIterSkipRecoveryConf(char *h, void *arg)
+{
+    /* Skip the file if the name is recovery.conf */
+    return (strcmp(&h[0], "recovery.conf") == 0);
+}
+
+
+/* * Open a (possibly zlib-compressed) tar file for writing. The filebase * argument should be the desired filename
relativeto basedir, without a .tar * or .tar.gz file extension. If the user specified a basedir of - then stdout
 
@@ -936,12 +1063,12 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum){    TarStream    stream;    char
*copybuf= NULL;
 
-    char        tarhdr[512];    bool        basetablespace = PQgetisnull(res, rownum, 0);
-    bool        in_tarhdr = true;
-    bool        skip_file = false;
-    size_t        tarhdrsz = 0;
-    size_t        filesz = 0;
+    TarParser    parser;
+
+    MemSet(&parser, 0, sizeof(parser));
+    parser.stream = &stream;
+    parser.in_tarhdr = true;    if (basetablespace)        /* Base tablespace */
@@ -1010,106 +1137,9 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)             * Look for a recovery.conf
inthe existing tar stream. If it's             * there, we must skip it so we can later overwrite it with our
 * own version of the file.
 
-             *
-             * To do this, we have to process the individual files inside the
-             * TAR stream. The stream consists of a header and zero or more
-             * chunks, all 512 bytes long. The stream from the server is
-             * broken up into smaller pieces, so we have to track the size of
-             * the files to find the next header structure.             */
-            int            rr = r;
-            int            pos = 0;
-
-            while (rr > 0)
-            {
-                if (in_tarhdr)
-                {
-                    /*
-                     * We're currently reading a header structure inside the
-                     * TAR stream, i.e. the file metadata.
-                     */
-                    if (tarhdrsz < 512)
-                    {
-                        /*
-                         * Copy the header structure into tarhdr in case the
-                         * header is not aligned to 512 bytes or it's not
-                         * returned in whole by the last PQgetCopyData call.
-                         */
-                        int            hdrleft;
-                        int            bytes2copy;
-
-                        hdrleft = 512 - tarhdrsz;
-                        bytes2copy = (rr > hdrleft ? hdrleft : rr);
-
-                        memcpy(&tarhdr[tarhdrsz], copybuf + pos, bytes2copy);
-
-                        rr -= bytes2copy;
-                        pos += bytes2copy;
-                        tarhdrsz += bytes2copy;
-                    }
-                    else
-                    {
-                        /*
-                         * We have the complete header structure in tarhdr,
-                         * look at the file metadata: - the subsequent file
-                         * contents have to be skipped if the filename is
-                         * recovery.conf - find out the size of the file
-                         * padded to the next multiple of 512
-                         */
-                        int            padding;
-
-                        skip_file = (strcmp(&tarhdr[0], "recovery.conf") == 0);
-
-                        sscanf(&tarhdr[124], "%11o", (unsigned int *) &filesz);
-
-                        padding = ((filesz + 511) & ~511) - filesz;
-                        filesz += padding;
-
-                        /* Next part is the file, not the header */
-                        in_tarhdr = false;
-
-                        /*
-                         * If we're not skipping the file, write the tar
-                         * header unmodified.
-                         */
-                        if (!skip_file)
-                            writeTarData(&stream, tarhdr, 512);
-                    }
-                }
-                else
-                {
-                    /*
-                     * We're processing a file's contents.
-                     */
-                    if (filesz > 0)
-                    {
-                        /*
-                         * We still have data to read (and possibly write).
-                         */
-                        int            bytes2write;
-
-                        bytes2write = (filesz > rr ? rr : filesz);
-
-                        if (!skip_file)
-                            writeTarData(&stream, copybuf + pos, bytes2write);
-
-                        rr -= bytes2write;
-                        pos += bytes2write;
-                        filesz -= bytes2write;
-                    }
-                    else
-                    {
-                        /*
-                         * No more data in the current file, the next piece of
-                         * data (if any) will be a new file header structure.
-                         */
-                        in_tarhdr = true;
-                        skip_file = false;
-                        tarhdrsz = 0;
-                        filesz = 0;
-                    }
-                }
-            }
+            IterateAndWriteTar(&parser, copybuf, r,
+                               TarIterSkipRecoveryConf, NULL);        }        totaldone += r;
progress_report(rownum,stream.path, false);
 
-- 
2.3.0




[PATCH 6/6] pg_basebackup: add a single-tar output format.

From
Joshua Elsasser
Date:
This will write one single tar file containing all tablespaces, and
can be written to stdout.
---src/bin/pg_basebackup/pg_basebackup.c | 282 ++++++++++++++++++++++++++++++++--1 file changed, 269 insertions(+), 13
deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index e29e466..b3534cb 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -29,6 +29,7 @@#include "getopt_long.h"#include "libpq-fe.h"#include "pqexpbuffer.h"
+#include "common/fe_memutils.h"#include "pgtar.h"#include "pgtime.h"#include "receivelog.h"
@@ -71,7 +72,9 @@ typedef struct TarParser {/* Global options */static char *basedir = NULL;
+static char *outpath = NULL;static TablespaceList tablespace_dirs = {NULL, NULL};
+static char *default_basetablespace = NULL;static char *xlog_dir = "";static char format = 'p';        /*
p(lain)/t(ar)*/static char *label = "pg_basebackup base backup";
 
@@ -117,15 +120,19 @@ static void usage(void);static void disconnect_and_exit(int code);static void
verify_dir_is_empty_or_create(char*dirname);static void progress_report(int tablespacenum, const char *filename, bool
force);
+static char *get_base_tablespace_path(void);static void OpenTarFile(TarStream *tarfile, const char *path);static void
CloseTarFile(TarStream*tarfile);
 
-static void TarInsertRecoveryConf(TarStream *stream);
+static void TarInsertRecoveryConf(TarStream *stream, const char *prefix);
+static void TarInsertDirectory(TarStream *stream, const char *path, mode_t mode);static void
IterateAndWriteTar(TarParser*tp, char *inbuf, int buflen,                               bool (*callback)(char *, void
*),void *cbarg);static bool TarIterSkipRecoveryConf(char *h, void *arg);
 
+static bool TarIterRenameForTablespace(char *h, void *arg);static void ReceiveTarFile(PGconn *conn, PGresult *res, int
rownum);
+static void ReceiveAndAppendTarFile(TarStream *tarfile, PGconn *conn, PGresult *res, int rownum);static void
ReceiveAndUnpackTarFile(PGconn*conn, PGresult *res, int rownum);static void GenerateRecoveryConf(PGconn *conn);static
voidWriteRecoveryConf(void);
 
@@ -259,7 +266,7 @@ usage(void)    printf(_("  %s [OPTION]...\n"), progname);    printf(_("\nOptions controlling the
output:\n"));   printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
 
-    printf(_("  -F, --format=p|t       output format (plain (default), tar)\n"));
+    printf(_("  -F, --format=p|t|s     output format (plain (default), tar, single-tar)\n"));    printf(_("  -r,
--max-rate=RATE   maximum transfer rate to transfer data directory\n"      "                         (in kB/s, or use
suffix\"k\" or \"M\")\n"));    printf(_("  -R, --write-recovery-conf\n"
 
@@ -746,6 +753,39 @@ parse_max_rate(char *src)    return (int32) result;}
+
+/*
+ * Returns the path of the server's data directory. The returned string is
+ * malloc'd.
+ */
+static char *
+get_base_tablespace_path(void)
+{
+    PGconn *sqlconn;
+    PGresult *res;
+    char *dir;
+
+    sqlconn = GetConnection(false);
+    if (!sqlconn)
+        /* Error message already written in GetConnection() */
+        disconnect_and_exit(1);
+
+    res = PQexec(sqlconn, "SELECT setting FROM pg_catalog.pg_settings "
+                 "WHERE name = 'data_directory';");
+    if (PQresultStatus(res) != PGRES_TUPLES_OK)
+    {
+        fprintf(stderr, _("%s: could not get server data_directory: %s"),
+                progname, PQerrorMessage(sqlconn));
+        disconnect_and_exit(1);
+    }
+
+    dir = pg_strdup(PQgetvalue(res, 0, 0));
+    PQclear(res);
+    PQfinish(sqlconn);
+    return dir;
+}
+
+/* * Write a piece of tar data */
@@ -891,7 +931,7 @@ CloseTarFile(TarStream *tarfile) * Write a recovery.conf file into the tar stream. */static void
-TarInsertRecoveryConf(TarStream *stream)
+TarInsertRecoveryConf(TarStream *stream, const char *prefix){    static char    zerobuf[512];    char
header[512];
@@ -901,6 +941,8 @@ TarInsertRecoveryConf(TarStream *stream)                    recoveryconfcontents->len,
     0600, 04000, 02000,                    time(NULL));
 
+    if (prefix != NULL)
+        TarIterRenameForTablespace(header, (void *)prefix);    padding = ((recoveryconfcontents->len + 511) & ~511) -
recoveryconfcontents->len;
@@ -912,6 +954,29 @@ TarInsertRecoveryConf(TarStream *stream)/*
+ * Write a directory into the tar stream.
+ */
+static void
+TarInsertDirectory(TarStream *stream, const char *path, mode_t mode)
+{
+    char       hdr[512];
+
+    /* set file type to directory */
+    mode = (mode & 07777) | 040000;
+
+    if (tarCreateHeader(hdr, path, NULL, 0,
+                        mode, 04000, 02000, time(NULL)) != TAR_OK)
+    {
+        fprintf(stderr, _("%s: filename too long for tar format: %s"),
+                progname, path);
+        disconnect_and_exit(1);
+    }
+
+    writeTarData(stream, hdr, 512);
+}
+
+
+/* * Process the individual files inside the TAR stream and pass their headers * to a callback which can modify or
choseto skip them. The stream consists * of a header and zero or more chunks, all 512 bytes long. The stream from
 
@@ -1026,6 +1091,52 @@ TarIterSkipRecoveryConf(char *h, void *arg)/*
+ * Adjusts the filename in the tar header pointed to by the first argument to
+ * place it under the directory in the second argument.
+ *
+ * Intended for use as a callback for IterateAndWriteTar() when iterating over
+ * a non-default tablespace's files. The callback argument should be the base
+ * tablespace directory.
+ */
+static bool
+TarIterRenameForTablespace(char *h, void *arg)
+{
+    const char *prefix = arg;
+    char        full[MAXPGPATH];
+    size_t        len;
+
+    /*
+     * Build a new path out of the tablespace prefix and the existing name in
+     * the header.
+     */
+    strlcpy(full, prefix, sizeof(full));
+    strlcat(full, "/", sizeof(full));
+    len = strlen(full);
+    if (tarHeaderGetName(h, &full[len], sizeof(full) - len) >= sizeof(full) - len)
+    {
+        tarHeaderGetName(h, full, sizeof(full));
+        fprintf(stderr, _("%s: filename too long: %s/%s"),
+                progname, prefix, full);
+        disconnect_and_exit(1);
+    }
+
+    /* clean up the path to remove /./ and such */
+    canonicalize_path(full);
+
+    /* Store the new name in the tar header */
+    if (tarHeaderRename(h, full) != TAR_OK)
+    {
+        fprintf(stderr, _("%s: filename too long for tar format: %s"),
+                progname, full);
+        disconnect_and_exit(1);
+    }
+
+    /* Never skip any files */
+    return false;
+}
+
+
+/* * Open a (possibly zlib-compressed) tar file for writing. The filebase * argument should be the desired filename
relativeto basedir, without a .tar * or .tar.gz file extension. If the user specified a basedir of - then stdout
 
@@ -1110,7 +1221,7 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)             * file, as required by some
tarprograms.             */            if (basetablespace && writerecoveryconf)
 
-                TarInsertRecoveryConf(&stream);
+                TarInsertRecoveryConf(&stream, NULL);            CloseTarFile(&stream);            break;
@@ -1151,6 +1262,108 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)}
+static bool
+tablespaceRenameAndSkipRecoveryConf(char *h, void *arg)
+{
+    if (TarIterSkipRecoveryConf(h, NULL))
+        return true;
+    TarIterRenameForTablespace(h, arg);
+    return false;
+}
+
+
+/*
+ * Receive a tar format file from the connection to the server, and write
+ * the data from this file directly into a tar file. If compression is
+ * enabled, the data will be compressed while written to the file.
+ *
+ * The file must have already been opened by the caller.
+ *
+ * No attempt to inspect or validate the contents of the file is done.
+ */
+static void
+ReceiveAndAppendTarFile(TarStream *stream, PGconn *conn, PGresult *res, int rownum)
+{
+    const char *prefix;
+    char       *copybuf = NULL;
+    int            buflen;
+    bool        basetablespace = PQgetisnull(res, rownum, 0);
+    TarParser    parser;
+
+    MemSet(&parser, 0, sizeof(parser));
+    parser.stream = stream;
+    parser.in_tarhdr = true;
+
+    /* Get the directory prefix for the tablespace */
+    prefix = get_tablespace_mapping(basetablespace ? default_basetablespace :
+                                    PQgetvalue(res, rownum, 1));
+
+    if (!basetablespace)
+        TarInsertDirectory(stream, prefix, 0700);
+
+    /*
+     * Get the COPY data stream
+     */
+    res = PQgetResult(conn);
+    if (PQresultStatus(res) != PGRES_COPY_OUT)
+    {
+        fprintf(stderr, _("%s: could not get COPY data stream: %s"),
+                progname, PQerrorMessage(conn));
+        disconnect_and_exit(1);
+    }
+
+    while ((buflen = PQgetCopyData(conn, ©buf, 0)) >= 0)
+    {
+        if (basetablespace && writerecoveryconf)
+        {
+            /*
+             * Look for a recovery.conf in the existing tar stream. If it's
+             * there, we must skip it so we can later overwrite it with our
+             * own version of the file.
+             */
+            IterateAndWriteTar(&parser, copybuf, buflen,
+                               tablespaceRenameAndSkipRecoveryConf, (void *)prefix);
+        }
+        else
+        {
+            /*
+             * Rename files in non-base tablespaces into the correct
+             * tablespace directory.
+             */
+            IterateAndWriteTar(&parser, copybuf, buflen,
+                               TarIterRenameForTablespace, (void *)prefix);
+        }
+
+        if (copybuf != NULL)
+        {
+            PQfreemem(copybuf);
+            copybuf = NULL;
+        }
+
+        totaldone += buflen;
+        progress_report(rownum, stream->path, false);
+    }                            /* while (1) */
+
+    if (buflen == -1)
+    {
+        /*
+         * End of chunk. If requested, and this is the base tablespace,
+         * write recovery.conf into the tarfile.
+         */
+        if (basetablespace && writerecoveryconf)
+            TarInsertRecoveryConf(stream, prefix);
+    }
+    else if (buflen == -2)
+    {
+        fprintf(stderr, _("%s: could not read COPY data: %s"),
+                progname, PQerrorMessage(conn));
+        disconnect_and_exit(1);
+    }
+
+    progress_report(rownum, stream->path, true);
+}
+
+/* * Retrieve tablespace path, either relocated or original depending on whether * -T was passed or not.
@@ -1648,6 +1861,7 @@ BaseBackup(void)    int            minServerMajor,                maxServerMajor;    int
 serverMajor;
 
+    TarStream    tarfile;    /*     * Connect in replication mode to the server
@@ -1683,6 +1897,9 @@ BaseBackup(void)        disconnect_and_exit(1);    }
+    if (format == 's')
+        default_basetablespace = get_base_tablespace_path();
+    /*     * Build contents of recovery.conf if requested     */
@@ -1711,7 +1928,7 @@ BaseBackup(void)                 fastcheckpoint ? "FAST" : "",                 includewal ?
"NOWAIT": "",                 maxrate_clause ? maxrate_clause : "",
 
-                 format == 't' ? "TABLESPACE_MAP" : "");
+                 (format == 't' || format == 's') ? "TABLESPACE_MAP" : "");    if (PQsendQuery(conn, basebkp) == 0)
{
@@ -1802,6 +2019,9 @@ BaseBackup(void)        fprintf(stderr,                _("%s: can only write single tablespace to
stdout,database has %d\n"),                progname, PQntuples(res));
 
+        fprintf(stderr,
+                _("HINT: the -F single-tar option always writes a single tar file which may\n"
+                  "be written to stdout.\n"));        disconnect_and_exit(1);    }
@@ -1817,6 +2037,9 @@ BaseBackup(void)        StartLogStreamer(xlogstart, starttli, sysidentifier);    }
+    if (format == 's')
+        OpenTarFile(&tarfile, outpath);
+    /*     * Start receiving chunks     */
@@ -1824,10 +2047,15 @@ BaseBackup(void)    {        if (format == 't')            ReceiveTarFile(conn, res, i);
+        else if (format == 's')
+            ReceiveAndAppendTarFile(&tarfile, conn, res, i);        else            ReceiveAndUnpackTarFile(conn, res,
i);   }                            /* Loop over all tablespaces */
 
+    if (format == 's')
+        CloseTarFile(&tarfile);
+    if (showprogress)    {        progress_report(PQntuples(res), NULL, true);
@@ -1982,6 +2210,7 @@ main(int argc, char **argv)        {"help", no_argument, NULL, '?'},        {"version",
no_argument,NULL, 'V'},        {"pgdata", required_argument, NULL, 'D'},
 
+        {"file", required_argument, NULL, 'f'},        {"format", required_argument, NULL, 'F'},        {"checkpoint",
required_argument,NULL, 'c'},        {"max-rate", required_argument, NULL, 'r'},
 
@@ -2027,7 +2256,7 @@ main(int argc, char **argv)        }    }
-    while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:S:wWvP",
+    while ((c = getopt_long(argc, argv, "D:f:F:r:RT:xX:l:zZ:d:c:h:p:U:s:S:wWvP",
long_options,&option_index)) != -1)    {        switch (c)
 
@@ -2035,15 +2264,20 @@ main(int argc, char **argv)            case 'D':                basedir = pg_strdup(optarg);
           break;
 
+            case 'f':
+                outpath = pg_strdup(optarg);
+                break;            case 'F':                if (strcmp(optarg, "p") == 0 || strcmp(optarg, "plain") ==
0)                   format = 'p';                else if (strcmp(optarg, "t") == 0 || strcmp(optarg, "tar") == 0)
             format = 't';
 
+                else if (strcmp(optarg, "s") == 0 || strcmp(optarg, "single-tar") == 0)
+                    format = 's';                else                {                    fprintf(stderr,
-                            _("%s: invalid output format \"%s\", must be \"plain\" or \"tar\"\n"),
+                            _("%s: invalid output format \"%s\", must be \"plain\", \"tar\" or \"single-tar\"\n"),
                      progname, optarg);                    exit(1);                }
 
@@ -2190,12 +2424,34 @@ main(int argc, char **argv)    /*     * Required arguments     */
-    if (basedir == NULL)
+    if (format == 's')    {
-        fprintf(stderr, _("%s: no target directory specified\n"), progname);
-        fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
-                progname);
-        exit(1);
+        if (basedir != NULL)
+        {
+            fprintf(stderr, _("%s: target directory cannot be specified in single-tar mode\n"), progname);
+            fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+                    progname);
+            exit(1);
+        }
+        if (outpath == NULL)
+            outpath = pg_strdup("-");
+    }
+    else
+    {
+        if (basedir == NULL)
+        {
+            fprintf(stderr, _("%s: no target directory specified\n"), progname);
+            fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+                    progname);
+            exit(1);
+        }
+        if (outpath != NULL)
+        {
+            fprintf(stderr, _("%s: target file can only be specified in single-tar mode\n"), progname);
+            fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+                    progname);
+            exit(1);
+        }    }    /*
@@ -2270,7 +2526,7 @@ main(int argc, char **argv)     * backups, always require the directory. For tar backups, require
it    * unless we are writing to stdout.     */
 
-    if (format == 'p' || strcmp(basedir, "-") != 0)
+    if (format == 'p' || (basedir != NULL && strcmp(basedir, "-")) != 0)
verify_dir_is_empty_or_create(basedir);   /* Create transaction log symlink, if required */
 
-- 
2.3.0




Re: Add pg_basebackup single tar output format

From
Michael Paquier
Date:
On Wed, Sep 30, 2015 at 7:16 AM, Joshua Elsasser <josh@idealist.org> wrote:
> On Tue, Sep 29, 2015 at 11:50:37PM +0200, Andres Freund wrote:
>> On 2015-09-29 14:38:11 -0700, Josh Elsasser wrote:
>> > I've put my changes up as a series of relatively small commits on this
>> > branch of a github fork:
>> >
>> > https://github.com/jre/postgres/commits/single-tar
>> >
>> > Comments and suggestions are welcome.

Great!

>>
>> Please post them to the list, we want patches to be in our archives.
>
> Sure, let's see if I can figure out git send-email

Usually it is recommended to add them directly to their dedicated
thread so as it is easier to follow the review flow, and you have sent
six patches, that's as much threads to follow... Here are some general
guidelines:
https://wiki.postgresql.org/wiki/Submitting_a_Patch

You should register your patch as well here so as we do not lose track of it:
https://commitfest.postgresql.org/7/
Regards,
-- 
Michael



On Tue, Sep 29, 2015 at 6:16 PM, Joshua Elsasser <josh@idealist.org> wrote:
> ---

Hi!

Thanks for submitting patches to the PostgreSQL community.  We need
more developers, and appreciate contributions.  However, I'm somewhat
flummoxed by this particular patch series, because (1) there's no real
explanation of what these patches are trying to accomplish (e.g. this
one has zero explanation other than the subject line), and (2) they
seem to be a mix of bug fixes and new features and general
refactoring.  Bug fixes we will want to back-patch to all supported
releases, but features should just go into master.  It's helpful if
you can explain what falls into which category.

Do some of these patches have dependencies on others?  If so, it would
be helpful to explain that.  I'm not inclined to bother with the
refactoring patches unless they enable some other thing that we agree
is a worthwhile goal.  Moving 20 lines of code into a function is
something that could be done in a lot of places in our source base,
but it doesn't seem like a worthy goal unto itself.

It would probably be a good idea to review
https://wiki.postgresql.org/wiki/Submitting_a_Patch -- the Linux style
of patch submission is generally not preferred here; we like to
discuss closely-related patches as a group, on a single thread, and
separate patches on completely separate threads.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Wed, Sep 30, 2015 at 10:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> It would probably be a good idea to review
> https://wiki.postgresql.org/wiki/Submitting_a_Patch -- the Linux style
> of patch submission is generally not preferred here; we like to
> discuss closely-related patches as a group, on a single thread, and
> separate patches on completely separate threads.

FWIW base thread is here:
http://www.postgresql.org/message-id/20150929213811.GF22290@idealist.org
-- 
Michael