Re: refactoring basebackup.c - Mailing list pgsql-hackers

From Robert Haas
Subject Re: refactoring basebackup.c
Msg-id CA+Tgmob6Rnjz-Qv32h3yJn8nnUkLhrtQDAS4y5AtsgtorAFHRA@mail.gmail.com
In response to Re: refactoring basebackup.c  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: refactoring basebackup.c  (Andres Freund <andres@anarazel.de>)
Re: refactoring basebackup.c  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Thu, Mar 10, 2022 at 8:02 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> I'm getting errors from pg_basebackup when using both -D- and --compress=server-*
> The issue seems to go away if I use --no-manifest.
>
> $ ./src/bin/pg_basebackup/pg_basebackup -h /tmp -Ft -D- --wal-method none --compress=server-gzip >/dev/null ; echo $?
> pg_basebackup: error: tar member has empty name
> 1
>
> $ ./src/bin/pg_basebackup/pg_basebackup -h /tmp -Ft -D- --wal-method none --compress=server-gzip >/dev/null ; echo $?
> NOTICE:  WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
> pg_basebackup: error: COPY stream ended before last file was finished
> 1

Thanks for the report. The problem here is that, when the output is
standard output (-D -), pg_basebackup can only produce a single output
file, so the manifest gets injected into the tar file on the client
side rather than being written separately as we do in normal cases.
However, that only works if we're receiving a tar file that we can
parse from the server, and here the server is sending a compressed
tarfile. The current code mistakenly attempts to parse the compressed
tarfile as if it were an uncompressed tarfile, which causes the error
messages that you are seeing (and which I can also reproduce here). We
actually have enough infrastructure available in pg_basebackup now
that we could do the "right thing" in this case: decompress the data
received from the server, parse the resulting tar file, inject the
backup manifest, construct a new tar file, and recompress. However, I
think that's probably not a good idea, because it's unlikely that the
user will understand that the data is being compressed on the server,
then decompressed, and then recompressed, and the performance of
the resulting pipeline will probably not be very good. So I think we
should just refuse this command. Patch for that attached.
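
To make the rule concrete, here is a minimal sketch of the kind of
up-front option check described above. It is not the attached patch:
the struct, enum, function names, and error text are all hypothetical
and chosen only for illustration. It assumes the three relevant facts
(output target, compression location, whether a manifest is wanted)
are already known after option parsing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; not pg_basebackup's real ones. */
typedef enum
{
    COMPRESS_LOC_NONE,      /* no compression requested */
    COMPRESS_LOC_CLIENT,    /* pg_basebackup compresses the data */
    COMPRESS_LOC_SERVER     /* server sends an already-compressed tar */
} CompressLocation;

typedef struct
{
    const char *target_dir;         /* "-" means standard output */
    CompressLocation compress_loc;
    bool        want_manifest;      /* false when --no-manifest is given */
} BackupOptions;

/* With -D -, only a single output file can be produced. */
static bool
writes_to_stdout(const BackupOptions *opts)
{
    return strcmp(opts->target_dir, "-") == 0;
}

/*
 * Return NULL if the combination is acceptable, otherwise an error
 * message. The rejected case is the one where the manifest would have
 * to be injected into a tar stream that the client cannot parse,
 * because it arrives compressed from the server.
 */
const char *
check_backup_options(const BackupOptions *opts)
{
    if (writes_to_stdout(opts) &&
        opts->want_manifest &&
        opts->compress_loc == COMPRESS_LOC_SERVER)
        return "cannot inject manifest into a server-compressed tar file";

    return NULL;
}
```

With a check of this shape, the reported command would fail immediately
with one clear message, while adding --no-manifest, switching to
client-side compression, or writing to a directory would all still be
accepted.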

-- 
Robert Haas
EDB: http://www.enterprisedb.com

