Re: pg_basebackup's --gzip switch misbehaves - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: pg_basebackup's --gzip switch misbehaves
Date
Msg-id YxQ1tIYF8hbrNO7q@paquier.xyz
Whole thread Raw
In response to pg_basebackup's --gzip switch misbehaves  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pg_basebackup's --gzip switch misbehaves
List pgsql-hackers
On Sat, Sep 03, 2022 at 11:11:29AM -0400, Tom Lane wrote:
> It makes sense that base.tar.gz is compressed a little better with
> --gzip than with level-1 compression, but why is pg_wal.tar.gz not
> compressed at all?  It looks like the problem probably boils down to
> which of "-1" and "0" means "default behavior" vs "no compression",
> with different code layers interpreting that differently.  I can't
> find exactly where that's happening, but I did manage to stop the
> failures with this crude hack:

There is a distinction coming in pg_basebackup.c from the way we
deparse the compression specification and the default compression
level that should be assigned if there is no level directly specified
by the user.  It seems to me that the error comes from this code in
BaseBackup() when we are under STREAM_WAL (default):

    if (client_compress->algorithm == PG_COMPRESSION_GZIP)
    {
        wal_compress_algorithm = PG_COMPRESSION_GZIP;
        wal_compress_level =
           (client_compress->options & PG_COMPRESSION_OPTION_LEVEL)
           != 0 ? client_compress->level : 0;

ffd5365 has missed that wal_compress_level should be set to
Z_DEFAULT_COMPRESSION if there is nothing set in the compression
spec for a zlib build.  pg_receivewal.c enforces that already.

> That's not right as a real fix, because it would have the effect
> that "--compress gzip:0" would also invoke default compression,
> whereas what it should do is produce the uncompressed output
> we're actually getting.  Both cases have compression_level == 0
> by the time we reach here, though.

Nope, that would not be right.

> BTW, I'm fairly astonished that anyone would have thought that three
> complete pg_basebackup cycles testing essentially-identical options
> were a good use of developer time and buildfarm cycles from here to
> eternity.  Even if digging into it did expose a bug, the test case
> deserves little credit for that, because it entirely failed to call
> attention to the problem.  I had to whack the script pretty hard
> just to get it to not delete the evidence.

The introduction of the compression specification has introduced a lot
of patterns where we expect or not expect compression to happen, and
on top of that this needs to be careful about backward-compatibility.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: build remaining Flex files standalone
Next
From: Michael Paquier
Date:
Subject: Re: freeing LDAPMessage in CheckLDAPAuth