Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From gkokolatos@pm.me
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id bZeDkb5raMHiji1UIjvXjVIwc5qCHcdAb5h3CXXp85D8Mjwrm4kIGPh8UVbc2sJlfKgx2KsDHSrF35JAr-kFW5GJOx8YQlLKLv5vmAYLEjQ=@pm.me
Whole thread Raw
In response to Add LZ4 compression in pg_dump  (Georgios <gkokolatos@protonmail.com>)
Responses Re: Add LZ4 compression in pg_dump  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
------- Original Message -------

On Wednesday, March 30th, 2022 at 7:54 AM, Michael Paquier <michael@paquier.xyz> wrote:

> On Tue, Mar 29, 2022 at 09:46:27AM +0000, gkokolatos@pm.me wrote:
> > On Tuesday, March 29th, 2022 at 9:27 AM, Michael Paquier michael@paquier.xyz wrote:
> > > On Sat, Mar 26, 2022 at 01:14:41AM -0500, Justin Pryzby wrote:
> > > + command_fails_like(
> > > + [ 'pg_dump', '--compress', '1', '--format', 'tar' ],
> > > This addition depending on HAVE_LIBZ is a good thing as a reminder of
> > > any work that could be done in 0002. Now that's waiting for 20 years
> > > so I would not hold my breath on this support. I think that this
> > > could be just applied first, with 0002 on top of it, as a first
> > > improvement.
> >
> > Excellent, thank you.
>
> I have applied the test for --compress and --format=tar, separating it
> from the rest.

Thank you.

> While moving on with 0002, I have noticed the following in
>
> _StartBlob():
>     if (AH->compression != 0)
>         sfx = ".gz";
>     else
>         sfx = "";
>
> Shouldn't this bit also be simplified, adding a fatal() like the other
> code paths, for safety?

Agreed. Fixed.

> > > + compress_cmd => [
> > > + $ENV{'GZIP_PROGRAM'},
> > > + '-f',
> > > [...]
> > > + $ENV{'GZIP_PROGRAM'},
> > > + '-k', '-d',
> > > -f and -d are available everywhere I looked at, but is -k/--keep a
> > > portable choice with a gzip command? I don't see this option in
> > > OpenBSD, for one. So this test is going to cause problems on those
> > > buildfarm machines, at least. Couldn't this part be replaced by a
> > > simple --test to check that what has been compressed is in correct
> > > shape? We know that this works, based on our recent experiences with
> > > the other tests.
> >
> > I would argue that the simple '--test' will not do in this case, as the
> > TAP tests do need a file named <test>.sql to compare the contents with.
> > This file is generated either directly by pg_dump itself, or by running
> > pg_restore on pg_dump's output. In the case of compression pg_dump will
> > generate a <test>.sql.<compression program suffix> which can not be
> > used in the comparison tests. So the intention of this block, is not to
> > simply test for validity, but to also decompress pg_dump's output for it
> > to be able to be used.
>
> Ahh, I see, thanks. I would add a comment about that in the area of
> compression_gzip_plain_format.

Agreed. Comment added.

> + my $supports_compression = check_pg_config("#define HAVE_LIBZ 1");
>
> This part could be moved within the if block a couple of lines down.

I moved it instead out of the for loop above to not have to call it on
each iteration.

> + my $compress_program = $ENV{GZIP_PROGRAM};
>
> It seems to me that it is enough to rely on {compress_cmd}, hence
> there should be no need for $compress_program, no?

Maybe not. We don't want to the tests to fail if the utility is not
installed. That becomes even more evident as more methods are added.
However I realized that the presence of the environmental variable does
not guarrantee that the utility is actually installed. In the attached,
the existance of the utility is based on the return value of system_log().

> It seems to me that we should have a description for compress_cmd at
> the top of 002_pg_dump.pl (close to "Definition of the pg_dump runs to
> make"). There is an order dependency with restore_cmd.

Agreed. Comment added.

Cheers,
//Georgios
Attachment

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Extend compatibility of PostgreSQL::Test::Cluster
Next
From: Robert Haas
Date:
Subject: Re: Correct docs re: rewriting indexes when table rewrite is skipped