Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From gkokolatos@pm.me
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id yTrMijGhpyxYVVeJuJPM5NandUq_5Qq647cqNuUpVkXSfbYPDj0VLfRK72jB2HWoeDgRIYIbjG7YbBrUuzv8fRF9Y5vALIrM9BGIgQLh8vM=@pm.me
Whole thread Raw
In response to Re: Add LZ4 compression in pg_dump  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Add LZ4 compression in pg_dump  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
------- Original Message -------

On Tuesday, March 29th, 2022 at 9:27 AM, Michael Paquier <michael@paquier.xyz> wrote:

> On Sat, Mar 26, 2022 at 01:14:41AM -0500, Justin Pryzby wrote:
> > See 0001 and the manpage.
> > + 'pg_dump: compression is not supported by tar archive format');
> > When I submitted a patch to support zstd, I spent awhile trying to make
> > compression work with tar, but it's a significant effort and better done
> > separately.
>
> Wow. This stuff is old enough to vote (c3e18804), dead since its
> introduction. There is indeed an argument for removing that, it is
> not good to keep around that that has never been stressed and/or
> used. Upon review, the cleanup done looks correct, as we have never
> been able to generate .dat.gz files in for a dump in the tar format.

Correct. My driving force behind it was to ease up the cleanup/refactoring
work that follows, by eliminating the callers of the GZ*() macros.

> + command_fails_like(
>
> + [ 'pg_dump', '--compress', '1', '--format', 'tar' ],
> This addition depending on HAVE_LIBZ is a good thing as a reminder of
> any work that could be done in 0002. Now that's waiting for 20 years
> so I would not hold my breath on this support. I think that this
> could be just applied first, with 0002 on top of it, as a first
> improvement.

Excellent, thank you.

> + compress_cmd => [
> + $ENV{'GZIP_PROGRAM'},
> Patch 0001 is missing and update of pg_dump's Makefile to pass down
> this environment variable to the test scripts, no?

Agreed. It was not properly moved forward. Fixed.

> + compress_cmd => [
> + $ENV{'GZIP_PROGRAM'},
> + '-f',
> [...]
> + $ENV{'GZIP_PROGRAM'},
> + '-k', '-d',
> -f and -d are available everywhere I looked at, but is -k/--keep a
> portable choice with a gzip command? I don't see this option in
> OpenBSD, for one. So this test is going to cause problems on those
> buildfarm machines, at least. Couldn't this part be replaced by a
> simple --test to check that what has been compressed is in correct
> shape? We know that this works, based on our recent experiences with
> the other tests.

I would argue that the simple '--test' will not do in this case, as the
TAP tests do need a file named <test>.sql to compare the contents with.
This file is generated either directly by pg_dump itself, or by running
pg_restore on pg_dump's output. In the case of compression pg_dump will
generate a <test>.sql.<compression program suffix> which can not be
used in the comparison tests. So the intention of this block, is not to
simply test for validity, but to also decompress pg_dump's output for it
to be able to be used.

I updated the patch to simply remove the '-k' flag.

Please find v3 attached. (only 0001 and 0002 are relevant, 0003 and
0004 are only for reference and are currently under active modification).

Cheers,
//Georgios

Attachment

pgsql-hackers by date:

Previous
From: Nikita Malakhov
Date:
Subject: Generic JSON API
Next
From: Amit Kapila
Date:
Subject: Re: Column Filtering in Logical Replication