Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id 20230116015625.GH9837@telsasoft.com
Whole thread Raw
In response to Re: Add LZ4 compression in pg_dump  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Add LZ4 compression in pg_dump
Re: Add LZ4 compression in pg_dump
List pgsql-hackers
On Mon, Jan 16, 2023 at 10:28:50AM +0900, Michael Paquier wrote:
> On Sat, Jan 14, 2023 at 03:43:09PM -0600, Justin Pryzby wrote:
> > On Sun, Jan 08, 2023 at 01:45:25PM -0600, Justin Pryzby wrote:
> >> pg_compress_specification is being passed by value, but I think it
> >> should be passed as a pointer, as is done everywhere else.
> > 
> > ISTM that was an issue with 5e73a6048, affecting a few public and
> > private functions.  I wrote a pre-preparatory patch which changes to
> > pass by reference.
> 
> The functions changed by 0001 are cfopen[_write](),
> AllocateCompressor() and ReadDataFromArchive().  Why is it a good idea
> to change these interfaces which basically exist to handle inputs?

I changed to pass pg_compress_specification as a pointer, since that's
the usual convention for structs, as followed by the existing uses of
pg_compress_specification.

> Is there some benefit in changing compression_spec within the
> internals of these routines before going back one layer down to their
> callers?  Changing the compression_spec on-the-fly in these internal
> paths could be risky, actually, no?

I think what you're saying is that if the spec is passed as a pointer,
then the called functions shouldn't set spec->algorithm=something.

I agree that if they need to do that, they should use a local variable.
Which looks to be true for the functions that were changed in 001.

> > And addressed a handful of other issues I reported as separate fixup
> > commits.  And changed to use LZ4 by default for CI.
> 
> Are your slight changes shaped as of 0003-f.patch, 0005-f.patch and
> 0007-f.patch on top of the original patches sent by Georgios?

Yes, the original patches, rebased as needed on top of HEAD and 001...

> >> pg_compress_algorithm is being writen directly into the pg_dump header.
> 
> Do you mean that this is what happens once the patch series 0001~0008
> sent upthread is applied on HEAD?

Yes

> -   /*
> -    * For now the compression type is implied by the level.  This will need
> -    * to change once support for more compression algorithms is added,
> -    * requiring a format bump.
> -    */
> -   WriteInt(AH, AH->compression_spec.level);
> +   AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
> 
> I may be missing something here, but it seems to me that you ought to
> store as well the level in the dump header, or it would not be
> possible to report in the dump's description what was used?  Hence,
> K_VERS_1_15 should imply that we have both the method compression and
> the compression level.

Maybe.  But the "level" isn't needed for decompression for any case I'm
aware of.

Also, dumps with the default compression level currently say:
"Compression: -1", which does't seem valuable.

-- 
Justin



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: constify arguments of copy_file() and copydir()
Next
From: gkokolatos@pm.me
Date:
Subject: Re: Add LZ4 compression in pg_dump