Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id 20230125014203.GL13860@telsasoft.com
Whole thread Raw
In response to Re: Add LZ4 compression in pg_dump  (gkokolatos@pm.me)
Responses Re: Add LZ4 compression in pg_dump
List pgsql-hackers
On Tue, Jan 24, 2023 at 03:56:20PM +0000, gkokolatos@pm.me wrote:
> On Monday, January 23rd, 2023 at 7:00 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > On Mon, Jan 23, 2023 at 05:31:55PM +0000, gkokolatos@pm.me wrote:
> > 
> > > Please find attached v23 which reintroduces the split.
> > > 
> > > 0001 is reworked to have a reduced footprint than before. Also in an attempt
> > > to facilitate the readability, 0002 splits the API's and the uncompressed
> > > implementation in separate files.
> > 
> > Thanks for updating the patch. Could you address the review comments I
> > sent here ?
> > https://www.postgresql.org/message-id/20230108194524.GA27637%40telsasoft.com
> 
> Please find v24 attached.

Thanks for updating the patch.

In 001, RestoreArchive() does:

> -#ifndef HAVE_LIBZ
> -       if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
> -               AH->PrintTocDataPtr != NULL)
> +       supports_compression = false;
> +       if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
> +               AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
> +               supports_compression = true;
> +
> +       if (AH->PrintTocDataPtr != NULL)
>         {
>                 for (te = AH->toc->next; te != AH->toc; te = te->next)
>                 {
>                         if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
> -                               pg_fatal("cannot restore from compressed archive (compression not supported in this
installation)");
> +                       {
> +#ifndef HAVE_LIBZ
> +                               if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
> +                                       supports_compression = false;
> +#endif
> +                               if (supports_compression == false)
> +                                       pg_fatal("cannot restore from compressed archive (compression not supported
inthis installation)");
 
> +                       }
>                 }
>         }
> -#endif

This first checks if the algorithm is implemented, and then checks if
the algorithm is supported by the current build - that confused me for a
bit.  It seems unnecessary to check for unimplemented algorithms before
looping.  That also requires referencing both GZIP and LZ4 in two
places.

I think it could be written to avoid the need to change for added
compression algorithms:

+                       if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
+                       {
+                               /* Check if the compression algorithm is supported */
+                               pg_compress_specification spec;
+                               parse_compress_specification(AH->compression_spec.algorithm, NULL, &spec);
+                               if (spec->parse_error != NULL)
+                                       pg_fatal(spec->parse_error);
+                       }

Or maybe add a new function to compression.c to indicate whether a given
algorithm is supported.

That would also indicate *which* compression library isn't supported.

Other than that, I think 001 is ready.

002/003 use these names, which I think are too similar - initially I
didn't even realize there were two separate functions (each with a
second stub function to handle the case of unsupported compression):

+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec);
                                                                                                    
 
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
                                                                                                        
 

+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
                                                                                                          
 
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
                                                                                                          
 

typo:
s/not build with/not built with/

Should AllocateCompressor() set cs->compression_spec, rather than doing
it in each compressor ?

Thanks for considering.

-- 
Justin



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Improve logging when using Huge Pages
Next
From: Masahiko Sawada
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum