Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From gkokolatos@pm.me
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id f4qyEM_TZClMk3_dC2Joc7KELxdu0j-otmT-AHJy25vpCkkXp5aGuZ08TwabgHTieiIqYwgtnBKe7muIjHIwHKn4ZimuKc62rualr8FJ9gg=@pm.me
Whole thread Raw
In response to Re: Add LZ4 compression in pg_dump  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Add LZ4 compression in pg_dump  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers




------- Original Message -------
On Wednesday, March 29th, 2023 at 12:02 AM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:


>
>
> On 3/28/23 18:07, gkokolatos@pm.me wrote:
>
> > ------- Original Message -------
> > On Friday, March 24th, 2023 at 10:30 AM, gkokolatos@pm.me gkokolatos@pm.me wrote:
> >
> > > ------- Original Message -------
> > > On Thursday, March 23rd, 2023 at 6:10 PM, Tomas Vondra tomas.vondra@enterprisedb.com wrote:
> > >
> > > > This leaves the empty-data issue (which we have a fix for) and the
> > > > switch to LZ4F. And then the zstd part.
> > >
> > > Please expect promptly a patch for the switch to frames.
> >
> > Please find the expected patch attached. Note that the bulk of the
> > patch is code unification, variable renaming to something more
> > appropriate, and comment addition. These are changes that are not
> > strictly necessary to switch to LZ4F. I do believe that are
> > essential for code hygiene after the switch and they do belong
> > on the same commit.
>
>
> I think the patch is fine, but I'm wondering if the renames shouldn't go
> a bit further. It removes references to LZ4File struct, but there's a
> bunch of functions with LZ4File_ prefix. Why not to simply use LZ4_
> prefix? We don't have GzipFile either.
>
> Sure, it might be a bit confusing because lz4.h uses LZ4_ prefix, but
> then we probably should not define LZ4_compressor_init ...

This is a good point. The initial thought was that since lz4.h is now
removed, such ambiguity will not be present. In v2 of the patch the
function is renamed to `LZ4State_compression_init` since this name
describes better its purpose. It initializes the LZ4State for
compression.

As for the LZ4File_ prefix, I have no objections. Please find the
prefix changed to LZ4Stream_. For the record, the word 'File' is not
unique to the lz4 implementation. The common data structure used by
the API in compress_io.h:

   typedef struct CompressFileHandle CompressFileHandle;

The public functions for this API are named:

  InitCompressFileHandle
  InitDiscoverCompressFileHandle
  EndCompressFileHandle

And within InitCompressFileHandle the pattern is:

    if (compression_spec.algorithm == PG_COMPRESSION_NONE)
        InitCompressFileHandleNone(CFH, compression_spec);
    else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
        InitCompressFileHandleGzip(CFH, compression_spec);
    else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
        InitCompressFileHandleLZ4(CFH, compression_spec);

It was felt that a prefix was required due to the inclusion 'lz4.h'
header where naming functions as 'LZ4_' would be wrong. The prefix
'LZ4File_' seemed to be in line with the naming of the rest of
the relevant functions and structures. Other compressions, gzip and
none, did not face the same issue.

To conclude, I think that having a prefix is slightly preferred
over not having one. Since the prefix `LZ4File_` is not desired,
I propose `LZ4Stream_` in v2.

I will not object to dismissing the argument and drop `File` from
the prefix, if so requested.

>
> Also, maybe the comments shouldn't use "File API" when compress_io.c
> calls that "Compressed stream API".

Done.

Cheers,
//Georgios

>
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert
Next
From: Amit Kapila
Date:
Subject: Re: PGdoc: add ID attribute to create_publication.sgml