Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From gkokolatos@pm.me
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id qSzzmYsXqUf1gDvFspOEbqoE8U_F8IsphDG8c25VZ5KTXimQVYfLvnCcuU7_X2raa2SzZinDF0NpDIKSOALRUYP8pu_NL5QJwIDdR_RT-4U=@pm.me
Whole thread Raw
In response to Re: Add LZ4 compression in pg_dump  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Add LZ4 compression in pg_dump
List pgsql-hackers




------- Original Message -------
On Sunday, February 26th, 2023 at 3:59 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:


>
>
>
>
> On 2/25/23 06:02, Justin Pryzby wrote:
>
> > I have some fixes (attached) and questions while polishing the patch for
> > zstd compression. The fixes are small and could be integrated with the
> > patch for zstd, but could be applied independently.
> >
> > - I'm unclear about get_error_func(). That's called in three places
> > from pg_backup_directory.c, after failures from write_func(), to
> > supply an compression-specific error message to pg_fatal(). But it's
> > not being used outside of directory format, nor for errors for other
> > function pointers, or even for all errors in write_func(). Is there
> > some reason why each compression method's write_func() shouldn't call
> > pg_fatal() directly, with its compression-specific message ?
>
>
> I think there are a couple more places that might/should call
> get_error_func(). For example ahwrite() in pg_backup_archiver.c now
> simply does
>
> if (bytes_written != size * nmemb)
> WRITE_ERROR_EXIT;
>
> but perhaps it should call get_error_func() too. There are probably
> other places that call write_func() and should use get_error_func().

Agreed, calling get_error_func() would be preferable to a fatal error. It
should be the caller of the api who decides how to proceed.

>
> > - I still think supports_compression() should be renamed, or made into a
> > static function in the necessary file. The main reason is that it's
> > more clear what it indicates - whether compression is "implemented by
> > pgdump" and not whether compression is "supported by this postgres
> > build". It also seems possible that we'd want to add a function
> > called something like supports_compression(), indicating whether the
> > algorithm is supported by the current build. It'd be better if pgdump
> > didn't subjugate that name.
>
>
> If we choose to rename this to have pgdump_ prefix, fine with me. But I
> don't think there's a realistic chance of conflict, as it's restricted
> to pgdump header etc. And it's not part of an API, so I guess we could
> rename that in the future if needed.

Agreed, it is very unrealistic that one will include that header file anywhere
but within pg_dump. Also. I think that adding a prefix, "pgdump", "pg_dump",
or similar does not add value and subtracts readability.

>
> > - Finally, the "Nothing to do in the default case" comment comes from
> > Michael's commit 5e73a6048:
> >
> > + /*
> > + * Custom and directory formats are compressed by default with gzip when
> > + * available, not the others.
> > + /
> > + if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
> > + !user_compression_defined)
> > {
> > #ifdef HAVE_LIBZ
> > - if (archiveFormat == archCustom || archiveFormat == archDirectory)
> > - compressLevel = Z_DEFAULT_COMPRESSION;
> > - else
> > + parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
> > + &compression_spec);
> > +#else
> > + / Nothing to do in the default case */
> > #endif
> > - compressLevel = 0;
> > }
> >
> > As the comment says: for -Fc and -Fd, the compression is set to zlib, if
> > enabled, and when not otherwise specified by the user.
> >
> > Before 5e73a6048, this set compressLevel=0 for -Fp and -Ft, and when
> > zlib was unavailable.
> >
> > But I'm not sure why there's now an empty "#else". I also don't know
> > what "the default case" refers to.
> >
> > Maybe the best thing here is to move the preprocessor #if, since it's no
> > longer in the middle of a runtime conditional:
> >
> > #ifdef HAVE_LIBZ
> > + if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
> > + !user_compression_defined)
> > + parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
> > + &compression_spec);
> > #endif
> >
> > ...but that elicits a warning about "variable set but not used"...
>
>
> Not sure, I need to think about this a bit.

Not having warnings is preferable, isn't it? I can understand the confusion
on the message though. Maybe a phrasing like:
/* Nothing to do for the default case when LIBZ is not available */
is easier to understand.

Cheers,
//Georgios

>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Melanie Plageman
Date:
Subject: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Next
From: Jelte Fennema
Date:
Subject: Re: libpq: PQgetCopyData() and allocation overhead