Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id 20230125180020.GF22427@telsasoft.com
Whole thread Raw
In response to Re: Add LZ4 compression in pg_dump  (gkokolatos@pm.me)
Responses Re: Add LZ4 compression in pg_dump
Re: Add LZ4 compression in pg_dump
List pgsql-hackers
On Wed, Jan 25, 2023 at 03:37:12PM +0000, gkokolatos@pm.me wrote:
> Of course, one can throw the error before entering the loop, yet I think
> that it does not help the readability of the code. IMHO it is easier to
> follow if the error is thrown once during that check.

> If anything, I can suggest to throw an error much earlier, i.e. in ReadHead(),
> and remove altogether this check. On the other hand, I like the belts
> and suspenders approach because there are no more checks after this point.

While looking at this, I realized that commit 5e73a6048 introduced a
regression:

@@ -3740,19 +3762,24 @@ ReadHead(ArchiveHandle *AH)

-       if (AH->compression != 0)
-               pg_log_warning("archive is compressed, but this installation does not support compression -- no data
willbe available");
 
+       if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+               pg_fatal("archive is compressed, but this installation does not support compression");

Before, it was possible to restore non-data chunks of a dump file, even
if the current build didn't support its compression.  But that's now
impossible - and it makes the code we're discussing in RestoreArchive()
unreachable.

I don't think we can currently test for that, since it requires creating a dump
using a build --with compression and then trying to restore using a build
--without compression.  The coverage report disagrees with me, though...
https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_archiver.c.gcov.html#3901

> > I think it could be written to avoid the need to change for added
> > compression algorithms:
...
> 
> I am not certain how that would work in the example with ZSTD above.
> If I am not wrong, parse_compress_specification() will not throw an error
> if the codebase supports ZSTD, yet this specific pg_dump binary will not
> support it because ZSTD is not implemented. parse_compress_specification()
> is not aware of that and should not be aware of it, should it?

You're right.

I think the 001 patch should try to remove hardcoded references to
LIBZ/GZIP, such that the later patches don't need to update those same
places for LZ4.  For example in ReadHead() and RestoreArchive(), and
maybe other places dealing with file extensions.  Maybe that could be
done by adding a function specific to pg_dump indicating whether or not
an algorithm is implemented and supported.

-- 
Justin



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: wake up logical workers after ALTER SUBSCRIPTION
Next
From: Nathan Bossart
Date:
Subject: Re: pgsql: Rename contrib module basic_archive to basic_wal_module