Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id 15b255d0-8868-a020-f6a9-7324f82a28b6@enterprisedb.com
Whole thread Raw
In response to Re: Add LZ4 compression in pg_dump  (gkokolatos@pm.me)
Responses Re: Add LZ4 compression in pg_dump
List pgsql-hackers
On 1/25/23 16:37, gkokolatos@pm.me wrote:
> 
> 
> 
> 
> 
> ------- Original Message -------
> On Wednesday, January 25th, 2023 at 2:42 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> 
> 
>>
>>
>> On Tue, Jan 24, 2023 at 03:56:20PM +0000, gkokolatos@pm.me wrote:
>>
>>> On Monday, January 23rd, 2023 at 7:00 PM, Justin Pryzby pryzby@telsasoft.com wrote:
>>>
>>>> On Mon, Jan 23, 2023 at 05:31:55PM +0000, gkokolatos@pm.me wrote:
>>>>
>>>>> Please find attached v23 which reintroduces the split.
>>>>>
>>>>> 0001 is reworked to have a reduced footprint than before. Also in an attempt
>>>>> to facilitate the readability, 0002 splits the API's and the uncompressed
>>>>> implementation in separate files.
>>>>
>>>> Thanks for updating the patch. Could you address the review comments I
>>>> sent here ?
>>>> https://www.postgresql.org/message-id/20230108194524.GA27637%40telsasoft.com
>>>
>>> Please find v24 attached.
>>
>>
>> Thanks for updating the patch.
>>
>> In 001, RestoreArchive() does:
>>
>>> -#ifndef HAVE_LIBZ
>>> - if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
>>> - AH->PrintTocDataPtr != NULL)
>>> + supports_compression = false;
>>> + if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
>>> + AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
>>> + supports_compression = true;
>>> +
>>> + if (AH->PrintTocDataPtr != NULL)
>>> {
>>> for (te = AH->toc->next; te != AH->toc; te = te->next)
>>> {
>>> if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
>>> - pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
>>> + {
>>> +#ifndef HAVE_LIBZ
>>> + if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
>>> + supports_compression = false;
>>> +#endif
>>> + if (supports_compression == false)
>>> + pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
>>> + }
>>> }
>>> }
>>> -#endif
>>
>>
>> This first checks if the algorithm is implemented, and then checks if
>> the algorithm is supported by the current build - that confused me for a
>> bit. It seems unnecessary to check for unimplemented algorithms before
>> looping. That also requires referencing both GZIP and LZ4 in two
>> places.
> 
> I am not certain that it is unnecessary, at least not in the way that is
> described. The idea is that new compression methods can be added, without
> changing the archive's version number. It is very possible that it is
> requested to restore an archive compressed with a method not implemented
> in the current binary. The first check takes care of that and sets
> supports_compression only for the supported versions. It is possible to
> enter the loop with supports_compression already set to false, for example
> because the archive was compressed with ZSTD, triggering the fatal error.
> 
> Of course, one can throw the error before entering the loop, yet I think
> that it does not help the readability of the code. IMHO it is easier to
> follow if the error is thrown once during that check.
> 

Actually, I don't understand why 0001 moves the check into the loop. I
mean, why not check HAVE_LIBZ before the loop?

>>
>> I think it could be written to avoid the need to change for added
>> compression algorithms:
>>
>> + if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
>>
>> + {
>> + /* Check if the compression algorithm is supported */
>> + pg_compress_specification spec;
>> + parse_compress_specification(AH->compression_spec.algorithm, NULL, &spec);
>>
>> + if (spec->parse_error != NULL)
>>
>> + pg_fatal(spec->parse_error);
>>
>> + }
> 
> I am not certain how that would work in the example with ZSTD above.
> If I am not wrong, parse_compress_specification() will not throw an error
> if the codebase supports ZSTD, yet this specific pg_dump binary will not
> support it because ZSTD is not implemented. parse_compress_specification()
> is not aware of that and should not be aware of it, should it?
> 

Not sure. What happens in a similar situation now? That is, when trying
to deal with an archive gzip-compressed in a build without libz?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: CREATE ROLE bug?
Next
From: Bruce Momjian
Date:
Subject: Re: CREATE ROLE bug?