Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id cb3d5dbd-988e-7628-a3b2-d3b4a66729d5@enterprisedb.com
Whole thread Raw
In response to Re: Add LZ4 compression in pg_dump  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers

On 3/2/23 18:18, Justin Pryzby wrote:
> On Wed, Mar 01, 2023 at 05:20:05PM +0100, Tomas Vondra wrote:
>> On 2/25/23 15:05, Justin Pryzby wrote:
>>> On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
>>>> I have some fixes (attached) and questions while polishing the patch for
>>>> zstd compression.  The fixes are small and could be integrated with the
>>>> patch for zstd, but could be applied independently.
>>>
>>> One more - WriteDataToArchiveGzip() says:
>>>
>>> +       if (cs->compression_spec.level == 0)
>>> +           pg_fatal("requested to compress the archive yet no level was specified");
>>>
>>> That was added at e9960732a.  
>>>
>>> But if you specify gzip:0, the compression level is already enforced by
>>> validate_compress_specification(), before hitting gzip.c:
>>>
>>> | pg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level
between1 and 9 (default at -1)
 
>>>
>>> 5e73a6048 intended that to work as before, and you *can* specify -Z0:
>>>
>>>     The change is backward-compatible, hence specifying only an integer
>>>     leads to no compression for a level of 0 and gzip compression when the
>>>     level is greater than 0.
>>>
>>>     $ time ./src/bin/pg_dump/pg_dump -h /tmp regression -t int8_tbl -Fp --compress 0 |file -
>>>     /dev/stdin: ASCII text
>>
>> FWIW I agree we should make this backwards-compatible - accept "0" and
>> treat it as no compression.
>>
>> Georgios, can you prepare a patch doing that?
> 
> I think maybe Tomas misunderstood.  What I was trying to say is that -Z
> 0 *is* accepted to mean no compression.  This part wasn't quoted, but I
> said:
> 

Ah, I see. Well, I also tried but with "-Z gzip:0" (and not -Z 0), and
that does fail:

  error: invalid compression specification: compression algorithm "gzip"
  expects a compression level between 1 and 9 (default at -1)

It's a bit weird these two cases behave differently, when both translate
to the same default compression method (gzip).

>> Right now, I think that pg_fatal in gzip.c is dead code - that was first
>> added in the patch version sent on 21 Dec 2022.
> 
> If you run the diff command that I've been talking about, you'll see
> that InitCompressorZlib was almost unchanged - e9960732 is essentially a
> refactoring.  I don't think it's desirable to add a pg_fatal() in a
> function that's otherwise nearly-unchanged.  The fact that it's
> nearly-unchanged is a good thing: it simplifies reading of what changed.
> If someone wants to add a pg_fatal() in that code path, it'd be better
> done in its own commit, with a separate message explaining the change.
> 
> If you insist on changing anything here, you might add an assertion (as
> you said earlier) along with a comment like
> /* -Z 0 uses the "None" compressor rather than zlib with no compression */
> 

Yeah, a comment would be helpful.

Also, after thinking about it a bit more maybe having the unreachable
pg_fatal() is not a good thing, as it will just confuse people (I'd
certainly assume having such check means there's a way in which case it
might trigger.). Maybe an assert would be better?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Pavel Borisov
Date:
Subject: Re: POC: Lock updated tuples in tuple_update() and tuple_delete()
Next
From: Ankit Kumar Pandey
Date:
Subject: Re: Sort optimizations: Making in-memory sort cache-aware