Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id c1493de3-7d97-c3c9-dd39-5a781e10a7b8@enterprisedb.com
Whole thread Raw
In response to Re: Add LZ4 compression in pg_dump  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: Add LZ4 compression in pg_dump  (gkokolatos@pm.me)
Re: Add LZ4 compression in pg_dump  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
Hi Justin,

Thanks for the patch.

On 3/8/23 02:45, Justin Pryzby wrote:
> On Wed, Mar 01, 2023 at 04:52:49PM +0100, Tomas Vondra wrote:
>> Thanks. That seems correct to me, but I find it somewhat confusing,
>> because we now have
>>
>>  DeflateCompressorInit vs. InitCompressorGzip
>>
>>  DeflateCompressorEnd vs. EndCompressorGzip
>>
>>  DeflateCompressorData - The name doesn't really say what it does (would
>>                          be better to have a verb in there, I think).
>>
>> I wonder if we can make this somehow clearer?
> 
> To move things along, I updated Georgios' patch:
> 
> Rename DeflateCompressorData() to DeflateCompressorCommon();

Hmmm, I don't find "common" any clearer than "data" :-( There needs to
at least be a comment explaining what "common" does.

> Rearrange functions to their original order allowing a cleaner diff to the prior code;

OK. I wasn't very enthusiastic about this initially, but after thinking
about it a bit I think it's meaningful to make diffs clearer. But I
don't see much difference with/without the patch. The

git diff --diff-algorithm=minimal -w
e9960732a~:src/bin/pg_dump/compress_io.c src/bin/pg_dump/compress_gzip.c

Produces ~25k diff with/without the patch. What am I doing wrong?

> Change pg_fatal() to an assertion+comment;

Yeah, that's reasonable. I'd even ditch the assert/comment, TBH. We
could add such protections against "impossible" stuff to a zillion other
places and the confusion likely outweighs the benefits.

> Update the commit message and fix a few typos;
> 

Thanks. I don't want to annoy you too much, but could you split the
patch into the "empty-data" fix and all the other changes (rearranging
functions etc.)? I'd rather not mix those in the same commit.


>> Also, InitCompressorGzip says this:
>>
>>    /*
>>     * If the caller has defined a write function, prepare the necessary
>>     * state. Avoid initializing during the first write call, because End
>>     * may be called without ever writing any data.
>>     */
>>     if (cs->writeF)
>>         DeflateCompressorInit(cs);
>>
>> Does it actually make sense to not have writeF defined in some cases?
> 
> InitCompressor is being called for either reading or writing, either of
> which could be null:
> 
> src/bin/pg_dump/pg_backup_custom.c:     ctx->cs = AllocateCompressor(AH->compression_spec,
> src/bin/pg_dump/pg_backup_custom.c-                                                              NULL,
> src/bin/pg_dump/pg_backup_custom.c-                                                              _CustomWriteFunc);
> --
> src/bin/pg_dump/pg_backup_custom.c:     cs = AllocateCompressor(AH->compression_spec,
> src/bin/pg_dump/pg_backup_custom.c-                                                     _CustomReadFunc, NULL);
> 
> It's confusing that the comment says "Avoid initializing...".  What it
> really means is "Initialize eagerly...".  But that makes more sense in
> the context of the commit message for this bugfix than in a comment.  So
> I changed that too.
> 
> +       /* If deflation was initialized, finalize it */
                      
 
> +       if (cs->private_data)
                      
 
> +               DeflateCompressorEnd(AH, cs);
                      
 
> 
> Maybe it'd be more clear if this used "if (cs->writeF)", like in the
> init function ?
> 

Yeah, if the two checks are equivalent, it'd be better to stick to the
same check everywhere.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: meson: Non-feature feature options
Next
From: "Regina Obe"
Date:
Subject: RE: Ability to reference other extensions by schema in extension scripts