Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Add LZ4 compression in pg_dump
Date
Msg-id fc130e92-09ea-c5b0-a1a0-29f93c323371@enterprisedb.com
Whole thread Raw
In response to Re: Add LZ4 compression in pg_dump  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: Add LZ4 compression in pg_dump
Re: Add LZ4 compression in pg_dump
List pgsql-hackers

On 3/16/23 01:20, Justin Pryzby wrote:
> On Mon, Mar 13, 2023 at 10:47:12PM +0100, Tomas Vondra wrote:
>>> Rearrange functions to their original order allowing a cleaner diff to the prior code;
>>
>> OK. I wasn't very enthusiastic about this initially, but after thinking
>> about it a bit I think it's meaningful to make diffs clearer. But I
>> don't see much difference with/without the patch. The
>>
>> git diff --diff-algorithm=minimal -w e9960732a~:src/bin/pg_dump/compress_io.c src/bin/pg_dump/compress_gzip.c
>>
>> Produces ~25k diff with/without the patch. What am I doing wrong?
> 
> Do you mean 25 kB of diff ?

Yes, if you redirect the git-diff to a file, it's a 25kB file.

> I agree that the statistics of the diff output don't change a lot:
> 
>   1 file changed, 201 insertions(+), 570 deletions(-)
>   1 file changed, 198 insertions(+), 548 deletions(-)
> 
> But try reading the diff while looking for the cause of a bug.  It's the
> difference between reading 50, two-line changes, and reading a hunk that
> replaces 100 lines with a different 100 lines, with empty/unrelated
> lines randomly thrown in as context.
> 
> When the diff is readable, the pg_fatal() also stands out.
> 

I don't know, maybe I'm doing something wrong or maybe I just am bad at
looking at diffs, but if I apply the patch you submitted on 8/3 and do
the git-diff above (output attached), it seems pretty incomprehensible
to me :-( I don't see 50 two-line changes (I certainly wouldn't be able
to identify the root cause of the bug based on that).

>>> Change pg_fatal() to an assertion+comment;
>>
>> Yeah, that's reasonable. I'd even ditch the assert/comment, TBH. We
>> could add such protections against "impossible" stuff to a zillion other
>> places and the confusion likely outweighs the benefits.
>>
>>> Update the commit message and fix a few typos;
>>
>> Thanks. I don't want to annoy you too much, but could you split the
>> patch into the "empty-data" fix and all the other changes (rearranging
>> functions etc.)? I'd rather not mix those in the same commit.
> 
> I don't know if that makes sense?  The "empty-data" fix creates a new
> function called DeflateCompressorInit().  My proposal was to add the new
> function in the same place in the file as it used to be.
> 

Got it. In that case I agree it's fine to do that in a single commit.

> The patch also moves the pg_fatal() that's being removed.  I don't think
> it's going to look any cleaner to read a history involving the
> pg_fatal() first being added, then moved, then removed.  Anyway, I'll
> wait while the community continues discussion about the pg_fatal().
> 

I think the agreement was to replace the pg_fatal with and assert, and I
see your patch already does that.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: A problem about ParamPathInfo for an AppendPath
Next
From: Justin Pryzby
Date:
Subject: Re: Add LZ4 compression in pg_dump