Re: refactoring basebackup.c - Mailing list pgsql-hackers

From Robert Haas
Subject Re: refactoring basebackup.c
Date
Msg-id CA+TgmoYt_v4RUMCQK8dXP7yFezaecFjZw9XDq1i+F=6YnnFxGg@mail.gmail.com
Whole thread Raw
In response to Re: refactoring basebackup.c  (Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>)
Responses Re: refactoring basebackup.c
List pgsql-hackers
On Mon, Sep 13, 2021 at 6:03 AM Jeevan Ladhe
<jeevan.ladhe@enterprisedb.com> wrote:
>> + /*
>> + * If we do not have enough space left in the output buffer for this
>> + * chunk to be written, first archive the already written contents.
>> + */
>> + if (nextChunkLen > mysink->base.bbs_next->bbs_buffer_length -
>> mysink->bytes_written ||
>> + mysink->bytes_written >= mysink->base.bbs_next->bbs_buffer_length)
>> + {
>> + bbsink_archive_contents(sink->bbs_next, mysink->bytes_written);
>> + mysink->bytes_written = 0;
>> + }
>>
>> I think this is flat-out wrong. It assumes that the compressor will
>> never generate more than N bytes of output given N bytes of input,
>> which is not true. Not sure there's much point in fixing it now
>> because with the changes described above this code will have to change
>> anyway, but I think it's just lucky that this has worked for you in
>> your testing.
>
> I see your point. But for it to be accurate, I think we need to then
> considered the return value of LZ4F_compressBound() to check if that
> many bytes are available. But, as explained earlier our output buffer is
> already way smaller than that.

Well, in your last version of the patch, you kind of had two output
buffers: a bigger one that you use internally and then the "official"
one which is associated with the next sink. With my latest patch set
you should be able to make that go away by just arranging for the next
sink's buffer to be as big as you need it to be. But, if we were going
to stick with using an extra buffer, then the solution would not be to
do this, but to copy the internal buffer to the official buffer in
multiple chunks if needed. So don't bother doing this here but just
wait and see how much data you get and then chunk it to the next
sink's buffer, calling bbsink_archive_contents() multiple times if
required. That would be annoying and expensive so I'm glad we're not
doing it that way, but it could be done correctly.

>> + /*
>> + * LZ4F_compressUpdate() returns the number of bytes written into output
>> + * buffer. We need to keep track of how many bytes have been cumulatively
>> + * written into the output buffer(bytes_written). But,
>> + * LZ4F_compressUpdate() returns 0 in case the data is buffered and not
>> + * written to output buffer, set autoFlush to 1 to force the writing to the
>> + * output buffer.
>> + */
>> + prefs->autoFlush = 1;
>>
>> I don't see why this should be necessary. Elsewhere you have code that
>> caters to bytes being stuck inside LZ4's buffer, so why do we also
>> require this?
>
> This is needed to know the actual bytes written in the output buffer. If it is
> set to 0, then LZ4F_compressUpdate() would randomly return 0 or actual
> bytes are written to the output buffer, depending on whether it has buffered
> or really flushed data to the output buffer.

The problem is that if we autoflush, I think it will cause the
compression ratio to be less good. Try un-lz4ing a file that is
produced this way and then re-lz4 it and compare the size of the
re-lz4'd file to the original one. Compressors rely on postponing
decisions about how to compress until they've seen as much of the
input as possible, and flushing forces them to decide earlier, and
maybe making a decision that isn't as good as it could have been. So I
believe we should look for a way of avoiding this. Now I realize
there's a problem there with doing that and also making sure the
output buffer is large enough, and I'm not quite sure how we solve
that problem, but there is probably a way to do it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Estimating HugePages Requirements?
Next
From: Robert Haas
Date:
Subject: Re: refactoring basebackup.c