Re: refactoring basebackup.c - Mailing list pgsql-hackers

From Robert Haas
Subject Re: refactoring basebackup.c
Msg-id CA+TgmoaDEBWWxvyL0RZq8kJ9qu1fpJjqn=e7YK3SUB5UBnmTVQ@mail.gmail.com
In response to Re: refactoring basebackup.c  (Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>)
List pgsql-hackers
On Wed, Sep 8, 2021 at 2:14 PM Jeevan Ladhe
<jeevan.ladhe@enterprisedb.com> wrote:
> To give an example, I put some logging statements, and I can see in the log:
> "
> bytes remaining in mysink->base.bbs_next->bbs_buffer: 16537
> input size to be compressed: 512
> estimated size for compressed buffer by LZ4F_compressBound(): 262667
> actual compressed size: 16
> "

That is pretty lame. I don't know why it needs a ~256k buffer to
produce 16 bytes of output.

The way the gzip APIs I used work, you tell it how big the output
buffer is and it writes until it fills that buffer, or until the input
buffer is empty, whichever happens first. But this seems to be the
other way around: you tell it how much input you have, and it tells
you how big a buffer it needs. To handle that elegantly, I think I
need to make some changes to the design of the bbsink stuff. What I'm
thinking is that each bbsink somehow tells the next bbsink how big to
make its buffer. So suppose the LZ4 sink is told that its buffer should
be at least, I don't know, say 64kB. Then it can compute how large an
output buffer the LZ4 library requires for 64kB of input. Hopefully we
can assume that liblz4 never needs a smaller output buffer for a larger
input.
Then we can assume that if a 64kB input requires, say, a 300kB output
buffer, every possible input < 64kB also requires an output buffer <=
300 kB.

But we can't just say, well, we were asked to create a 64kB buffer (or
whatever) so let's ask the next bbsink for a 300kB buffer (or
whatever), because then as soon as we write any data at all into it
the remaining buffer space might be insufficient for the next chunk.
So instead what I think we should do is have bbsink_lz4 set the size
of the next sink's buffer to its own buffer size +
LZ4F_compressBound(its own buffer size). So in this example if it's
asked to create a 64kB buffer and LZ4F_compressBound(64kB) = 300kB
then it asks the next sink to set the buffer size to 364kB. Now, that
means that there will always be at least 300 kB available in the
output buffer until we've accumulated a minimum of 64 kB of compressed
data, and then at that point we can flush.

I think this would be relatively clean and would avoid the need for
the double copying that the current design forced you to do. What do
you think?
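The arithmetic can be sketched like this (a minimal simulation, not
PostgreSQL code: compress_bound() is a stand-in for LZ4F_compressBound()
that just returns the 300kB figure from the example above, whereas the
real function depends on the frame preferences):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for LZ4F_compressBound(): worst-case output size for one full
 * input buffer.  Fixed at the 300kB used in the example; the real value
 * comes from liblz4.
 */
static size_t
compress_bound(size_t input_size)
{
	(void) input_size;
	return 300 * 1024;
}

/*
 * Proposed sizing rule: the lz4 sink asks the next sink for a buffer of
 * its own buffer size plus the worst-case compressed size of one full
 * input buffer.  With a 64kB input buffer that is 64kB + 300kB = 364kB.
 */
static size_t
next_sink_buffer_size(size_t own_buffer_size)
{
	return own_buffer_size + compress_bound(own_buffer_size);
}
```

The invariant falls out of the sizes: if we flush as soon as bytes_written
reaches 64kB, then before every compression call bytes_written < 64kB, so
the free space 364kB - bytes_written is always more than 300kB, and the
worst-case output of one chunk is guaranteed to fit.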

+ /*
+  * If we do not have enough space left in the output buffer for this
+  * chunk to be written, first archive the already written contents.
+  */
+ if (nextChunkLen > mysink->base.bbs_next->bbs_buffer_length -
+     mysink->bytes_written ||
+     mysink->bytes_written >= mysink->base.bbs_next->bbs_buffer_length)
+ {
+     bbsink_archive_contents(sink->bbs_next, mysink->bytes_written);
+     mysink->bytes_written = 0;
+ }

I think this is flat-out wrong. It assumes that the compressor will
never generate more than N bytes of output given N bytes of input,
which is not true. Not sure there's much point in fixing it now
because with the changes described above this code will have to change
anyway, but I think it's just lucky that this has worked for you in
your testing.
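For reference, even LZ4's plain block API makes no such promise: its
documented worst case, the LZ4_COMPRESSBOUND() macro in lz4.h, works out
to srcSize + srcSize/255 + 16, which is strictly larger than srcSize for
every input size; and the frame API used here adds header and block
overhead on top of that. A quick check of that formula (the real macro
also returns 0 for inputs above LZ4_MAX_INPUT_SIZE, which this sketch
ignores):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Worst-case compressed size for the LZ4 block API, mirroring the
 * LZ4_COMPRESSBOUND() macro: incompressible input can expand.
 */
static size_t
lz4_compress_bound(size_t input_size)
{
	return input_size + input_size / 255 + 16;
}
```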

+ /*
+  * LZ4F_compressUpdate() returns the number of bytes written into output
+  * buffer. We need to keep track of how many bytes have been cumulatively
+  * written into the output buffer (bytes_written). But,
+  * LZ4F_compressUpdate() returns 0 in case the data is buffered and not
+  * written to output buffer, set autoFlush to 1 to force the writing to the
+  * output buffer.
+  */
+ prefs->autoFlush = 1;

I don't see why this should be necessary. Elsewhere you have code that
caters to bytes being stuck inside LZ4's buffer, so why do we also
require this?

Thanks for researching this!

-- 
Robert Haas
EDB: http://www.enterprisedb.com


