Re: refactoring basebackup.c - Mailing list pgsql-hackers

From Jeevan Ladhe
Subject Re: refactoring basebackup.c
Date
Msg-id CAOgcT0OBD+GU4ZtY3wzzQng8ffjTP_WhXGnfUzrSN4vxDnzxmw@mail.gmail.com
Whole thread Raw
In response to Re: refactoring basebackup.c  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi Robert,

I mean #define CHUNK_SIZE is still in the patch.

Oops, removed now. 
 
> > With just that change, you can set
> > next_buf_len = LZ4F_HEADER_SIZE_MAX + mysink->output_buffer_bound --
> > but that's also more than you need. You can instead do next_buf_len =
> > Min(LZ4F_HEADER_SIZE_MAX, mysink->output_buffer_bound). Now, you're
> > probably thinking that won't work, because bbsink_lz4_begin_archive()
> > could fill up the buffer partway, and then the first call to
> > bbsink_lz4_archive_contents() could overrun it. But that problem can
> > be solved by reversing the order of operations in
> > bbsink_lz4_archive_contents(): before you call LZ4F_compressUpdate(),
> > test whether you need to empty the buffer first, and if so, do it.
>
> I am still not able to get - how can we survive with a mere
> size of Min(LZ4F_HEADER_SIZE_MAX, mysink->output_buffer_bound).
> LZ4F_HEADER_SIZE_MAX is defined as 19 in lz4 library. With this
> proposal, it is almost guaranteed that the next buffer length will
> be always set to 19, which will result in failure of a call to
> LZ4F_compressUpdate() with the error LZ4F_ERROR_dstMaxSize_tooSmall,
> even if we had called bbsink_archive_contents() before.

Sorry, should have been Max(), not Min().

Ahh ok.
I looked into the code of LZ4F_compressBound() and the header size is
already included in the calculations, so when we compare
LZ4F_HEADER_SIZE_MAX and mysink->output_buffer_bound, the latter
will be always greater, and hence sufficient length for the output buffer. I
made this change in the attached patch, and also cleared the buffer by
calling bbsink_archive_contents() before calling LZ4_compressUpdate().
Similarly cleared the buffer before calling LZ4F_compressEnd().

> You mean the way gzip allows us to use our own alloc and free functions
> by means of providing the function pointers for them. Unfortunately,
> no, LZ4 does not have that kind of provision. Maybe that makes a
> good proposal for LZ4 library ;-).
> I cannot think of another solution to it right away.

OK. Will give it some thought.

I have started a thread[1] on LZ4 community for this, but so far no
reply on that.

Regards,
Jeevan Ladhe

Attachment

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Postgres perl module namespace
Next
From: Ronan Dunklau
Date:
Subject: Re: pg_receivewal starting position