Re: Teach pg_receivewal to use lz4 compression - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Teach pg_receivewal to use lz4 compression
Msg-id YXvdOb6yo/xxfVce@paquier.xyz
In response to Re: Teach pg_receivewal to use lz4 compression  (gkokolatos@pm.me)
Responses Re: Teach pg_receivewal to use lz4 compression
List pgsql-hackers
On Fri, Oct 29, 2021 at 09:45:41AM +0000, gkokolatos@pm.me wrote:
> On Saturday, September 18th, 2021 at 8:18 AM, Michael Paquier <michael@paquier.xyz> wrote:
>> We don't really care about contentSize as long as a segment is not
>> completed.  Rather than filling contentSize all the time we write
>> something, we'd better update frameInfo once the segment is
>> completed and closed.  That would also take care of the error as this
>> is not checked if contentSize is 0.  It seems to me that we should
>> fill in the information when doing a CLOSE_NORMAL.
>
> Thank you for the comment. I think that the opposite should be done. At the time
> that the file is closed, the header is already written to disk. We have no way
> to know that is not. If we need to go back to refill the information, we will
> have to ask for the API to produce a new header. There is little guarantee that
> the header size will be the same and as a consequence we will have to shift
> the actual data around.

Why would the header size change between the moment the segment is
begun and the moment it is finished?  We could store it in memory and
write it again when the segment is closed instead, even if that means
an fseek() back to the beginning of the file once the segment is
completed.  Storing WalSegSz from the moment a segment is opened makes
the code more fragile against SIGINTs and the like, so this does not
fix the problem I mentioned previously :/
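
To make the idea concrete, here is a minimal sketch of the "write a
placeholder, then fseek() back and rewrite on a normal close" pattern.
It deliberately does not use the LZ4 frame API: the 12-byte header
layout, the "SEG0" magic and every function name below are
hypothetical, and the sketch simply assumes a header whose size never
changes, which is exactly the point under discussion for LZ4 frames.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size header: 4-byte magic + 8-byte content size. */
#define HDR_SIZE 12

/* Serialize the header at the current file position (little-endian size). */
static int
write_header(FILE *fp, uint64_t content_size)
{
	unsigned char buf[HDR_SIZE];

	memcpy(buf, "SEG0", 4);
	for (int i = 0; i < 8; i++)
		buf[4 + i] = (unsigned char) (content_size >> (8 * i));
	return fwrite(buf, 1, HDR_SIZE, fp) == HDR_SIZE ? 0 : -1;
}

/* Read the content size back from the start of the file. */
static int
read_content_size(FILE *fp, uint64_t *content_size)
{
	unsigned char buf[HDR_SIZE];

	if (fseek(fp, 0L, SEEK_SET) != 0 ||
		fread(buf, 1, HDR_SIZE, fp) != HDR_SIZE ||
		memcmp(buf, "SEG0", 4) != 0)
		return -1;
	*content_size = 0;
	for (int i = 0; i < 8; i++)
		*content_size |= (uint64_t) buf[4 + i] << (8 * i);
	return 0;
}

/* Opening a segment: the size is unknown, so store 0 as a placeholder. */
static int
segment_open(FILE *fp)
{
	return write_header(fp, 0);
}

/* Normal close: seek back and overwrite the header with the real size. */
static int
segment_close_normal(FILE *fp, uint64_t bytes_written)
{
	if (fflush(fp) != 0 || fseek(fp, 0L, SEEK_SET) != 0)
		return -1;
	if (write_header(fp, bytes_written) != 0)
		return -1;
	return fflush(fp);
}

/*
 * Write a payload into a temporary file and return the size recorded in
 * the header, or -1 on error.  With complete = 0 the close step is
 * skipped, as if the process had been interrupted mid-segment.
 */
static int64_t
demo_roundtrip(const char *payload, int complete)
{
	FILE	   *fp = tmpfile();
	size_t		len = strlen(payload);
	uint64_t	stored = 0;
	int			ok;

	if (fp == NULL)
		return -1;
	ok = segment_open(fp) == 0 &&
		fwrite(payload, 1, len, fp) == len &&
		(!complete || segment_close_normal(fp, len) == 0) &&
		read_content_size(fp, &stored) == 0;
	fclose(fp);
	return ok ? (int64_t) stored : -1;
}
```

In this sketch an interrupted segment keeps a recorded size of 0 and
only a cleanly closed one carries its real size, so nothing has to be
known up front when the segment is opened.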

> In the attached, the header is rewritten only when closing an incomplete
> segment. For all intents and purposes that segment is not usable. However there
> might be custom scripts that might want to attempt to parse even an otherwise
> unusable file.
>
> A different and easier approach would be to simply prepare the LZ4 context for
> future actions and simply ignore the file.

I am not sure what you mean by "ignore" here.  Do you mean storing 0
in contentSize when opening the segment and rewriting the header once
the segment is completed?
--
Michael
