Re: [HACKERS] Proposal: generic WAL compression - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: [HACKERS] Proposal: generic WAL compression
Date
Msg-id 23990.1510853483@localhost
Whole thread Raw
In response to [HACKERS] Proposal: generic WAL compression  (Oleg Ivanov <o.ivanov@postgrespro.ru>)
Responses Re: [HACKERS] Proposal: generic WAL compression
Re: [HACKERS] Proposal: generic WAL compression
Re: [HACKERS] Proposal: generic WAL compression
List pgsql-hackers
Oleg Ivanov <o.ivanov@postgrespro.ru> wrote:

> In order to overcome that issue, I would like to propose the patch, which
> provides possibility to use another approach of the WAL record
> construction.

After having spent several hours reviewing this patch I dare to send the
following comments:


* writeToPtr() and readFromPtr() are applied to the existing code. I think this is a reason for a separate diff, to be
appliedbefore the actual patch. 

BTW, if you're going to do any refactoring of the existing code, I think the
"Begin" and "Start" words should be used consistently, see "fragmentBegin" vs
"validStart" in computeRegionBaseDelta().


* What's the reason for changing FRAGMENT_HEADER_SIZE ? I see no change in the data layout that writeFragment()
produces.


* alignRegions()

** typo in the prologue: "mismathing".

** "An O(ND) Difference Algorithm and Its Variations" - I think the reference  should contain more information, e.g.
nameof the author(s). I think I even  saw URLs to scientific papers in the PG source but I'm not 100% sure right  now. 

** As for the algorithm itself: Although I didn't have enough patience (and  time) to follow it in every detail for
thisfirst review, I think I can  imagine what it is about. Yet I think reading would be easier if the  concepts of
alignmentand "diff delta" were depicted in the comments,  perhaps using some sort of "ASCII graphics". 

** DiffDeltaOperations enumeration: the individual values should be described.


* computeRegionDiffDelta()

** Does "previous delta" in one of the comments refer to "base delta",  i.e. the delta computation w/o your patch? If
so,the comment should  mention it explicitly. 

** If you use Min() to initialize the for-loop, it'd be more consistent if you  also used Max() when checking the
limits:

for (i = Min(validStart, targetStart); i < Max(validEnd, targetEnd); ++i)

And similarly you can simplify the body of the loop.


* computeDelta()

As the expresssion

*((bool *) pageData->delta)

is used more than once, I think a separate (bool *) variable should be used.


--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] parallelize queries containing initplans
Next
From: Paul Ramsey
Date:
Subject: Re: Inlining functions with "expensive" parameters