Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [REVIEW] Re: Compression of full-page-writes
Date
Msg-id CAB7nPqRFq__=QF0kgzO2kaJshSw8hDV2rw27cMGN=f8XBPUNTA@mail.gmail.com
Whole thread Raw
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers


On Sun, Nov 9, 2014 at 10:32 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Sun, Nov 9, 2014 at 6:41 AM, Rahila Syed <rahilasyed90@gmail.com> wrote:
>> Hello,
>>
>>>The patch was not applied to the master cleanly. Could you update the
>>> patch?
>> Please find attached updated and rebased patch to compress FPW. Review
>> comments given above have been implemented.
>
> Thanks for updating the patch! Will review it.
>
> BTW, I got the following compiler warnings.
>
> xlogreader.c:755: warning: assignment from incompatible pointer type
> autovacuum.c:1412: warning: implicit declaration of function
> 'CompressBackupBlocksPagesAlloc'
> xlogreader.c:755: warning: assignment from incompatible pointer type
I have been looking at this patch, here are some comments:
1) This documentation change is incorrect:
-      <term><varname>full_page_writes</varname> (<type>boolean</type>)
+      <term><varname>full_page_writes</varname> (<type>enum</type>)</term>
       <indexterm>
        <primary><varname>full_page_writes</> configuration parameter</primary>
       </indexterm>
-      </term>
The termination of block term was correctly places before.
2) This patch defines FullPageWritesStr and full_page_writes_str, but both do more or less the same thing.
3) This patch is touching worker_spi.c and calling CompressBackupBlocksPagesAlloc directly. Why is that necessary? Doesn't a bgworker call InitXLOGAccess once it connects to a database?
4) Be careful as well of whitespaces (code lines should have as well a maximum of 80 characters):
+     * If compression is set on replace the rdata nodes of backup blocks added in the loop
+     * above by single rdata node that contains compressed backup blocks and their headers
+     * except the header of first block which is used to store the information about compression.
+     */
5) GetFullPageWriteGUC or something similar is necessary, but I think that for consistency with doPageWrites its value should be fetched in XLogInsert and then passed as an extra argument in XLogRecordAssemble. Thinking more about this, I think that it would be cleaner to simply have a bool flag tracking if compression is active or not, something like doPageCompression, that could be fetched using GetFullPageWriteInfo. Thinking more about it, we could directly track forcePageWrites and fullPageWrites, but that would make back-patching more difficult with not that much gain.
6) Not really a complaint, but note that this patch is using two bits that were unused up to now to store the compression status of a backup block. This is actually safe as long as the maximum page is not higher than 32k, which is the limit authorized by --with-blocksize btw. I think that this deserves a comment at the top of the declaration of BkpBlock.
!       unsigned        hole_offset:15, /* number of bytes before "hole" */
!                               flags:2,                /* state of a backup block, see below */
!                               hole_length:15; /* number of bytes in "hole" */
7) Some code in RestoreBackupBlock:
+       char *uncompressedPages;
+
+       uncompressedPages = (char *)palloc(XLR_TOTAL_BLCKSZ);
[...]
+       /* Check if blocks in WAL record are compressed */
+       if (bkpb.flag_compress == BKPBLOCKS_COMPRESSED)
+       {
+               /* Checks to see if decompression is successful is made inside the function */
+               pglz_decompress((PGLZ_Header *) blk, uncompressedPages);
+               blk = uncompressedPages;
+       }
uncompressedPages is pallocd'd all the time but you actually just need to do that when the block is compressed.
8) Arf, I don't like much the logic around CompressBackupBlocksPagesAlloc using a malloc to allocate once the space necessary for compressed and uncompressed pages. You are right to not do that inside a critical section, but PG tries to maximize the allocations to be palloc'd. Now it is true that if a palloc does not succeed, PG always ERROR's out (writer adding entry to TODO list)... Hence I think that using a static variable for those compressed and uncompressed pages makes more sense, and this simplifies greatly the patch as well.
9) Is avw_sighup_handler really necessary, what's wrong in allocating it all the time by default? This avoids some potential caveats in error handling as well as in value updates for full_page_writes.

So, note that I am not only complaining about the patch, I actually rewrote it as attached while reviewing, with additional minor cleanups and enhancements. I did as well a couple of tests like the script attached, compression numbers being more or less the same as your previous patch, some noise creating differences. I have done also some regression test runs with a standby replaying behind.

I'll go through the patch once again a bit later, but feel free to comment.
Regards,
--
Michael
Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [v9.5] Custom Plan API
Next
From:
Date:
Subject: Re: pg_receivexlog --status-interval add fsync feedback