Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Rahila Syed
Subject Re: [REVIEW] Re: Compression of full-page-writes
Date
Msg-id CAH2L28tvW6VEB9tfQfWHoDqF21TsczSA7R2gX9U=0wk3k+9dQA@mail.gmail.com
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes  (Fujii Masao <masao.fujii@gmail.com>)
Re: [REVIEW] Re: Compression of full-page-writes  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
>Isn't it better to allocate the memory for compression_scratch in
>InitXLogInsert() like hdr_scratch?

I think making compression_scratch a statically allocated global variable is the result of the following earlier discussion,



Thank you,
Rahila Syed



On Thu, Dec 18, 2014 at 1:57 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Thu, Dec 18, 2014 at 2:21 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>
>
> On Wed, Dec 17, 2014 at 11:33 PM, Rahila Syed <rahilasyed90@gmail.com>
> wrote:
>>
>> I had a look at the code. I have a few minor points,
>
> Thanks!
>
>> +           bkpb.fork_flags |= BKPBLOCK_HAS_IMAGE;
>> +
>> +           if (is_compressed)
>>             {
>> -               rdt_datas_last->data = page;
>> -               rdt_datas_last->len = BLCKSZ;
>> +               /* compressed block information */
>> +               bimg.length = compress_len;
>> +               bimg.extra_data = hole_offset;
>> +               bimg.extra_data |= XLR_BLCK_COMPRESSED_MASK;
>>
>> For consistency with the existing code, how about renaming the macro
>> XLR_BLCK_COMPRESSED_MASK to BKPBLOCK_HAS_COMPRESSED_IMAGE, along the lines of
>> BKPBLOCK_HAS_IMAGE?
>
> OK, why not...
>
>>
>> +               blk->hole_offset = extra_data & ~XLR_BLCK_COMPRESSED_MASK;
>> Here, I think that naming the mask BKPBLOCK_HOLE_OFFSET_MASK will make it
>> clearer that the lower 15 bits of the extra_data field hold the
>> hole_offset value. This suggestion is also just for consistency
>> with the existing BKPBLOCK_FORK_MASK for the fork_flags field.
>
> Yeah that seems clearer, let's define it as ~XLR_BLCK_COMPRESSED_MASK
> though.
>
>> And comment typo
>> +            * First try to compress block, filling in the page hole with zeros
>> +            * to improve the compression of the whole. If the block is considered
>> +            * as incompressible, complete the block header information as if
>> +            * nothing happened.
>>
>> As the hole is no longer being compressed, this needs to be changed.
>
> Fixed. As well as an additional comment block down.
>
> A couple of things noticed on the fly:
> - Fixed pg_xlogdump, which did not report the FPW information completely
> correctly
> - A couple of typos and malformed sentences fixed
> - Added an assertion to check that the hole offset value does not use the
> bit used for compression status
> - Reworked docs, mentioning as well that wal_compression is off by default.
> - Removed stuff in pg_controldata and XLOG_PARAMETER_CHANGE (mentioned by
> Fujii-san)

Thanks!

+                else
+                    memcpy(compression_scratch, page, page_len);

I don't think the block image needs to be copied to scratch buffer here.
We can try to compress the "page" directly.

+#include "utils/pg_lzcompress.h"
 #include "utils/memutils.h"

pg_lzcompress.h should be after memutils.h.

+/* Scratch buffer used to store block image to-be-compressed */
+static char compression_scratch[PGLZ_MAX_BLCKSZ];

Isn't it better to allocate the memory for compression_scratch in
InitXLogInsert() like hdr_scratch?

+        uncompressed_page = (char *) palloc(PGLZ_RAW_SIZE(header));

Why don't we allocate the buffer for uncompressed page only once and
keep reusing it like XLogReaderState->readBuf? The size of uncompressed
page is at most BLCKSZ, so we can allocate the memory for it even before
knowing the real size of each block image.
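
The allocate-once idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: DemoReaderState and the demo_* functions are hypothetical names, plain malloc stands in for palloc, BLCKSZ is assumed to be the default 8192, and memcpy stands in for the actual decompression call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLCKSZ 8192             /* assumed: PostgreSQL's default block size */

/* Hypothetical reader state that owns one reusable image buffer. */
typedef struct DemoReaderState
{
    char       *image_buf;      /* BLCKSZ bytes, allocated once */
} DemoReaderState;

/* Allocate the state and its buffer up front, before any record is read. */
static DemoReaderState *
demo_reader_create(void)
{
    DemoReaderState *state = malloc(sizeof(DemoReaderState));

    if (state == NULL)
        return NULL;
    state->image_buf = malloc(BLCKSZ);
    if (state->image_buf == NULL)
    {
        free(state);
        return NULL;
    }
    return state;
}

/*
 * Restore one block image into the shared buffer. memcpy is a stand-in
 * for decompression; an image can never exceed BLCKSZ, so the buffer
 * allocated at creation time is always large enough.
 */
static char *
demo_restore_image(DemoReaderState *state, const char *src, size_t len)
{
    if (len > BLCKSZ)
        return NULL;
    memcpy(state->image_buf, src, len);
    return state->image_buf;
}

/* Release the buffer together with the state. */
static void
demo_reader_free(DemoReaderState *state)
{
    free(state->image_buf);
    free(state);
}
```

Each call to demo_restore_image() overwrites the previous contents, so a caller that needs an image to outlive the next call has to copy it out first.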

-                printf(" (FPW); hole: offset: %u, length: %u\n",
-                       record->blocks[block_id].hole_offset,
-                       record->blocks[block_id].hole_length);
+                if (record->blocks[block_id].is_compressed)
+                    printf(" (FPW); hole offset: %u, compressed length %u\n",
+                           record->blocks[block_id].hole_offset,
+                           record->blocks[block_id].bkp_len);
+                else
+                    printf(" (FPW); hole offset: %u, length: %u\n",
+                           record->blocks[block_id].hole_offset,
+                           record->blocks[block_id].bkp_len);

We need to consider what info about FPW we want pg_xlogdump to report.
I'd like to calculate, from the report of pg_xlogdump, how many bytes
the compression of FPW saved. So I'd like to see both the length of the
uncompressed FPW and that of the compressed one in the report.
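
As a rough sketch of such a report line (not the patch's actual output): format_fpw_report and fpw_bytes_saved are hypothetical helpers, and the output wording and the raw_len parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one FPW report line into "out". For a compressed image, both
 * the compressed and the original length are printed, so the saving can
 * be computed from the report alone.
 */
static int
format_fpw_report(char *out, size_t outlen, unsigned hole_offset,
                  unsigned raw_len, unsigned compressed_len,
                  int is_compressed)
{
    if (is_compressed)
        return snprintf(out, outlen,
                        " (FPW); hole offset: %u, compressed length: %u, original length: %u",
                        hole_offset, compressed_len, raw_len);
    return snprintf(out, outlen,
                    " (FPW); hole offset: %u, length: %u",
                    hole_offset, raw_len);
}

/* Bytes saved by compression; zero if compression did not help. */
static unsigned
fpw_bytes_saved(unsigned raw_len, unsigned compressed_len)
{
    return raw_len > compressed_len ? raw_len - compressed_len : 0;
}
```

Summing fpw_bytes_saved() over all reported block images would give the total reduction in WAL volume for the records dumped.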

Doesn't the comment on BLCKSZ in pg_config.h need to be updated? The
maximum size of BLCKSZ can be affected not only by itemid but also by
XLogRecordBlockImageHeader.

     bool        has_image;
+    bool        is_compressed;

Doesn't ResetDecoder need to reset is_compressed?

+#wal_compression = off            # enable compression of full-page writes

Currently wal_compression compresses only FPW, so isn't it better to place
it after full_page_writes in postgresql.conf.sample?

+    uint16        extra_data;    /* used to store offset of bytes in "hole", with
+                             * last free bit used to check if block is
+                             * compressed */

At least to me, defining something like the following seems easier to
read.

    uint16    hole_offset:15,
                    is_compressed:1;
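
To compare the two layouts side by side, here is a minimal sketch. The macro names follow the renaming proposed earlier in this thread but are assumptions, not the patch's final definitions; pack_extra_data and the unpack_* helpers are hypothetical. The bit-field struct uses unsigned int members because standard C only guarantees int-like bit-field types, which makes the struct wider than a uint16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed names: top bit is the compression flag, low 15 bits the offset. */
#define BKPBLOCK_HAS_COMPRESSED_IMAGE 0x8000
#define BKPBLOCK_HOLE_OFFSET_MASK     0x7FFF

/* Mask-based layout: pack the hole offset and the flag into 16 bits. */
static uint16_t
pack_extra_data(uint16_t hole_offset, bool is_compressed)
{
    uint16_t    extra_data;

    /* The hole offset must not collide with the flag bit. */
    assert((hole_offset & BKPBLOCK_HAS_COMPRESSED_IMAGE) == 0);

    extra_data = hole_offset;
    if (is_compressed)
        extra_data |= BKPBLOCK_HAS_COMPRESSED_IMAGE;
    return extra_data;
}

static uint16_t
unpack_hole_offset(uint16_t extra_data)
{
    return extra_data & BKPBLOCK_HOLE_OFFSET_MASK;
}

static bool
unpack_is_compressed(uint16_t extra_data)
{
    return (extra_data & BKPBLOCK_HAS_COMPRESSED_IMAGE) != 0;
}

/* Bit-field layout: the compiler does the masking and shifting. */
typedef struct BlockImageBits
{
    unsigned int hole_offset:15;
    unsigned int is_compressed:1;
} BlockImageBits;
```

The bit-field form reads more naturally at the call sites, but the order of bit-field members within the storage unit is implementation-defined, which matters for a struct that is written to disk; the explicit masks keep the on-disk layout fixed.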

Regards,

--
Fujii Masao
