Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: [REVIEW] Re: Compression of full-page-writes
Date
Msg-id CAHGQGwFbM2fiBMq0L0SdJRNd2zh=7ofQ6F4DsQPK6_QNfuxB1A@mail.gmail.com
Whole thread Raw
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Fri, Dec 19, 2014 at 12:19 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Dec 18, 2014 at 5:27 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> Thanks!
> Thanks for your input.
>
>> +                else
>> +                    memcpy(compression_scratch, page, page_len);
>>
>> I don't think the block image needs to be copied to scratch buffer here.
>> We can try to compress the "page" directly.
> Check.
>
>> +#include "utils/pg_lzcompress.h"
>>  #include "utils/memutils.h"
>>
>> pg_lzcompress.h should be after meutils.h.
> Oops.
>
>> +/* Scratch buffer used to store block image to-be-compressed */
>> +static char compression_scratch[PGLZ_MAX_BLCKSZ];
>>
>> Isn't it better to allocate the memory for compression_scratch in
>> InitXLogInsert()
>> like hdr_scratch?
> Because the OS would not touch it if wal_compression is never used,
> but now that you mention it, it may be better to get that in the
> context of xlog_insert..
>
>> +        uncompressed_page = (char *) palloc(PGLZ_RAW_SIZE(header));
>>
>> Why don't we allocate the buffer for uncompressed page only once and
>> keep reusing it like XLogReaderState->readBuf? The size of uncompressed
>> page is at most BLCKSZ, so we can allocate the memory for it even before
>> knowing the real size of each block image.
> OK, this would save some cycles. I was trying to make process allocate
> a minimum of memory only when necessary.
>
>> -                printf(" (FPW); hole: offset: %u, length: %u\n",
>> -                       record->blocks[block_id].hole_offset,
>> -                       record->blocks[block_id].hole_length);
>> +                if (record->blocks[block_id].is_compressed)
>> +                    printf(" (FPW); hole offset: %u, compressed length %u\n",
>> +                           record->blocks[block_id].hole_offset,
>> +                           record->blocks[block_id].bkp_len);
>> +                else
>> +                    printf(" (FPW); hole offset: %u, length: %u\n",
>> +                           record->blocks[block_id].hole_offset,
>> +                           record->blocks[block_id].bkp_len);
>>
>> We need to consider what info about FPW we want pg_xlogdump to report.
>> I'd like to calculate how much bytes FPW was compressed, from the report
>> of pg_xlogdump. So I'd like to see also the both length of uncompressed FPW
>> and that of compressed one in the report.
> OK, so let's add a parameter in the decoder for the uncompressed
> length. Sounds fine?
>
>> In pg_config.h, the comment of BLCKSZ needs to be updated? Because
>> the maximum size of BLCKSZ can be affected by not only itemid but also
>> XLogRecordBlockImageHeader.
> Check.
>
>>      bool        has_image;
>> +    bool        is_compressed;
>>
>> Doesn't ResetDecoder need to reset is_compressed?
> Check.
>
>> +#wal_compression = off            # enable compression of full-page writes
>> Currently wal_compression compresses only FPW, so isn't it better to place
>> it after full_page_writes in postgresql.conf.sample?
> Check.
>
>> +    uint16        extra_data;    /* used to store offset of bytes in
>> "hole", with
>> +                             * last free bit used to check if block is
>> +                             * compressed */
>> At least to me, defining something like the following seems more easy to
>> read.
>>     uint16    hole_offset:15,
>>                     is_compressed:1
> Check++.
>
> Updated patches addressing all those things are attached.

Thanks for updating the patch!

Firstly I'm thinking to commit the
0001-Move-pg_lzcompress.c-to-src-common.patch.

pg_lzcompress.h still exists in include/utils, but it should be moved to
include/common?

Do we really need PGLZ_Status? I'm not sure whether your categorization of
the result status of compress/decompress functions is right or not. For example,
pglz_decompress() can return PGLZ_INCOMPRESSIBLE status, but which seems
invalid logically... Maybe this needs to be revisited when we introduce other
compression algorithms and create the wrapper function for those compression
and decompression functions. Anyway making pg_lzdecompress return
the boolean value seems enough.

I updated 0001-Move-pg_lzcompress.c-to-src-common.patch accordingly.
Barring objections, I will push the attached patch firstly.

Regards,

--
Fujii Masao

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: replicating DROP commands across servers
Next
From: Michael Paquier
Date:
Subject: Re: [REVIEW] Re: Compression of full-page-writes