Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [REVIEW] Re: Compression of full-page-writes
Date
Msg-id CAB7nPqSftFwNpP7-+5ZxYPGHy7jXt35KLP_6FKPfB84OhHUE0A@mail.gmail.com
Whole thread Raw
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers


On Mon, Feb 23, 2015 at 9:22 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Mon, Feb 23, 2015 at 5:28 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:
> Hello,
>
> Attached is a patch which has following changes,
>
> As suggested above block ID in xlog structs has been replaced by chunk ID.
> Chunk ID is used to distinguish between different types of xlog record
> fragments.
> Like,
> XLR_CHUNK_ID_DATA_SHORT
> XLR_CHUNK_ID_DATA_LONG
> XLR_CHUNK_BKP_COMPRESSED
> XLR_CHUNK_BKP_WITH_HOLE
>
> In block references, block ID follows the chunk ID. Here block ID retains
> its functionality.
> This approach increases data by 1 byte for each block reference in an xlog
> record. This approach separates ID referring different fragments of xlog
> record from the actual block ID which is used to refer  block references in
> xlog record.

I've not read this logic yet, but ISTM there is a bug in that new WAL format
because I got the following error and the startup process could not replay
any WAL records when I set up replication and enabled wal_compression.

LOG:  record with invalid length at 0/30000B0
LOG:  record with invalid length at 0/3000518
LOG:  Invalid block length in record 0/30005A0
LOG:  Invalid block length in record 0/3000D60

Looking at this code, I think that it is really confusing to move the data related to the status of the backup block out of XLogRecordBlockImageHeader to the chunk ID itself that may *not* include a backup block at all as it is conditioned by the presence of BKPBLOCK_HAS_IMAGE. I would still prefer the idea of having the backup block data in its dedicated header with bits stolen from the existing fields, perhaps by rewriting it to something like that:
typedef struct XLogRecordBlockImageHeader {
uint32 length:15,
     hole_length:15,
     is_compressed:1,
     is_hole:1;
} XLogRecordBlockImageHeader;
Now perhaps I am missing something and this is really "ugly" ;)

+#define XLR_CHUNK_ID_DATA_SHORT                255
+#define XLR_CHUNK_ID_DATA_LONG         254
+#define XLR_CHUNK_BKP_COMPRESSED       0x01
+#define XLR_CHUNK_BKP_WITH_HOLE                0x02
Wouldn't we need a XLR_CHUNK_ID_BKP_HEADER or equivalent? The idea between this chunk_id stuff it to be able to make the difference between a short header, a long header and a backup block header by looking at the first byte.

The comments on top of XLogRecordBlockImageHeader are still mentioning the old parameters like with_hole or is_compressed that you have removed.

It seems as well that there is some noise:
-   lp_off and lp_len fields in ItemIdData (see include/storage/itemid.h).
+   lp_off and lp_len fields in ItemIdData (see include/storage/itemid.h)
--
Michael

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg_dump gets attributes from tables in extensions
Next
From: Michael Paquier
Date:
Subject: Re: Enforce creation of destination folders for source files in pg_regress (Was: pg_regress writes into source tree)