[REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Abhijit Menon-Sen
Subject [REVIEW] Re: Compression of full-page-writes
Date
Msg-id 20140617114713.GA20427@toroid.org
Whole thread Raw
In response to Re: Compression of full-page-writes  (Rahila Syed <rahilasyed90@gmail.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes  (Claudio Freire <klaussfreire@gmail.com>)
Re: [REVIEW] Re: Compression of full-page-writes  (Rahila Syed <rahilasyed90@gmail.com>)
Re: [REVIEW] Re: Compression of full-page-writes  (Rahila Syed <rahilasyed90@gmail.com>)
List pgsql-hackers
At 2014-06-13 20:07:29 +0530, rahilasyed90@gmail.com wrote:
>
> Patch named Support-for-lz4-and-snappy adds support for LZ4 and Snappy
> in PostgreSQL.

I haven't looked at this in any detail yet, but I note that the patch
creates src/common/lz4/.travis.yml, which it shouldn't.

I have a few preliminary comments about your patch.

> @@ -84,6 +87,7 @@ bool        XLogArchiveMode = false;
>  char       *XLogArchiveCommand = NULL;
>  bool        EnableHotStandby = false;
>  bool        fullPageWrites = true;
> +int            compress_backup_block = false;

I think compress_backup_block should be initialised to
BACKUP_BLOCK_COMPRESSION_OFF. (But see below.)

> +            for (j = 0; j < XLR_MAX_BKP_BLOCKS; j++)
> +                compressed_pages[j] = (char *) malloc(buffer_size);

Shouldn't this use palloc?

> + * Create a compressed version of a backup block
> + *
> + * If successful, return a compressed result and set 'len' to its length.
> + * Otherwise (ie, compressed result is actually bigger than original),
> + * return NULL.
> + */
> +static char *
> +CompressBackupBlock(char *page, uint32 orig_len, char *dest, uint32 *len)
> +{

First, the calling convention is a bit strange. I understand that you're
pre-allocating compressed_pages[] so as to avoid repeated allocations;
and that you're doing it outside CompressBackupBlock so as to avoid
passing in the index i. But the result is a little weird.

At the very minimum, I would move the "if (!compressed_pages_allocated)"
block outside the "for (i = 0; i < XLR_MAX_BKP_BLOCKS; i++)" loop, and
add some comments. I think we could live with that.

But I'm not at all fond of the code in this function either. I'd write
it like this:
   struct varlena *buf = (struct varlena *) dest;
   if (compress_backup_block = BACKUP_BLOCK_COMPRESSION_SNAPPY)   {       if (pg_snappy_compress(page, BLCKSZ, buf) ==
EIO)          return NULL;   }   else if (compress_backup_block = BACKUP_BLOCK_COMPRESSION_LZ4)   {       if
(pg_LZ4_compress(page,BLCKSZ, buf) == 0)           return NULL;   }   else if (compress_backup_block =
BACKUP_BLOCK_COMPRESSION_PGLZ)  {       if (pglz_compress(page, BLCKSZ, (PGLZ_Header *) buf,
PGLZ_strategy_default)!= 0)           return NULL;   }   else       elog(ERROR, "Wrong value for compress_backup_block
GUC");
   /*    * …comment about insisting on saving at least two bytes…    */
   if (VARSIZE(buf) >= orig_len - 2)       return NULL;
   *len = VARHDRSIZE + VARSIZE(buf);
   return buf;

I guess it doesn't matter *too* much if the intention is to have all
these compression algorithms only during development/testing and pick
just one in the end. But the above is considerably easier to read in
the meanwhile.

If we were going to keep multiple compression algorithms around, I'd be
inclined to create a "pg_compress(…, compression_algorithm)" function to
hide these return-value differences from the callers.

> +    else if (VARATT_IS_COMPRESSED((struct varlena *) blk) && compress_backup_block!=BACKUP_BLOCK_COMPRESSION_OFF)
> +    {
> +        if (compress_backup_block == BACKUP_BLOCK_COMPRESSION_SNAPPY)
> +        {
> +            int ret;
> +            size_t compressed_length = VARSIZE((struct varlena *) blk) - VARHDRSZ;
> +            char *compressed_data = (char *)VARDATA((struct varlena *) blk);
> +            size_t s_uncompressed_length;
> +
> +            ret = snappy_uncompressed_length(compressed_data,
> +                    compressed_length,
> +                    &s_uncompressed_length);
> +            if (!ret)
> +                elog(ERROR, "snappy: failed to determine compression length");
> +            if (BLCKSZ != s_uncompressed_length)
> +                elog(ERROR, "snappy: compression size mismatch %d != %zu",
> +                        BLCKSZ, s_uncompressed_length);
> +
> +            ret = snappy_uncompress(compressed_data,
> +                    compressed_length,
> +                    page);
> +            if (ret != 0)
> +                elog(ERROR, "snappy: decompression failed: %d", ret);
> +        }

…and a "pg_decompress()" function that does error checking.

> +static const struct config_enum_entry backup_block_compression_options[] = {
> +    {"off", BACKUP_BLOCK_COMPRESSION_OFF, false},
> +    {"false", BACKUP_BLOCK_COMPRESSION_OFF, true},
> +    {"no", BACKUP_BLOCK_COMPRESSION_OFF, true},
> +    {"0", BACKUP_BLOCK_COMPRESSION_OFF, true},
> +    {"pglz", BACKUP_BLOCK_COMPRESSION_PGLZ, true},
> +    {"snappy", BACKUP_BLOCK_COMPRESSION_SNAPPY, true},
> +    {"lz4", BACKUP_BLOCK_COMPRESSION_LZ4, true},
> +    {NULL, 0, false}
> +};

Finally, I don't like the name "compress_backup_block".

1. It should have been plural (compress_backup_blockS).

2. Looking at the enum values, "backup_block_compression = x" would be a  better name anyway…

3. But we don't use the term "backup block" anywhere in the  documentation, and it's very likely to confuse people.

I don't mind the suggestion elsewhere in this thread to use
"full_page_compression = y" (as a setting alongside
"torn_page_protection = x").

I haven't tried the patch (other than applying and building it) yet. I
will do so after I hear what you and others think of the above points.

-- Abhijit



pgsql-hackers by date:

Previous
From: Ian Barwick
Date:
Subject: [REVIEW] psql tab completion for DROP TRIGGER/RULE and ALTER TABLE ... DISABLE/ENABLE
Next
From: Andres Freund
Date:
Subject: releaseOk and LWLockWaitForVar