On Mon, Dec 8, 2014 at 3:42 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:
>
>>The important point to consider for this patch is the use of the additional
>> 2-bytes as uint16 in the block information structure to save the length of a
>> compressed
>>block, which may be compressed without its hole to achieve a double level
>> of compression (image compressed without its hole). We may use a simple flag
>> on
>>one or two bits using for example a bit from hole_length, but in this case
>> we would need to always compress images with their hole included, something
>> more
> >expensive as the compression would take more time.
> As you have mentioned here the idea to use bits from existing fields rather
> than adding additional 2 bytes in header,
> FWIW elaborating slightly on the way it was done in the initial patches,
> We can use the following struct
>
> unsigned hole_offset:15,
> compress_flag:2,
> hole_length:15;
>
> Here compress_flag can be 0 or 1 depending on status of compression. We can
> reduce the compress_flag to just 1 bit flag.
Just adding that this is fine as the largest page size that can be set is 32k.
> IIUC, the purpose of adding compress_len field in the latest patch is to
> store length of compressed blocks which is used at the time of decoding the
> blocks.
>
> With this approach, length of compressed block can be stored in hole_length
> as,
>
> hole_length = BLCKSZ - compress_len.
>
> Thus, hole_length can serve the purpose of storing length of a compressed
> block without the need of additional 2-bytes. In DecodeXLogRecord,
> hole_length can be used for tracking the length of data received in cases of
> both compressed as well as uncompressed blocks.
>
> As you already mentioned, this will need compressing images with hole but
> we can MemSet hole to 0 in order to make compression of hole less expensive
> and effective.
Thanks for coming back to this point in more details, this is very
important. The additional 2 bytes used make compression less expensive
by ignoring the hole, for a bit more data in each record. Using uint16
is as well a cleaner code style, more in-line wit hte other fields,
but that's a personal opinion ;)
Doing a switch from one approach to the other is easy enough though,
so let's see what others think.
Regards,
--
Michael