Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date
Msg-id 5061B8E1.9010209@vmware.com
In response to Re: [WIP] Performance Improvement by reducing WAL for Update Operation  (Amit kapila <amit.kapila@huawei.com>)
Responses Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation
List pgsql-hackers
On 24.09.2012 13:57, Amit kapila wrote:
> Rebased version of patch based on latest code.

When HOT was designed, we decided that heap_update needs to compare the 
old and new attributes directly, with memcmp(), to determine whether any 
of the indexed columns have changed. It was not deemed feasible to 
pass down that information from the executor. I don't remember the 
details of why that was, but you seem to be trying the same thing in this 
patch, passing the bitmap of modified cols from the executor to 
heap_update(). I'm pretty sure that won't work, for the same reasons we 
didn't do it for HOT.

I still feel that it would probably be better to use a generic delta 
encoding scheme, instead of inventing one. How about VCDIFF 
(http://tools.ietf.org/html/rfc3284), for example? Or you could reuse 
the LZ compressor that we already have in the source tree. You can use 
LZ for delta compression by initializing the history buffer of the 
algorithm with the old tuple, and then compressing the new tuple as 
usual. Or you could still use the knowledge of where the attributes 
begin and end and which attributes were updated, and do the encoding 
similar to how you did in the patch, but use LZ as the output format. 
That way the decoding would be the same as LZ decompression.
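The "initialize the history buffer with the old tuple" idea can be illustrated with any LZ-family compressor that supports a preset dictionary. Here is a minimal sketch using Python's zlib (a stand-in for the pg_lz compressor in the source tree; the tuple byte strings are made up for illustration):

```python
import zlib

def delta_compress(old: bytes, new: bytes) -> bytes:
    # Seed the compressor's history window with the old tuple, then
    # compress the new tuple: runs of bytes that also appear in the
    # old tuple become cheap back-references instead of literals.
    c = zlib.compressobj(zdict=old)
    return c.compress(new) + c.flush()

def delta_decompress(old: bytes, delta: bytes) -> bytes:
    # Decoding needs the same history (the old tuple) to resolve
    # the back-references; otherwise it is plain LZ decompression.
    d = zlib.decompressobj(zdict=old)
    return d.decompress(delta) + d.flush()

old = b"(1, 'Alice', 'alice@example.com', 'active')"
new = b"(1, 'Alice', 'alice@example.com', 'disabled')"

delta = delta_compress(old, new)
assert delta_decompress(old, delta) == new
```

The more bytes old and new share, the more of the delta is back-references into the history, which is exactly the heavily-overlapping case an UPDATE produces.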

- Heikki


