Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date
Msg-id 009001cd9b32$394a4090$abdec1b0$@kapila@huawei.com
In response to Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation
List pgsql-hackers
> On Tuesday, September 25, 2012 7:30 PM Heikki Linnakangas wrote:
> On 24.09.2012 13:57, Amit kapila wrote:
> > Rebased version of patch based on latest code.
> 
> When HOT was designed, we decided that heap_update needs to compare the
> old and new attributes directly, with memcmp(), to determine whether any
> of the indexed columns have changed. It was deemed infeasible to
> pass down that information from the executor. I don't remember the
> details of why that was, but you seem to be trying the same thing in this
> patch, passing the bitmap of modified cols from the executor to
> heap_update(). I'm pretty sure that won't work, for the same reasons we
> didn't do it for HOT.

I think the reason for not relying on the modified columns is that there can
be cases where they do not give the correct information. For example, BEFORE
triggers can change columns other than the ones the statement modified, which
is why a HOT update needs to do the comparison. In our patch we have handled
such cases by not doing the optimization, so we do not rely on the modified
columns there.

If you feel the comparison is a must, we can do it in the same way as the
changed columns are identified for HOT?
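
To illustrate the difference between the two approaches, here is a minimal
sketch in Python (not PostgreSQL code). It compares the old and new attribute
values byte for byte, in the spirit of the memcmp() check, and shows how the
result can differ from a bitmap built from the statement alone. The function
names, the row layout, and the trigger behaviour are all hypothetical.

```python
from typing import Sequence, Set


def changed_columns(old_row: Sequence[bytes], new_row: Sequence[bytes]) -> Set[int]:
    """Return the indexes of attributes whose stored bytes differ."""
    if len(old_row) != len(new_row):
        raise ValueError("rows must have the same number of attributes")
    return {i for i, (o, n) in enumerate(zip(old_row, new_row)) if o != n}


def indexed_columns_changed(old_row: Sequence[bytes],
                            new_row: Sequence[bytes],
                            indexed: Set[int]) -> bool:
    """HOT-style check: did any indexed attribute change?"""
    return bool(changed_columns(old_row, new_row) & indexed)


# The UPDATE statement only SETs column 1, so a bitmap derived from the
# statement (assumed here) would be {1}.
executor_bitmap = {1}

old_row = [b"42", b"pending", b"2012-09-25"]
# Assumption: a BEFORE trigger also rewrote column 2 before the update.
new_row = [b"42", b"shipped", b"2012-09-26"]

actual = changed_columns(old_row, new_row)
```

In this made-up case the direct comparison reports columns 1 and 2, while the
statement-level bitmap only knows about column 1.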

> I still feel that it would probably be better to use a generic delta
> encoding scheme, instead of inventing one. How about VCDIFF
> (http://tools.ietf.org/html/rfc3284), for example? Or you could reuse
> the LZ compressor that we already have in the source tree. You can use
> LZ for delta compression by initializing the history buffer of the
> algorithm with the old tuple, and then compressing the new tuple as
> usual. Or you could still use the knowledge of where the attributes
> begin and end and which attributes were updated, and do the encoding
> similar to how you did in the patch, but use LZ as the output format.
> That way the decoding would be the same as LZ decompression.
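
A minimal sketch of the history-buffer idea quoted above, using Python's zlib
preset dictionary (the zdict argument) as a stand-in. This is not the LZ
compressor in the PostgreSQL source tree, and the tuple contents and helper
names are made up for illustration.

```python
import zlib


def delta_compress(old_tuple: bytes, new_tuple: bytes) -> bytes:
    """Compress new_tuple with the compressor's history preloaded from old_tuple."""
    comp = zlib.compressobj(zdict=old_tuple)
    return comp.compress(new_tuple) + comp.flush()


def delta_decompress(old_tuple: bytes, delta: bytes) -> bytes:
    """Rebuild new_tuple; the decoder needs the same old_tuple as its history."""
    decomp = zlib.decompressobj(zdict=old_tuple)
    return decomp.decompress(delta) + decomp.flush()


# Two hypothetical tuple images that differ only in the trailing field.
old_tuple = bytes(range(200)) + b"status=pending"
new_tuple = bytes(range(200)) + b"status=shipped"

delta = delta_compress(old_tuple, new_tuple)
restored = delta_decompress(old_tuple, delta)
```

Because the unchanged prefix is found in the preloaded history, the output
mostly consists of a back-reference plus the changed bytes, and decoding is
ordinary decompression with the old tuple supplied as the dictionary.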

Can you please explain why you think that doing LZ compression on top of the
encoding is better, given that we have already reduced the amount of WAL for
an update by storing only the changed column information?

a. Is it to further reduce the size of the WAL?
b. Is it to store the WAL diff in some standard format?
c. Or does it give any other kind of benefit?
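
For comparison, the column-level approach referred to in the question can be
sketched as follows. This is only an illustration in Python of "store the
changed columns and rebuild from the old tuple"; it is not the record format
used in the patch, and the function names and row layout are hypothetical.

```python
from typing import Dict, List


def encode_update(old_row: List[bytes], new_row: List[bytes]) -> Dict[int, bytes]:
    """Keep only the attributes that changed: {attribute index: new value}."""
    if len(old_row) != len(new_row):
        raise ValueError("rows must have the same number of attributes")
    return {i: n for i, (o, n) in enumerate(zip(old_row, new_row)) if o != n}


def apply_update(old_row: List[bytes], delta: Dict[int, bytes]) -> List[bytes]:
    """Rebuild the new row from the old row plus the recorded changes."""
    return [delta.get(i, value) for i, value in enumerate(old_row)]


# Hypothetical row with one wide unchanged column and one small changed column.
old_row = [b"42", b"x" * 100, b"pending"]
new_row = [b"42", b"x" * 100, b"shipped"]

delta = encode_update(old_row, new_row)
rebuilt = apply_update(old_row, delta)
```

Here only the last attribute ends up in the delta; the wide unchanged column
is never written again.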

With Regards,
Amit Kapila.



