Re: Performance Improvement by reducing WAL for Update Operation - Mailing list pgsql-hackers

From Haribabu kommi
Subject Re: Performance Improvement by reducing WAL for Update Operation
Date
Msg-id 8977CB36860C5843884E0A18D8747B0372BF1167@szxeml558-mbs.china.huawei.com
In response to Re: Performance Improvement by reducing WAL for Update Operation  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Performance Improvement by reducing WAL for Update Operation
List pgsql-hackers
On 29 November 2013 03:05 Robert Haas wrote:
> On Wed, Nov 27, 2013 at 9:31 AM, Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > Sure, but to explore (a), the scope is a bit bigger. We have the below
> > options to explore (a):
> > 1. try to optimize the existing algorithm as used in the patch, which
> >    we have tried, but of course we can spend some more time to see if
> >    anything more can be tried out.
> > 2. try fingerprint technique as suggested by you above.
> > 3. try some other standard methods like vcdiff, lz4 etc.
>
> Well, obviously, I'm hot on idea #2 and think that would be worth
> spending some time on.  If we can optimize the algorithm used in the
> patch some more (option #1), that would be fine, too, but the code
> looks pretty tight to me, so I'm not sure how successful that's likely
> to be.  But if you have an idea, sure.

I tried modifying the existing patch to support dynamic rollup as follows:
for every 32 bytes of mismatch between the old and new tuples, the
match-check interval is increased, and it resets whenever a match is found.
I tried two variants:

1. pglz-with-micro-optimization-compress-using-newdata-5:

Adds all of the old tuple data to the history and then checks for matches
from the new tuple. After every 32 bytes of mismatch, it checks for a match
only once every 2 bytes, and so on. It repeats like this until it finds a
match or reaches the end of the data. A sketch of this scan follows.
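
Below is a minimal, self-contained sketch of this scan; it is not the patch
itself. The function names are hypothetical, a naive linear search stands in
for the pglz history hash table, the emit steps are reduced to printf, and
the interval is assumed to double at each rollup (the patch may grow it
differently).

#include <stdio.h>
#include <string.h>

#define MIN_MATCH   4   /* shortest match worth encoding as a copy tag */
#define ROLLUP_RUN  32  /* mismatch run length that triggers a rollup */

/* Longest match of new_data[0..rest) starting anywhere in old_data[]. */
static int
longest_match(const char *old_data, int old_len,
              const char *new_data, int rest)
{
    int     best = 0;

    for (int i = 0; i < old_len; i++)
    {
        int     len = 0;

        while (len < rest && i + len < old_len &&
               old_data[i + len] == new_data[len])
            len++;
        if (len > best)
            best = len;
    }
    return best;
}

static void
delta_scan_rollup(const char *old_data, int old_len,
                  const char *new_data, int new_len)
{
    int     pos = 0;            /* current offset in the new tuple */
    int     mismatch_run = 0;   /* consecutive bytes emitted as literals */
    int     interval = 1;       /* probe the history every "interval" bytes */

    while (pos < new_len)
    {
        int     len = longest_match(old_data, old_len,
                                    new_data + pos, new_len - pos);

        if (len >= MIN_MATCH)
        {
            printf("copy    %d bytes from old tuple\n", len);
            pos += len;
            mismatch_run = 0;
            interval = 1;       /* match found: reset the rollup */
        }
        else
        {
            int     skip = (interval < new_len - pos) ? interval
                                                      : new_len - pos;

            printf("literal %d byte(s)\n", skip);
            pos += skip;
            mismatch_run += skip;
            if (mismatch_run >= ROLLUP_RUN)
            {
                interval *= 2;  /* roll up: probe half as often */
                mismatch_run = 0;
            }
        }
    }
}

int
main(void)
{
    const char *old_data = "12345 abcdefgh";
    const char *new_data = "abcdefgh 56789";

    delta_scan_rollup(old_data, (int) strlen(old_data),
                      new_data, (int) strlen(new_data));
    return 0;
}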

2. pglz-with-micro-optimization-compress-using-newdata_snappy_hash-1:

Adds only the first byte of the old tuple data to the history and then
checks for a match from the new tuple. If a match is found, the next
unmatched byte from the old tuple is added to the history and the process
repeats.

If no match is found, it adds the next byte of the old tuple to the
history, followed by the unmatched byte from the new tuple data.

In this case the performance is good, but if the new data contains a
forward reference into the old data (a match whose source bytes have not
yet entered the history), it will not compress the data.

E.g., old data - 12345     abcdefgh
      new data - abcdefgh  56789
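
Below is a matching sketch of the incremental-history variant; again it is
hypothetical. The history is modelled as the prefix old_data[0..hist_len)
with a naive search, whereas the real patch hashes bytes with a Snappy-style
hash, and the question of whether literal bytes from the new tuple also
enter the history is simplified away here.

#include <stdio.h>
#include <string.h>

#define MIN_MATCH 4     /* shortest match worth encoding as a copy tag */

/* Longest match of new_data[0..rest) inside the history old_data[0..hist_len). */
static int
prefix_match(const char *old_data, int hist_len,
             const char *new_data, int rest)
{
    int     best = 0;

    for (int i = 0; i < hist_len; i++)
    {
        int     len = 0;

        while (len < rest && i + len < hist_len &&
               old_data[i + len] == new_data[len])
            len++;
        if (len > best)
            best = len;
    }
    return best;
}

static void
delta_scan_incremental(const char *old_data, int old_len,
                       const char *new_data, int new_len)
{
    int     pos = 0;
    int     hist_len = (old_len > 0) ? 1 : 0;   /* only the first old byte */

    while (pos < new_len)
    {
        int     len = prefix_match(old_data, hist_len,
                                   new_data + pos, new_len - pos);

        if (len >= MIN_MATCH)
        {
            printf("copy    %d bytes from old tuple\n", len);
            pos += len;
        }
        else
        {
            printf("literal '%c'\n", new_data[pos]);
            pos++;
        }
        /* Feed the next old byte into the history after each step. */
        if (hist_len < old_len)
            hist_len++;
    }
}

int
main(void)
{
    const char *old_data = "12345 abcdefgh";
    const char *new_data = "abcdefgh 56789";

    delta_scan_incremental(old_data, (int) strlen(old_data),
                           new_data, (int) strlen(new_data));
    return 0;
}

Running both sketches on this example shows the difference: with the full
history the first sketch emits an 8-byte copy tag for "abcdefgh", whereas
the incremental history emits every byte as a literal, because each old byte
enters the history one step too late to be matched.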

The updated patches and performance data are attached to this mail.
Please let me know your suggestions.

Regards,
Hari babu.


