Re: Performance Improvement by reducing WAL for Update Operation - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Performance Improvement by reducing WAL for Update Operation
Date
Msg-id CAA4eK1+H26zEHxoSB8+aURxket2avb1hrAxfJWqnve3y=ePQEw@mail.gmail.com
In response to Re: Performance Improvement by reducing WAL for Update Operation  (Claudio Freire <klaussfreire@gmail.com>)
Responses Re: Performance Improvement by reducing WAL for Update Operation  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Thu, Feb 13, 2014 at 10:07 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Thu, Feb 13, 2014 at 1:20 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> One improvement that could be made here is that, after the prefix-suffix
>> match, instead of copying byte by byte as the LZ format requires, we could
>> directly copy the entire remaining part of the tuple. I think that would
>> require a format different from LZ, which is not too difficult to do, but
>> the question is whether we really need such a change to handle the above
>> kind of worst case.
>
>
> Why use LZ at all?

We are using only the LZ *format* to represent the compressed string.
Here is some text copied from pg_lzcompress.c that explains exactly
what we are using:

"The first byte after the header tells what to do the next 8 times.
We call this the control byte.

An unset bit in the control byte means, that one uncompressed
byte follows, which is copied from input to output.
A set bit in the control byte means, that a tag of 2-3 bytes
follows. A tag contains information to copy some bytes, that
are already in the output buffer, to the current location in
the output."
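
As an illustration of the control-byte scheme quoted above, here is a
minimal decoder sketch. It is not the routine from pg_lzcompress.c: the
function name lz_format_decode is hypothetical, header handling is left
out, and the tag layout (low nibble of the first byte is length minus 3,
high nibble plus the second byte is the offset, an optional third byte
extends a length of 18) is written from my reading of that file and
should be checked against the source before relying on it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified decoder for the control-byte format.
 * Returns the number of bytes written to dst, or -1 on malformed input
 * or insufficient output space.
 */
static long
lz_format_decode(const unsigned char *src, size_t srclen,
                 unsigned char *dst, size_t dstcap)
{
    const unsigned char *sp = src;
    const unsigned char *srcend = src + srclen;
    unsigned char *dp = dst;
    unsigned char *dstend = dst + dstcap;

    assert(src != NULL && dst != NULL);

    while (sp < srcend)
    {
        /* The control byte tells what to do the next 8 times. */
        unsigned char ctrl = *sp++;
        int         bit;

        for (bit = 0; bit < 8 && sp < srcend; bit++, ctrl >>= 1)
        {
            if (ctrl & 1)
            {
                /* Set bit: a tag of 2-3 bytes follows. */
                size_t      len;
                size_t      off;

                if (srcend - sp < 2)
                    return -1;
                len = (size_t) (sp[0] & 0x0f) + 3;
                off = ((size_t) (sp[0] & 0xf0) << 4) | sp[1];
                sp += 2;
                if (len == 18)
                {
                    /* Third tag byte extends the match length. */
                    if (sp >= srcend)
                        return -1;
                    len += *sp++;
                }
                if (off == 0 || off > (size_t) (dp - dst) ||
                    len > (size_t) (dstend - dp))
                    return -1;
                /* Copy bytes already in the output; ranges may overlap. */
                while (len--)
                {
                    *dp = *(dp - off);
                    dp++;
                }
            }
            else
            {
                /* Unset bit: one uncompressed byte follows. */
                if (dp >= dstend)
                    return -1;
                *dp++ = *sp++;
            }
        }
    }
    return (long) (dp - dst);
}
```

For example, the six input bytes 0x08 'a' 'b' 'c' 0x03 0x03 decode to
"abcabcabc": three literal bytes, then one tag that copies 6 bytes
starting 3 bytes back in the output.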

> Why not *only* prefix/suffix?

To represent a prefix/suffix match, we at least need a way to record
the offset and length of the matched bytes, and the length of the
unmatched bytes we have copied.
I agree that a simpler format could be devised if we only want to
do prefix-suffix matching, but that would require much more testing
during recovery to ensure everything is fine. The advantage of the LZ
format is that we don't need to worry about decoding; it will work
without much change to the LZ decode routine.
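
To make the trade-off concrete, here is a rough sketch of what a
prefix/suffix-only representation might look like. Everything in it is
hypothetical (the PrefixSuffixDelta struct and the ps_delta_encode and
ps_delta_decode functions are illustrative names, not part of the patch
or of PostgreSQL); it only shows the three pieces of information
mentioned above: the matched prefix length, the matched suffix length,
and the unmatched bytes in between.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical delta record; not a format used by the patch. */
typedef struct PrefixSuffixDelta
{
    size_t      prefix_len;     /* bytes shared with the start of the old tuple */
    size_t      suffix_len;     /* bytes shared with the end of the old tuple */
    size_t      middle_len;     /* unmatched bytes carried verbatim */
    const unsigned char *middle;    /* points into the new tuple */
} PrefixSuffixDelta;

/* Find the longest common prefix and suffix of the old and new tuple. */
static PrefixSuffixDelta
ps_delta_encode(const unsigned char *oldp, size_t oldlen,
                const unsigned char *newp, size_t newlen)
{
    PrefixSuffixDelta d;
    size_t      minlen = oldlen < newlen ? oldlen : newlen;
    size_t      prefix = 0;
    size_t      suffix = 0;

    while (prefix < minlen && oldp[prefix] == newp[prefix])
        prefix++;
    /* The suffix must not overlap the prefix in either tuple. */
    while (suffix < minlen - prefix &&
           oldp[oldlen - 1 - suffix] == newp[newlen - 1 - suffix])
        suffix++;

    d.prefix_len = prefix;
    d.suffix_len = suffix;
    d.middle_len = newlen - prefix - suffix;
    d.middle = newp + prefix;
    return d;
}

/* Rebuild the new tuple from the old tuple plus the delta. */
static long
ps_delta_decode(const PrefixSuffixDelta *d,
                const unsigned char *oldp, size_t oldlen,
                unsigned char *dst, size_t dstcap)
{
    size_t      total = d->prefix_len + d->middle_len + d->suffix_len;

    if (d->prefix_len + d->suffix_len > oldlen || total > dstcap)
        return -1;
    memcpy(dst, oldp, d->prefix_len);
    memcpy(dst + d->prefix_len, d->middle, d->middle_len);
    memcpy(dst + d->prefix_len + d->middle_len,
           oldp + oldlen - d->suffix_len, d->suffix_len);
    return (long) total;
}
```

A format like this needs its own decode path, and that path would have
to be validated during recovery, whereas output in the LZ format can be
handed to a decode routine that already exists.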

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


