> On Thursday, September 27, 2012 4:12 PM Heikki Linnakangas wrote:
> On 25.09.2012 18:27, Amit Kapila wrote:
> > If you feel it is must to do the comparison, we can do it in same way
> > as we identify for HOT?
>
> Yeah. (But as discussed, I think it would be even better to just treat
> the old and new tuple as an opaque chunk of bytes, and run them through
> a generic delta algorithm).
>
Thank you for the modified patch.
> The conclusion is that there isn't very much difference among the
> patches. They all squeeze the WAL to about the same size, and the
> increase in TPS is roughly the same.
>
> I think more performance testing is required. The modified pgbench test
> isn't necessarily very representative of a real-life application. The
> gain (or loss) of this patch is going to depend a lot on how many
> columns are updated, and in what ways. Need to test more scenarios,
> with many different database schemas.
>
> The LZ approach has the advantage that it can take advantage of all
> kinds of similarities between old and new tuple. For example, if you
> swap the values of two columns, LZ will encode that efficiently. Or if
> you insert a character in the middle of a long string. On the flipside,
> it's probably more expensive. Then again, you have to do a memcmp() to
> detect which columns have changed with your approach, and that's not
> free either. That was not yet included in the patch version I tested.
> Another consideration is that when you compress the record more, you
> have less data to calculate CRC for. CRC calculation tends to be quite
> expensive, so even quite aggressive compression might be a win. Yet
> another consideration is that the compression/encoding is done while
> holding a lock on the buffer. For the sake of concurrency, you want to
> keep the duration the lock is held as short as possible.
Now I shall do the various tests for following and post it here:
a. Attached Patch in the mode where it takes advantage of history tuple
b. By changing the logic for modified column calculation to use calculation
for memcmp()
With Regards,
Amit Kapila.