Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Msg-id 50642D6A.5080608@vmware.com
In response to Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation  (Amit Kapila <amit.kapila@huawei.com>)
Responses Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation
List pgsql-hackers

On 25.09.2012 18:27, Amit Kapila wrote:
> If you feel it is must to do the comparison, we can do it in same way as we
> identify for HOT?

Yeah. (But as discussed, I think it would be even better to just treat
the old and new tuples as opaque chunks of bytes, and run them through
a generic delta algorithm.)

>    Can you please explain me why you think that after doing encoding doing LZ
> compression on it is better, as already we have reduced the amount of WAL
> for update by only storing changed column information?
>
> a. is it to further reduce the size of WAL
> b. storing diff WAL in some standard format
> c. or does it give any other kind of benefit

Potentially all of those. I don't know if it'd be better or worse, but
my gut feeling is that it would be simpler, and produce even more
compact WAL.

Attached is a simple patch to apply LZ compression to update WAL
records. I modified the LZ compressor so that it can optionally use
separate "history" data, and the same history data must then be passed
to the decompression function. That makes it work as a pretty efficient
delta encoder when you use the old tuple as the history data.
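
To illustrate the idea of compressing against a separate history, here is a
minimal sketch in Python. It is not the patch and does not use pglz: it
relies on zlib's preset-dictionary feature (the `zdict` argument), which
plays a similar role. The function names and the sample tuple bytes are made
up for illustration.

```python
import zlib


def delta_compress(new: bytes, history: bytes) -> bytes:
    """Compress `new`, allowing back-references into `history`.

    zlib's preset dictionary stands in for the "history" data here;
    this is an analogy to the modified pglz, not its implementation.
    """
    compressor = zlib.compressobj(level=6, zdict=history)
    return compressor.compress(new) + compressor.flush()


def delta_decompress(delta: bytes, history: bytes) -> bytes:
    """Reverse delta_compress(); the same `history` must be supplied."""
    decompressor = zlib.decompressobj(zdict=history)
    return decompressor.decompress(delta) + decompressor.flush()


# Hypothetical old and new tuple images: only one field differs.
old_tuple = b"42|alice@example.com|Helsinki|balance=1000|" + bytes(range(64))
new_tuple = old_tuple.replace(b"balance=1000", b"balance=1250")

delta = delta_compress(new_tuple, old_tuple)   # old tuple used as history
plain = zlib.compress(new_tuple, 6)            # no history at all

print("new tuple bytes:      ", len(new_tuple))
print("compressed, no history:", len(plain))
print("compressed, old tuple as history:", len(delta))
```

Decompressing with a different history than the one used for compression
fails or yields garbage, which mirrors the requirement that WAL replay must
have the old tuple available.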

I ran some performance tests with the modified version of pgbench that
you posted earlier:

Current PostgreSQL master
-------------------------

tps = 941.601924 (excluding connections establishing)
  pg_xlog_location_diff
-----------------------
              721227944

pglz_wal_update_records.patch
-----------------------------

tps = 1039.792527 (excluding connections establishing)
  pg_xlog_location_diff
-----------------------
              419395208

pglz_wal_update_records.patch, COMPRESS_ONLY
--------------------------------------------

tps = 1009.682002 (excluding connections establishing)
  pg_xlog_location_diff
-----------------------
              422505104


Amit's wal_update_changes_hot_update.patch
------------------------------------------

tps = 1092.703883 (excluding connections establishing)
  pg_xlog_location_diff
-----------------------
              436031544


The COMPRESS_ONLY result is with the attached patch, but it just uses LZ
to compress the new tuple, without taking advantage of the old tuple.
The pg_xlog_location_diff value is the amount of WAL, in bytes, generated
during the pgbench run. Also attached is the shell script I used to run
these tests.

The conclusion is that there isn't very much difference among the
patches. They all squeeze the WAL to about the same size, roughly 58-60%
of what master generates, and the increase in TPS is roughly the same.
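
For reference, the relative changes can be derived from the figures above
with a few lines of Python. The numbers are copied from the results; the
dictionary layout and function name are only for this illustration.

```python
# (tps, WAL bytes from pg_xlog_location_diff) as reported above.
RESULTS = {
    "master": (941.601924, 721227944),
    "pglz_wal_update_records.patch": (1039.792527, 419395208),
    "pglz_wal_update_records.patch, COMPRESS_ONLY": (1009.682002, 422505104),
    "wal_update_changes_hot_update.patch": (1092.703883, 436031544),
}


def relative_to_baseline(results, baseline="master"):
    """Return {name: (TPS gain in %, WAL reduction in %)} versus the baseline."""
    base_tps, base_wal = results[baseline]
    return {
        name: (100.0 * (tps / base_tps - 1.0), 100.0 * (1.0 - wal / base_wal))
        for name, (tps, wal) in results.items()
    }


summary = relative_to_baseline(RESULTS)
for name, (tps_gain, wal_cut) in summary.items():
    print(f"{name}: TPS {tps_gain:+.1f}%, WAL {wal_cut:.1f}% smaller")
```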

I think more performance testing is required. The modified pgbench test
isn't necessarily very representative of a real-life application. The
gain (or loss) of this patch is going to depend a lot on how many
columns are updated, and in what ways. We need to test more scenarios,
with many different database schemas.

The LZ approach has the advantage that it can exploit all kinds of
similarities between the old and new tuple. For example, if you swap
the values of two columns, LZ will encode that efficiently. The same
goes if you insert a character in the middle of a long string. On the
flip side, it's probably more expensive. Then again, with your approach
you have to do a memcmp() to detect which columns have changed, and
that's not free either. That comparison was not yet included in the
version of your patch that I tested. Another consideration is that when
you compress the record more, you have less data to calculate the CRC
for. CRC calculation tends to be quite expensive, so even quite
aggressive compression might be a win. Yet another consideration is
that the compression/encoding is done while holding a lock on the
buffer. For the sake of concurrency, you want to keep the duration the
lock is held as short as possible.
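
The two examples in this paragraph (swapped columns, a character inserted
into a long string) can be tried out with a rough Python sketch. It uses
zlib with a preset dictionary as a stand-in for history-based LZ, and it
models the column-level approach simply as "store every changed column in
full", ignoring any per-column header overhead. Both are simplifying
assumptions, not a description of either patch, and the column contents are
invented.

```python
import zlib


def lz_delta_size(old: bytes, new: bytes) -> int:
    """Size of `new` compressed with `old` as zlib preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=old)
    return len(compressor.compress(new) + compressor.flush())


def column_diff_size(old_cols, new_cols) -> int:
    """Payload if every changed column is stored in full.

    The `!=` comparison plays the role of the per-column memcmp().
    """
    return sum(len(n) for o, n in zip(old_cols, new_cols) if o != n)


# Hypothetical three-column tuple.
col_a = bytes(range(0, 60))
col_b = bytes(range(100, 160))
long_text = bytes(range(256))
old_cols = [col_a, col_b, long_text]
old = b"".join(old_cols)

# Case 1: swap the values of the first two columns.
swapped_cols = [col_b, col_a, long_text]
swap_lz = lz_delta_size(old, b"".join(swapped_cols))
swap_cols = column_diff_size(old_cols, swapped_cols)

# Case 2: insert one character in the middle of the long string.
inserted_cols = [col_a, col_b, long_text[:128] + b"!" + long_text[128:]]
ins_lz = lz_delta_size(old, b"".join(inserted_cols))
ins_cols = column_diff_size(old_cols, inserted_cols)

print("swap:   LZ delta", swap_lz, "bytes vs column diff", swap_cols, "bytes")
print("insert: LZ delta", ins_lz, "bytes vs column diff", ins_cols, "bytes")
```

This only compares output sizes. It says nothing about CPU cost or about how
long the buffer lock is held, which are the open questions raised above.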

- Heikki
