Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [REVIEW] Re: Compression of full-page-writes
Date
Msg-id CAB7nPqSPFiDpC65czRmzKgRbzRRpAFjYvKEiZ1t4zyC8cbmOnQ@mail.gmail.com
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-hackers
On Sat, Dec 13, 2014 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Dec 12, 2014 at 10:04 AM, Andres Freund <andres@anarazel.de> wrote:
>>> Note that autovacuum and fsync are off.
>>> =# select phase, user_diff, system_diff,
>>> pg_size_pretty(pre_update - pre_insert),
>>> pg_size_pretty(post_update - pre_update) from results;
>>>        phase        | user_diff | system_diff | pg_size_pretty | pg_size_pretty
>>> --------------------+-----------+-------------+----------------+----------------
>>>  Compression FPW    | 42.990799 |    0.868179 | 429 MB         | 567 MB
>>>  No compression     | 25.688731 |    1.236551 | 429 MB         | 727 MB
>>>  Compression record | 56.376750 |    0.769603 | 429 MB         | 566 MB
>>> (3 rows)
>>> If we do record-level compression, we'll need to be very careful in
>>> defining a lower-bound to not eat unnecessary CPU resources, perhaps
>>> something that should be controlled with a GUC. I presume that this stands
>>> true as well for the upper bound.
>>
>> Record level compression pretty obviously would need a lower boundary
>> for when to use compression. It won't be useful for small heapam/btree
>> records, but it'll be rather useful for large multi_insert, clean or
>> similar records...
>
> Unless I'm missing something, this test is showing that FPW
> compression saves 298MB of WAL for 17.3 seconds of CPU time, as
> against master.  And compressing the whole record saves a further 1MB
> of WAL for a further 13.39 seconds of CPU time.  That makes
> compressing the whole record sound like a pretty terrible idea - even
> if you get more benefit by reducing the lower boundary, you're still
> burning a ton of extra CPU time for almost no gain on the larger
> records.  Ouch!
>
> (Of course, I'm assuming that Michael's patch is reasonably efficient,
> which might not be true.)
Note that I was curious about the worst case, i.e. how much CPU
pg_lzcompress would use if everything gets compressed, even the
smallest records. So we'll surely need a lower bound. I think that
running some tests with a lower bound set as a multiple of
SizeOfXLogRecord would be fine, but in that case I expect the results
to be close to what FPW compression already gives.
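
Just to sketch what I have in mind (untested; it assumes the
refactored pglz_compress() interface that writes into a plain char
buffer and returns the compressed length or -1, and the header
locations of the current development tree; the 4 * SizeOfXLogRecord
threshold is only a placeholder for what would probably become a GUC):

#include "postgres.h"
#include "access/xlogrecord.h"      /* SizeOfXLogRecord */
#include "common/pg_lzcompress.h"   /* pglz_compress, PGLZ_strategy_default */

/*
 * Sketch only: compress a record's payload if it is larger than a
 * lower bound expressed as a multiple of SizeOfXLogRecord.  The
 * multiplier (4) is arbitrary.  The caller must provide a destination
 * buffer of at least PGLZ_MAX_OUTPUT(orig_len) bytes.
 */
static bool
CompressXLogRecordPayload(const char *source, uint32 orig_len,
                          char *dest, uint32 *comp_len)
{
    int32       len;

    /* Below the lower bound, compression is not worth the CPU cycles. */
    if (orig_len < 4 * SizeOfXLogRecord)
        return false;

    /* pglz_compress returns -1 when the data does not compress enough. */
    len = pglz_compress(source, (int32) orig_len, dest,
                        PGLZ_strategy_default);
    if (len < 0)
        return false;

    *comp_len = (uint32) len;
    return true;
}

The caller would then fall back to writing the record uncompressed
whenever this returns false, so records below the threshold or with
incompressible data pay only the cost of the length check.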
-- 
Michael


