Re: Compression of full-page-writes - Mailing list pgsql-hackers

From KONDO Mitsumasa
Subject Re: Compression of full-page-writes
Date
Msg-id 52650BBA.2050403@lab.ntt.co.jp
In response to Re: Compression of full-page-writes  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Compression of full-page-writes
List pgsql-hackers
(2013/10/19 14:58), Amit Kapila wrote:
> On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa
> <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>
> I think in general also snappy is mostly preferred for it's low CPU
> usage not for compression, but overall my vote is also for snappy.
 
I think low CPU usage is the most important factor in WAL compression.
WAL writes are sequential, so a small improvement in compression ratio will not
change PostgreSQL's performance much, and even less so on a RAID card with a
write-back cache. Furthermore, PostgreSQL runs each backend as a single process,
so a compression algorithm with high CPU usage will lower performance.
>> I found compression algorithm test in HBase. I don't read detail, but it
>> indicates snappy algorithm gets best performance.
>>
>> http://blog.erdemagaoglu.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of
>
> The dataset used for performance is quite different from the data
> which we are talking about here (WAL).
> "These are the scores for a data which consist of 700kB rows, each
> containing a binary image data. They probably won’t apply to things
> like numeric or text data."
 
Yes, you are right. We need to test the compression algorithms on WAL writes.
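
As a rough illustration of the kind of test meant here, the following is a minimal sketch that measures compression ratio against CPU time on one page-sized buffer. Everything in it is an assumption for illustration: zlib at different levels stands in for the candidate algorithms (snappy is not in the Python standard library), and make_sample_page() builds a synthetic 8 kB buffer, not a real full-page image taken from WAL.

```python
import time
import zlib


def make_sample_page(size=8192):
    """Build a synthetic 8 kB buffer: repetitive row-like text, zero padded.

    This is a stand-in for illustration, not a real PostgreSQL page image.
    """
    row = b"id=%08d|name=user|balance=0000100.00|"
    body = b"".join(row % i for i in range(150))
    return body[:size].ljust(size, b"\x00")


def measure(page, level, rounds=200):
    """Return (compression ratio, CPU seconds) for one zlib level."""
    start = time.process_time()
    for _ in range(rounds):
        compressed = zlib.compress(page, level)
    elapsed = time.process_time() - start
    return len(page) / len(compressed), elapsed


if __name__ == "__main__":
    page = make_sample_page()
    for level in (1, 6, 9):
        ratio, cpu = measure(page, level)
        print(f"zlib level {level}: ratio {ratio:.2f}, cpu {cpu:.4f}s")
```

A meaningful comparison would of course have to use real full-page images and the actual candidate libraries; the sketch only shows the two numbers (ratio and CPU time) that such a test would need to report side by side.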
>> I think it is necessary to make best efforts in community than I do the best
>> choice with strict test.
>
> Sure, it is good to make effort to select the best algorithm, but if
> you are combining this patch with inclusion of new compression
> algorithm in PG, it can only make the patch to take much longer time.
 
I think that once our direction is clearly decided, writing the patch will be
easy. If the compression patch's direction remains unclear, it will become a
troublesome patch, like the sync-rep patch.
> In general, my thinking is that we should prefer compression to reduce
> IO (WAL volume), because reducing WAL volume has other benefits as
> well like sending it to subscriber nodes. I think it will help cases
> where due to less n/w bandwidth, the disk allocated for WAL becomes
> full due to high traffic on master and then users need some
> alternative methods to handle such situations.
 
Are you talking about archiving WAL files? Their volume can easily be reduced
by adding a compression command alongside the copy command in archive_command.
> I think many users would like to use a method which can reduce WAL
> volume and the users which don't find it enough useful in their
> environments due to decrease in TPS or not significant reduction in
> WAL have the option to disable it.

I favor choosing the compression algorithm for higher performance. If we need
to compress WAL files further, despite the lower performance, we can change the
archive copy command to use a high-compression algorithm and add documentation
on how to compress archived WAL files in archive_command. Is that wrong? In
practice, many NoSQL systems use snappy for the sake of higher performance.
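
To make the archive_command idea concrete, here is a minimal sketch of a helper that compresses each WAL segment at archive time. The script itself, its name, and the directory are hypothetical; only the %p placeholder (path of the segment to archive) is PostgreSQL's own. It assumes the usual archive_command contract: return zero only on success, and never overwrite an existing archive file.

```python
import gzip
import os
import shutil
import sys


def archive_segment(src_path, dest_dir, level=9):
    """Compress one WAL segment into dest_dir, refusing to overwrite."""
    dest_path = os.path.join(dest_dir, os.path.basename(src_path) + ".gz")
    if os.path.exists(dest_path):
        raise FileExistsError(dest_path)
    tmp_path = dest_path + ".tmp"
    with open(src_path, "rb") as src, \
            gzip.open(tmp_path, "wb", compresslevel=level) as dst:
        shutil.copyfileobj(src, dst)
    os.replace(tmp_path, dest_path)  # publish only a fully written file
    return dest_path


def main(argv):
    """Return 0 on success and 1 on any failure, as archive_command expects."""
    try:
        archive_segment(argv[1], argv[2])
    except Exception as exc:
        print(f"archive failed: {exc}", file=sys.stderr)
        return 1
    return 0


# archive_command always passes both arguments; with none, nothing runs.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

With this saved as, say, compress_wal.py, the setting might look like archive_command = 'python3 /path/to/compress_wal.py %p /mnt/server/archivedir'. The matching restore_command would then have to decompress the file again. The point of the sketch is only that the heavy compression runs in the archiver, outside the WAL write path.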

Regards,
-- 
Mitsumasa KONDO
NTT Open Source Software Center


