Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Rahila Syed
Subject Re: Compression of full-page-writes
Date
Msg-id CAH2L28uKFngZj7hVWF_x_yq7r_3OSXa=VCAhK+V0abs1urvfUg@mail.gmail.com
In response to Re: Compression of full-page-writes  (Rahila Syed <rahilasyed.90@gmail.com>)
Responses Re: Compression of full-page-writes  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
Hello,

To make it easier to change compression algorithms, and to allow recovery from WAL records compressed with different algorithms, information about the compression algorithm can be stored in the WAL record itself.

The XLOG record header has 2 to 4 padding bytes that align the WAL record. This space can hold a new flag recording the compression algorithm used. Like the xl_info field of the XLogRecord struct, an 8-bit flag can be constructed, with the lower 4 bits indicating which of backup blocks 0, 1, 2, 3 is compressed, and the higher 4 bits indicating the state of compression, i.e. off, lz4, snappy, or pglz.

The flag can be extended to incorporate more compression algorithms if any are added in the future.

What is your opinion on this?

Thank you,
Rahila Syed

On Tue, May 27, 2014 at 9:27 AM, Rahila Syed <rahilasyed.90@gmail.com> wrote:
> Hello All,
>
> 0001-CompressBackupBlock_snappy_lz4_pglz extends the patch on compression of
> full page writes to include LZ4 and Snappy. Changes include converting the
> "compress_backup_block" GUC from boolean to enum. The GUC can be set to
> OFF, pglz, snappy, or lz4, which either turns compression off or selects
> the desired compression algorithm.
>
> 0002-Support_snappy_lz4 adds support for LZ4 and Snappy in PostgreSQL. It
> uses Andres’s patch for getting the Makefiles working and has a few wrappers
> that make the calls to the LZ4 and Snappy compression functions and
> handle varlena datatypes.
> Patch Courtesy: Pavan Deolasee
>
> These patches serve as a way to test various compression algorithms. They
> are still WIP. They don’t support changing compression algorithms on the standby.
> Also, the compress_backup_block GUC needs to be merged with full_page_writes.
> The patch uses the LZ4 high compression (HC) variant.
> I have conducted initial tests, which I would like to share, and I solicit
> feedback.
>
> Tests use the JDBC runner TPC-C benchmark to measure the amount of WAL
> compression, tps, and response time in each of the scenarios, viz.
> compression = OFF, pglz, LZ4, snappy, and FPW = off.
>
> Server specifications:
> Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
> RAM: 32GB
> Disk: HDD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
> 1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm
>
> Benchmark:
> Scale: 100
> Command: java JR /home/postgres/jdbcrunner-1.2/scripts/tpcc.js -sleepTime 600,350,300,250,250
> Warmup time          : 1 sec
> Measurement time     : 900 sec
> Number of tx types   : 5
> Number of agents     : 16
> Connection pool size : 16
> Statement cache size : 40
> Auto commit          : false
> Sleep time           : 600,350,300,250,250 msec
>
> Checkpoint segments: 1024
> Checkpoint timeout: 5 mins
>
> Scenario      WAL generated (bytes)    Compression (bytes)    TPS (tx1,tx2,tx3,tx4,tx5)
> No_compress   2220787088 (~2221MB)     NULL                   13.3,13.3,1.3,1.3,1.3 tps
> Pglz          1796213760 (~1796MB)     424573328 (19.11%)     13.1,13.1,1.3,1.3,1.3 tps
> Snappy        1724171112 (~1724MB)     496615976 (22.36%)     13.2,13.2,1.3,1.3,1.3 tps
> LZ4(HC)       1658941328 (~1659MB)     561845760 (25.29%)     13.2,13.2,1.3,1.3,1.3 tps
> FPW(off)      139384320 (~139MB)       NULL                   13.3,13.3,1.3,1.3,1.3 tps
>
> As per the measurement results, WAL reduction using LZ4 is close to 25%, which
> is about 6 percentage points more than the reduction with pglz. WAL
> reduction with Snappy is close to 22%.
> The compression numbers for LZ4 and Snappy don’t seem to be very
> high compared to pglz for the given workload. This may be due to the
> incompressible nature of the TPC-C data, which contains random strings.
>
> Compression does not have a bad impact on the response time. In fact, response
> times for Snappy and LZ4 are much better than without compression, at almost ½ to
> 1/3 of the response times of no compression (FPW = on) and FPW = off.
> The response time order for each type of compression is
> Pglz > Snappy > LZ4
>
> Scenario      Response time (tx1,tx2,tx3,tx4,tx5)
> no_compress   5555,1848,4221,6791,5747 msec
> pglz          4275,2659,1828,4025,3326 msec
> Snappy        3790,2828,2186,1284,1120 msec
> LZ4(HC)       2519,2449,1158,2066,2065 msec
> FPW(off)      6234,2430,3017,5417,5885 msec
>
> LZ4 and Snappy are almost on par with each other in terms of response time,
> as the average response times of the five transaction types remain almost the same
> for both.
>
> 0001-CompressBackupBlock_snappy_lz4_pglz.patch
> <http://postgresql.1045698.n5.nabble.com/file/n5805044/0001-CompressBackupBlock_snappy_lz4_pglz.patch>
> 0002-Support_snappy_lz4.patch
> <http://postgresql.1045698.n5.nabble.com/file/n5805044/0002-Support_snappy_lz4.patch>
>
> --
> View this message in context: http://postgresql.1045698.n5.nabble.com/Compression-of-full-page-writes-tp5769039p5805044.html
> Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
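
The 8-bit flag proposed at the top of this message (lower 4 bits marking which of backup blocks 0 to 3 is compressed, upper 4 bits naming the algorithm) could be laid out roughly as in the C sketch below. This is only an illustration under stated assumptions: every identifier (FpwCompressAlgo, the FPW_* macros, the fpw_* helpers) is hypothetical and does not come from the patches or from PostgreSQL, and the numeric codes simply follow the order "off, lz4, snappy, pglz" given in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical algorithm codes for the upper 4 bits (values are assumptions). */
typedef enum
{
    FPW_COMPRESS_OFF = 0,
    FPW_COMPRESS_LZ4 = 1,
    FPW_COMPRESS_SNAPPY = 2,
    FPW_COMPRESS_PGLZ = 3
} FpwCompressAlgo;

#define FPW_BLOCK_MASK 0x0F     /* lower 4 bits: one bit per backup block 0..3 */
#define FPW_ALGO_SHIFT 4        /* upper 4 bits: algorithm code */

/* Pack an algorithm code and a bitmap of compressed blocks into one byte. */
static uint8_t
fpw_make_flag(FpwCompressAlgo algo, uint8_t block_bitmap)
{
    return (uint8_t) (((unsigned) algo << FPW_ALGO_SHIFT) |
                      (block_bitmap & FPW_BLOCK_MASK));
}

/* Extract the algorithm code from the upper 4 bits. */
static FpwCompressAlgo
fpw_flag_algo(uint8_t flag)
{
    return (FpwCompressAlgo) (flag >> FPW_ALGO_SHIFT);
}

/* Return 1 if the given backup block (0..3) is marked as compressed. */
static int
fpw_block_is_compressed(uint8_t flag, int block_no)
{
    assert(block_no >= 0 && block_no < 4);
    return (flag >> block_no) & 1;
}
```

With these assumed codes, fpw_make_flag(FPW_COMPRESS_LZ4, 0x05) marks backup blocks 0 and 2 as LZ4-compressed and yields 0x15. Four bits for the algorithm leave room for up to 16 codes, which is how such a flag could be extended to algorithms added later.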

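
The "Compression" column in the quoted results appears to be the difference between the No_compress WAL volume and each scenario's WAL volume, with the percentage taken relative to No_compress. The C sketch below reproduces that arithmetic; the function names are hypothetical, and truncating (rather than rounding) to two decimal places is an assumption, chosen because it matches the reported 19.11%, 22.36%, and 25.29%.

```c
#include <stdint.h>

/* Bytes of WAL saved relative to the uncompressed baseline. */
static uint64_t
wal_saved_bytes(uint64_t baseline, uint64_t compressed)
{
    if (compressed >= baseline)
        return 0;               /* no saving (or WAL grew) */
    return baseline - compressed;
}

/*
 * WAL reduction in hundredths of a percent, truncated toward zero,
 * so 1911 means 19.11%. Returns 0 for an empty baseline.
 */
static uint64_t
wal_reduction_centipercent(uint64_t baseline, uint64_t compressed)
{
    if (baseline == 0)
        return 0;
    return wal_saved_bytes(baseline, compressed) * 10000 / baseline;
}
```

For the pglz row, wal_saved_bytes(2220787088, 1796213760) gives 424573328 bytes and wal_reduction_centipercent gives 1911, i.e. 19.11%. The gap between LZ4 (2529) and pglz (1911) is 618 hundredths of a percent, which is the roughly 6 percentage point difference the message describes.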