Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Rahila Syed
Subject Re: Compression of full-page-writes
Date
Msg-id CAH2L28vkYQdBQ_SOEYA9Rsrvc2YrQZsN6jwVA3zXup=Ekw8nDg@mail.gmail.com
In response to Re: Compression of full-page-writes  (Rahila Syed <rahilasyed.90@gmail.com>)
Responses [REVIEW] Re: Compression of full-page-writes  (Abhijit Menon-Sen <ams@2ndQuadrant.com>)
List pgsql-hackers
Hello,

The attached patch, CompressBackupBlock_snappy_lz4_pglz, implements compression of FPWs in WAL using pglz, LZ4, and Snappy. It serves as a means to test the performance of various compression algorithms for FPW compression.
A minor correction to the compression/decompression check has been made since the patch was last posted.

The patch named Support-for-lz4-and-snappy adds support for LZ4 and Snappy in PostgreSQL.

Below are the performance numbers for various values of the compress_backup_block GUC parameter.

Scenario                    Amount of WAL (bytes)    Compression    WAL recovery time (s)    TPS
FPW(on), compression(off)   1393681216 (~1394 MB)    NA             17                       15.8,15.8,1.6,1.6,1.6
pglz                        1192524560 (~1193 MB)    14%            17                       15.6,15.6,1.6,1.6,1.6
LZ4                         1124745880 (~1125 MB)    19.2%          16                       15.7,15.7,1.6,1.6,1.6
Snappy                      1123117704 (~1123 MB)    19.4%          17                       15.6,15.6,1.6,1.6,1.6
FPW(off)                    171287384  (~171 MB)     NA             12                       16.0,16.0,1.6,1.6,1.6



The compression ratios of LZ4 and Snappy are almost at par for the given workload. The TPC-C-style data used is largely incompressible, which explains the low compression ratios.

Turning compression on reduces TPS overall. The TPS numbers for LZ4 are slightly better than those for pglz and Snappy.

The recovery (decompression) speed of LZ4 is slightly faster than that of Snappy.

Overall, LZ4 scores over Snappy and pglz in terms of recovery (decompression) speed, TPS, and response times, while its compression ratio is on par with Snappy's.

Server specifications:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) x 2
RAM: 32 GB
Disk: 8 x 450 GB 10K SAS HDD (2.5-inch, 6 Gb/s, 10,000 rpm, hot plug)

Benchmark:
Scale : 16
Command: java JR /home/postgres/jdbcrunner-1.2/scripts/tpcc.js -sleepTime 550,250,250,200,200
Warmup time          : 1 sec
Measurement time     : 900 sec
Number of tx types   : 5
Number of agents     : 16
Connection pool size : 16
Statement cache size : 40
Auto commit          : false

Checkpoint segments: 1024
Checkpoint timeout: 5 mins


Limitations of the current patch:
1. The patch currently compresses the entire backup block, including the 'hole', unlike the normal code path, which backs up the parts before and after the hole separately. This can hurt performance when the hole is not filled with zeros, so separately compressing the parts of the block before and after the hole can be considered.
2. The patch currently relies on the 'compress_backup_block' GUC parameter to determine whether an FPW is compressed. Information about whether an FPW is compressed, and which compression algorithm was used, could instead be included in the WAL record header. This would make it possible to switch compression off or change the compression algorithm whenever desired.
3. The decompression logic should be extended to pg_xlogdump.
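For limitation 1, one possible direction (a rough sketch, not the patch's actual code; elide_hole and its parameters are hypothetical names) is to squeeze the hole out of the block before handing it to pglz/LZ4/Snappy, mirroring what the non-compressed backup-block path already does:

```c
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192            /* PostgreSQL default block size */

/*
 * Copy a backup block into 'scratch' with the hole (the unused region
 * between pd_lower and pd_upper) squeezed out, and return the number of
 * bytes to pass to the compressor.  When the hole contains non-zero
 * garbage, compressing only BLCKSZ - hole_length bytes avoids wasting
 * cycles on incompressible junk.
 */
static size_t
elide_hole(const char *page, uint16_t hole_offset, uint16_t hole_length,
           char *scratch)
{
    memcpy(scratch, page, hole_offset);
    memcpy(scratch + hole_offset,
           page + hole_offset + hole_length,
           BLCKSZ - (hole_offset + hole_length));
    return BLCKSZ - hole_length;
}
```

Decompression would then reverse the copy, re-inserting a zero-filled hole at hole_offset before the page is restored.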



On Tue, May 27, 2014 at 9:27 AM, Rahila Syed <rahilasyed.90@gmail.com> wrote:
Hello All,

0001-CompressBackupBlock_snappy_lz4_pglz extends the patch on compression of
full page writes to include LZ4 and Snappy. Changes include making the
"compress_backup_block" GUC an enum instead of a boolean. The value of the
GUC can be off, pglz, snappy, or lz4, which can be used to turn off
compression or select the desired compression algorithm.
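For illustration, the setting described above might look like this in postgresql.conf (a sketch; the parameter exists only with these patches applied):

```
# With the 0001 patch applied; values as described above.
compress_backup_block = 'lz4'    # off | pglz | snappy | lz4
full_page_writes = on
```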

0002-Support_snappy_lz4 adds support for LZ4 and Snappy in PostgreSQL. It
uses Andres’s patch for getting Makefiles working and has a few wrappers to
make the function calls to LZ4 and Snappy compression functions and handle
varlena datatypes.
Patch Courtesy: Pavan Deolasee

These patches serve as a way to test various compression algorithms. They
are still WIP: they don't support changing compression algorithms on a
standby, and the compress_backup_block GUC needs to be merged with
full_page_writes. The patch uses the LZ4 high compression (HC) variant.
I have conducted initial tests, which I would like to share and solicit
feedback on.

The tests use the JDBC Runner TPC-C benchmark to measure the amount of WAL
compression, TPS, and response time in each of the following scenarios:
compression = off, pglz, LZ4, snappy, and FPW = off.

Server specifications:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) x 2
RAM: 32 GB
Disk: 8 x 450 GB 10K SAS HDD (2.5-inch, 6 Gb/s, 10,000 rpm, hot plug)


Benchmark:
Scale : 100
Command: java JR /home/postgres/jdbcrunner-1.2/scripts/tpcc.js -sleepTime 600,350,300,250,250
Warmup time          : 1 sec
Measurement time     : 900 sec
Number of tx types   : 5
Number of agents     : 16
Connection pool size : 16
Statement cache size : 40
Auto commit          : false
Sleep time           : 600,350,300,250,250 msec

Checkpoint segments: 1024
Checkpoint timeout: 5 mins


Scenario       WAL generated (bytes)    Compression (bytes)    TPS (tx1,tx2,tx3,tx4,tx5)
No_compress    2220787088 (~2221 MB)    NULL                   13.3,13.3,1.3,1.3,1.3
pglz           1796213760 (~1796 MB)    424573328 (19.11%)     13.1,13.1,1.3,1.3,1.3
Snappy         1724171112 (~1724 MB)    496615976 (22.36%)     13.2,13.2,1.3,1.3,1.3
LZ4(HC)        1658941328 (~1659 MB)    561845760 (25.29%)     13.2,13.2,1.3,1.3,1.3
FPW(off)       139384320  (~139 MB)     NULL                   13.3,13.3,1.3,1.3,1.3

As per the measurement results, WAL reduction using LZ4 is close to 25%,
about 6 percentage points more reduction than with pglz. WAL reduction
with Snappy is close to 22%.
The compression achieved by LZ4 and Snappy does not seem much higher than
pglz for the given workload. This may be due to the largely incompressible
nature of the TPC-C data, which contains random strings.

Compression does not have a negative impact on response time. In fact,
response times for Snappy and LZ4 are much better than without compression,
at roughly 1/2 to 1/3 of the response times of both no-compression
(FPW = on) and FPW = off.
The response time ordering per compression type is pglz > Snappy > LZ4.

Scenario       Response time (tx1,tx2,tx3,tx4,tx5)
no_compress    5555,1848,4221,6791,5747 msec
pglz           4275,2659,1828,4025,3326 msec
Snappy         3790,2828,2186,1284,1120 msec
LZ4(HC)        2519,2449,1158,2066,2065 msec
FPW(off)       6234,2430,3017,5417,5885 msec

LZ4 and Snappy are almost at par with each other in terms of response time,
as the average response times of the five transaction types remain almost
the same for both.
0001-CompressBackupBlock_snappy_lz4_pglz.patch
<http://postgresql.1045698.n5.nabble.com/file/n5805044/0001-CompressBackupBlock_snappy_lz4_pglz.patch>
0002-Support_snappy_lz4.patch
<http://postgresql.1045698.n5.nabble.com/file/n5805044/0002-Support_snappy_lz4.patch>







