
From Heikki Linnakangas
Subject Re: [REVIEW] Re: Compression of full-page-writes
Date
Msg-id 54134B99.6030806@vmware.com
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes  (Abhijit Menon-Sen <ams@2ndQuadrant.com>)
Re: [REVIEW] Re: Compression of full-page-writes  (Ants Aasma <ants@cybertec.at>)
Re: [REVIEW] Re: Compression of full-page-writes  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On 09/02/2014 09:52 AM, Fujii Masao wrote:
> [RESULT]
> Throughput in the benchmark.
>
>             Multiple    Single
>   off         2162.6    2164.5
>   on           891.8     895.6
>   pglz        1037.2    1042.3
>   lz4         1084.7    1091.8
>   snappy      1058.4    1073.3

Most of the CPU overhead of writing full pages is because of CRC 
calculation. Compression helps because then you have less data to CRC.

It's worth noting that there are faster CRC implementations out there 
than what we use. The Slicing-by-4 algorithm was discussed years ago, 
but IIRC it was not deemed worth it back then because we typically 
calculate the CRC over very small chunks of data, and the benefit of 
Slicing-by-4 and many other algorithms only shows up when you work on 
larger chunks. But a full-page image is probably large enough to benefit.
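
Just to make that concrete, here's a rough sketch of the difference (an 
illustration only, written against the generic zlib polynomial; it's not 
the CRC code we actually use, and the names are made up):

#include <stdint.h>
#include <stddef.h>

static uint32_t crc_tab[4][256];

static void
init_crc_tables(void)
{
    for (int i = 0; i < 256; i++)
    {
        uint32_t c = i;

        for (int j = 0; j < 8; j++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320 : (c >> 1);
        crc_tab[0][i] = c;
    }
    /* crc_tab[k][i] is the effect of byte i seen k bytes earlier */
    for (int i = 0; i < 256; i++)
        for (int t = 1; t < 4; t++)
            crc_tab[t][i] = (crc_tab[t - 1][i] >> 8) ^
                crc_tab[0][crc_tab[t - 1][i] & 0xFF];
}

/* classic loop: one dependent table lookup per input byte */
static uint32_t
crc32_bytewise(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len--)
        crc = crc_tab[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return crc;
}

/* slicing-by-4: four bytes per iteration, four independent lookups */
static uint32_t
crc32_slice4(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len >= 4)
    {
        crc ^= (uint32_t) p[0] |
            ((uint32_t) p[1] << 8) |
            ((uint32_t) p[2] << 16) |
            ((uint32_t) p[3] << 24);
        crc = crc_tab[3][crc & 0xFF] ^
            crc_tab[2][(crc >> 8) & 0xFF] ^
            crc_tab[1][(crc >> 16) & 0xFF] ^
            crc_tab[0][(crc >> 24) & 0xFF];
        p += 4;
        len -= 4;
    }
    return crc32_bytewise(crc, p, len);     /* leftover tail bytes */
}

The byte-at-a-time loop has a serial dependency on every table lookup, 
while the sliced loop gives the CPU four independent lookups per word to 
overlap. That only starts to pay off once the input is more than a 
handful of bytes, which is why it didn't look attractive for typical 
short WAL records, but an 8k full-page image is a different story.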

What I'm trying to say is that this should be compared with the idea of 
just switching the CRC implementation. That would make the 'on' case 
faster, and the benefit of compression smaller. I wouldn't be surprised 
if it made the 'on' case faster than the compressed cases.

I don't mean that we should abandon this patch - compression makes the 
WAL smaller which has all kinds of other benefits, even if it makes the 
raw TPS throughput of the system worse. But I'm just saying that these 
TPS comparisons should be taken with a grain of salt. We probably should 
consider switching to a faster CRC algorithm again, regardless of what 
we do with compression.

- Heikki



