Re: libpq compression - Mailing list pgsql-hackers
From | Florian Pflug
---|---
Subject | Re: libpq compression
Date |
Msg-id | 73447F47-E9A3-420B-8903-9F6A4513E229@phlo.org
In response to | Re: libpq compression (Robert Haas <robertmhaas@gmail.com>)
Responses | Re: libpq compression
List | pgsql-hackers
On Jun 25, 2012, at 04:04, Robert Haas wrote:
> If, for example, someone can demonstrate that an awesomebsdlz compresses
> 10x as fast as OpenSSL... that'd be pretty compelling.

That, actually, is demonstrably the case for at least Google's snappy (and for LZO, but that's not an option since its license is GPL). The snappy documentation states that

  In our tests, Snappy usually is faster than algorithms in the same class
  (e.g. LZO, LZF, FastLZ, QuickLZ, etc.) while achieving comparable
  compression ratios.

The only widely supported compression method for SSL seems to be DEFLATE, which is also what gzip/zlib uses. I benchmarked LZO against gzip/zlib a few months ago, and LZO outperformed zlib in fast mode (i.e. gzip -1) by an order of magnitude. The compression ratio achieved by DEFLATE/gzip/zlib is much better, though. The snappy documentation states

  Typical compression ratios (based on the benchmark suite) are about
  1.5-1.7x for plain text, about 2-4x for HTML, and of course 1.0x for
  JPEGs, PNGs and other already-compressed data. Similar numbers for zlib
  in its fastest mode are 2.6-2.8x, 3-7x and 1.0x, respectively.

Here are a few numbers for LZO vs. gzip. Snappy should be comparable to LZO - I tested LZO because I still had the command-line compressor lzop lying around on my machine, whereas I'd have needed to download and compile snappy first.

$ dd if=/dev/random of=data bs=1m count=128

$ time gzip -1 < data > data.gz
real    0m6.189s
user    0m5.947s
sys     0m0.224s

$ time lzop < data > data.lzo
real    0m2.697s
user    0m0.295s
sys     0m0.224s

$ ls -lh data*
-rw-r--r--  1 fgp  staff   128M Jun 25 14:43 data
-rw-r--r--  1 fgp  staff   128M Jun 25 14:44 data.gz
-rw-r--r--  1 fgp  staff   128M Jun 25 14:44 data.lzo

$ dd if=/dev/zero of=zeros bs=1m count=128

$ time gzip -1 < zeros > zeros.gz
real    0m1.083s
user    0m1.019s
sys     0m0.052s

$ time lzop < zeros > zeros.lzo
real    0m0.186s
user    0m0.123s
sys     0m0.053s

$ ls -lh zeros*
-rw-r--r--  1 fgp  staff   128M Jun 25 14:47 zeros
-rw-r--r--  1 fgp  staff   572K Jun 25 14:47 zeros.gz
-rw-r--r--  1 fgp  staff   598K Jun 25 14:47 zeros.lzo

To summarize, on my 2.66 GHz Core 2 Duo MacBook Pro, LZO compresses at about 350 MB/s if the data is purely random, and at about 800 MB/s if the data compresses extremely well. (Numbers are based on user time, since that reflects the CPU time used and ignores the IO overhead, which is substantial.)

IMHO, the only compelling argument (and a very compelling one) for using SSL compression was that it requires very little code on our side. We've since discovered that it's not actually that simple, at least if we want to support compression without authentication or encryption, and don't want to restrict ourselves to using OpenSSL forever. So unless we give up at least one of those requirements, the arguments for using SSL compression are rather thin, I think.

best regards,
Florian Pflug
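
For reference, here is a minimal sketch of what the snappy route looks like at the C level, using the snappy-c.h bindings that ship with libsnappy. The payload string, buffer handling, and build command are made up for illustration and are not taken from any actual libpq patch.

/*
 * Minimal sketch: round-tripping a buffer through snappy's C API
 * (snappy-c.h). Payload and buffer names are illustrative only.
 *
 * Build with something like: cc sketch.c -lsnappy
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <snappy-c.h>

int main(void)
{
    const char *payload = "SELECT * FROM some_table WHERE id = 42";
    size_t      payload_len = strlen(payload);

    /* snappy reports the worst-case compressed size up front. */
    size_t  compressed_len = snappy_max_compressed_length(payload_len);
    char   *compressed = malloc(compressed_len);

    if (compressed == NULL)
        return 1;

    if (snappy_compress(payload, payload_len,
                        compressed, &compressed_len) != SNAPPY_OK)
    {
        free(compressed);
        return 1;
    }

    printf("compressed %zu bytes to %zu bytes\n",
           payload_len, compressed_len);

    /* Decompression: recover the original size, then uncompress. */
    size_t uncompressed_len;
    if (snappy_uncompressed_length(compressed, compressed_len,
                                   &uncompressed_len) == SNAPPY_OK)
    {
        char *uncompressed = malloc(uncompressed_len);

        if (uncompressed != NULL &&
            snappy_uncompress(compressed, compressed_len,
                              uncompressed, &uncompressed_len) == SNAPPY_OK)
            printf("round trip ok, %zu bytes\n", uncompressed_len);

        free(uncompressed);
    }

    free(compressed);
    return 0;
}

Note that, unlike zlib, snappy's plain C API is block-based rather than streaming, so a protocol-level integration would presumably compress message-sized chunks rather than a continuous stream.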