Re: libpq compression - Mailing list pgsql-hackers

From Robert Haas
Subject Re: libpq compression
Date
Msg-id CA+TgmoYAuMQS1BHF5v1L9Aofh77_9TdQ3bFOWMHLQ4CoY_D=7w@mail.gmail.com
In response to Re: libpq compression  ("ktm@rice.edu" <ktm@rice.edu>)
Responses Re: libpq compression
Re: libpq compression
List pgsql-hackers
On Mon, Jun 25, 2012 at 4:25 PM, ktm@rice.edu <ktm@rice.edu> wrote:
> On Mon, Jun 25, 2012 at 09:45:26PM +0200, Florian Pflug wrote:
>> On Jun25, 2012, at 21:21 , Dimitri Fontaine wrote:
>> > Magnus Hagander <magnus@hagander.net> writes:
>> >> Or that it takes less code/generates cleaner code...
>> >
>> > So we're talking about some LZO things such as snappy from google, and
>> > that would be another run time dependency IIUC.
>> >
>> > I think it's time to talk about fastlz:
>> >
>> >  http://fastlz.org/
>> >  http://code.google.com/p/fastlz/source/browse/trunk/fastlz.c
>> >
>> >  551 lines of C code under MIT license, works also under windows
>> >
>> > I guess it would be easy (and safe) enough to embed in our tree should
>> > we decide going this way.
>>
>> Agreed. If we extend the protocol to support compression, and not rely
>> on SSL, then let's pick one of these LZ77-style compressors, and let's
>> integrate it into our tree.
>>
>> We should then also make it possible to enable compression only for
>> the server -> client direction. Since those types of compressions are
>> usually pretty easy to decompress, that reduces the amount of work
>> non-libpq clients have to put in to take advantage of compression.
>
> Here is the benchmark list from the Google lz4 page:
>
> Name            Ratio   C.speed D.speed
> LZ4 (r59)       2.084   330      915
> LZO 2.05 1x_1   2.038   311      480
> QuickLZ 1.5 -1  2.233   257      277
> Snappy 1.0.5    2.024   227      729
> LZF             2.076   197      465
> FastLZ          2.030   190      420
> zlib 1.2.5 -1   2.728    39      195
> LZ4 HC (r66)    2.712    18     1020
> zlib 1.2.5 -6   3.095    14      210
>
> lz4 absolutely dominates on compression/decompression speed. While fastlz
> is faster than zlib(-1) on compression, lz4 is almost 2X faster still.

At the risk of making everyone laugh at me, has anyone tested pglz?  If
its compression ratio and performance are good, we might consider using
it for this purpose, too, which would avoid having to add dependencies.
Conversely, if they are bad, and we decide to support another
algorithm, we might consider also using that other algorithm, at least
optionally, for the purposes for which we now use pglz.
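
One way to answer that kind of question is a small harness that reports
the same three columns as the table quoted above. The following is only
a minimal sketch in Python, under stated assumptions: it exercises zlib
at levels 1 and 6 because those are the only entries from the table
available in the Python standard library, the sample_data() payload is a
made-up stand-in for query results, and the helper names are
hypothetical. pglz lives in the PostgreSQL source tree, so testing it
would need a small C wrapper plugged in as another (compress,
decompress) pair. Absolute numbers will differ from the quoted table,
since hardware and input data differ.

```python
import time
import zlib


def sample_data(rows=20000):
    # Hypothetical stand-in for query results: repetitive, text-heavy rows.
    return b"".join(
        b"%d|user_%d|2012-06-25|active\n" % (i, i % 100) for i in range(rows)
    )


def best_time(func, arg, repeats):
    # Take the fastest of several runs to reduce scheduling noise.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(arg)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very small inputs.
    return max(best, 1e-9)


def benchmark(name, compress, decompress, data, repeats=5):
    """Return (name, ratio, compress MB/s, decompress MB/s) for one codec."""
    compressed = compress(data)
    # Round-trip check: timing a codec that corrupts data is meaningless.
    assert decompress(compressed) == data

    megabytes = len(data) / 1e6
    c_speed = megabytes / best_time(compress, data, repeats)
    # Decompression speed is reported relative to the uncompressed size.
    d_speed = megabytes / best_time(decompress, compressed, repeats)
    ratio = len(data) / len(compressed)
    return name, ratio, c_speed, d_speed


if __name__ == "__main__":
    data = sample_data()
    codecs = [
        ("zlib -1", lambda d: zlib.compress(d, 1), zlib.decompress),
        ("zlib -6", lambda d: zlib.compress(d, 6), zlib.decompress),
    ]
    print(f"{'Name':<10}{'Ratio':>8}{'C.speed':>10}{'D.speed':>10}")
    for name, comp, decomp in codecs:
        n, ratio, c, d = benchmark(name, comp, decomp, data)
        print(f"{n:<10}{ratio:>8.3f}{c:>10.0f}{d:>10.0f}")
```

Running the same harness over real protocol traffic (for example,
captured result sets) rather than synthetic rows would give a fairer
picture of what a libpq user would actually see.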

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
