Re: libpq compression - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: libpq compression
Msg-id d58ebe47-5662-a50f-117d-4c1049acbfef@postgrespro.ru
In response to Re: libpq compression  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Re: libpq compression  (David Steele <david@pgmasters.net>)
List pgsql-hackers

On 15.02.2019 18:26, Tomas Vondra wrote:
>
>
> On 2/15/19 3:03 PM, Konstantin Knizhnik wrote:
>>
>> On 15.02.2019 15:42, Peter Eisentraut wrote:
>>> On 2018-06-19 09:54, Konstantin Knizhnik wrote:
>>>> The main drawback of streaming compression is that you can not
>>>> decompress some particular message without decompression of all previous
>>>> messages.
>>> It seems this would have an adverse effect on protocol-aware connection
>>> proxies: They would have to uncompress everything coming in and
>>> recompress everything going out.
>>>
>>> The alternative of compressing each packet individually would work much
>>> better: A connection proxy could peek into the packet header and only
>>> uncompress the (few, small) packets that it needs for state and routing.
>>>
>> Individual compression of each message would defeat the whole idea of
>> libpq compression. Messages are too small to compress each of them
>> efficiently on its own, so using a streaming compression algorithm is
>> absolutely necessary here.
>>
>>
> Hmmm, I see Peter was talking about "packets" while you're talking about
> "messages". Are you talking about the same thing?
Sorry, but there are no "packets" in the libpq protocol, so I assumed 
that packet = message.
In any case, a protocol-aware proxy has to process each message.
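For reference, every libpq backend message starts with a one-byte type 
code followed by a four-byte big-endian length that counts itself, so 
the peeking a proxy would do looks roughly like the sketch below (the 
names are mine, not from any actual proxy):

#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>          /* ntohl */

typedef struct
{
    char        type;           /* e.g. 'D' = DataRow, 'd' = CopyData */
    uint32_t    len;            /* length field; counts itself but not the type byte */
} PqMsgHeader;

/*
 * Peek at the 5-byte header in front of "buf" and report the total
 * size of the message.  A proxy can route or inspect a message this
 * cheaply only if the stream is not compressed as a whole.
 */
static size_t
peek_message(const char *buf, PqMsgHeader *hdr)
{
    uint32_t    netlen;

    hdr->type = buf[0];
    memcpy(&netlen, buf + 1, 4);
    hdr->len = ntohl(netlen);
    return 1 + hdr->len;        /* type byte + length (which includes itself) */
}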

>
> Anyway, I was going to write about the same thing - that per-message
> compression would likely eliminate most of the benefits - but I'm
> wondering if it's actually true. That is, how much will the compression
> ratio drop if we compress individual messages?
Compression of small messages without a shared dictionary will give 
awful results.
Assume that the average record, and therefore message, size is 100 bytes.
Just perform a very simple experiment: create a file with 100 identical 
characters and try to compress it.
With zlib the result will be 173 bytes, so after "compression" the file 
size has increased 1.7 times.
This is why there is no way to compress libpq traffic efficiently other 
than with streaming compression
(where the dictionary is shared and updated across all messages).
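Here is a minimal sketch of such an experiment with zlib, comparing 
per-message compression (a fresh dictionary for every message) against 
a single long-lived deflate stream; the message size and count are 
arbitrary assumptions:

/* Build: cc demo.c -lz */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

#define NMSG    1000
#define MSGLEN  100

int
main(void)
{
    static char msg[MSGLEN];
    Bytef       out[4096];
    uLong       permsg_total = 0;
    uLong       stream_total = 0;
    z_stream    zs;
    int         i;

    memset(msg, 'x', MSGLEN);   /* 100 identical bytes */

    /* Per-message: a fresh dictionary for every message */
    for (i = 0; i < NMSG; i++)
    {
        uLongf  outlen = sizeof(out);

        compress(out, &outlen, (const Bytef *) msg, MSGLEN);
        permsg_total += outlen;
    }

    /* Streaming: one deflate stream shared by all messages */
    memset(&zs, 0, sizeof(zs));
    deflateInit(&zs, Z_DEFAULT_COMPRESSION);
    for (i = 0; i < NMSG; i++)
    {
        zs.next_in = (Bytef *) msg;
        zs.avail_in = MSGLEN;
        zs.next_out = out;
        zs.avail_out = sizeof(out);
        /*
         * Z_SYNC_FLUSH emits this message's data immediately, as a
         * protocol implementation must, but keeps the dictionary.
         */
        deflate(&zs, Z_SYNC_FLUSH);
        stream_total += sizeof(out) - zs.avail_out;
    }
    deflateEnd(&zs);

    printf("per-message: %lu bytes, streaming: %lu bytes for %d x %d-byte messages\n",
           permsg_total, stream_total, NMSG, MSGLEN);
    return 0;
}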
>
> Obviously, if there are just tiny messages, it might easily eliminate
> any benefits (and in fact it would add overhead). But I'd say we're way
> more interested in transferring large data sets (result sets, data for
> copy, etc.) and presumably those messages are much larger. So maybe we
> could compress just those, somehow?
Please notice that a COPY stream consists of individual messages, one 
for each record.
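A quick way to see this is a sketch like the following (the table name 
"t" is an arbitrary assumption), where each successful PQgetCopyData() 
call corresponds to one CopyData ('d') message:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("");   /* connection params from environment */
    PGresult   *res;
    char       *buf;
    int         len;
    long        nmsg = 0;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    res = PQexec(conn, "COPY t TO STDOUT");
    if (PQresultStatus(res) != PGRES_COPY_OUT)
        return 1;
    PQclear(res);

    /* Each successful call returns one row, i.e. one CopyData message */
    while ((len = PQgetCopyData(conn, &buf, 0)) > 0)
    {
        nmsg++;
        PQfreemem(buf);
    }

    printf("received %ld CopyData messages\n", nmsg);
    PQfinish(conn);
    return 0;
}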

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


