Re: libpq compression - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: libpq compression
Msg-id d58ebe47-5662-a50f-117d-4c1049acbfef@postgrespro.ru
In response to Re: libpq compression  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Re: libpq compression  (David Steele <david@pgmasters.net>)
List pgsql-hackers

On 15.02.2019 18:26, Tomas Vondra wrote:
>
>
> On 2/15/19 3:03 PM, Konstantin Knizhnik wrote:
>>
>> On 15.02.2019 15:42, Peter Eisentraut wrote:
>>> On 2018-06-19 09:54, Konstantin Knizhnik wrote:
>>>> The main drawback of streaming compression is that you can not
>>>> decompress some particular message without decompression of all previous
>>>> messages.
>>> It seems this would have an adverse effect on protocol-aware connection
>>> proxies: They would have to uncompress everything coming in and
>>> recompress everything going out.
>>>
>>> The alternative of compressing each packet individually would work much
>>> better: A connection proxy could peek into the packet header and only
>>> uncompress the (few, small) packets that it needs for state and routing.
>>>
>> Individual compression of each message would defeat the whole idea of
>> libpq compression. Messages are too small to compress each of them
>> efficiently on its own, so using a streaming compression algorithm is
>> absolutely necessary here.
>>
>>
> Hmmm, I see Peter was talking about "packets" while you're talking about
> "messages". Are you talking about the same thing?
Sorry, but there are no "packets" in the libpq protocol, so I assumed 
that packet = message.
In any case, a protocol-aware proxy has to process each message.
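For reference, every libpq backend message starts with a one-byte type 
code followed by a four-byte big-endian length that counts itself, so 
the peeking a proxy would do looks roughly like the sketch below (the 
names are mine, not from any actual proxy):

#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>          /* ntohl */

typedef struct
{
    char        type;           /* e.g. 'D' = DataRow, 'd' = CopyData */
    uint32_t    len;            /* length field; counts itself but not the type byte */
} PqMsgHeader;

/*
 * Peek at the 5-byte header in front of "buf" and report the total
 * size of the message.  A proxy can route or inspect a message this
 * cheaply only if the stream is not compressed as a whole.
 */
static size_t
peek_message(const char *buf, PqMsgHeader *hdr)
{
    uint32_t    netlen;

    hdr->type = buf[0];
    memcpy(&netlen, buf + 1, 4);
    hdr->len = ntohl(netlen);
    return 1 + hdr->len;        /* type byte + length (which includes itself) */
}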

>
> Anyway, I was going to write about the same thing - that per-message
> compression would likely eliminate most of the benefits - but I'm
> wondering if it's actually true. That is, how much will the compression
> ratio drop if we compress individual messages?
Compression of small messages without a shared dictionary will give 
awful results.
Assume that the average record, and therefore message, size is 100 bytes.
Just perform a very simple experiment: create a file with 100 identical 
characters and try to compress it.
With zlib the result will be 173 bytes, so after "compression" the file 
size has increased 1.7 times.
This is why there is no way to compress libpq traffic efficiently other 
than with streaming compression
(where the dictionary is shared and updated across all messages).
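Here is a minimal sketch of such an experiment with zlib, comparing 
per-message compression (a fresh dictionary for every message) against 
a single long-lived deflate stream; the message size and count are 
arbitrary assumptions:

/* Build: cc demo.c -lz */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

#define NMSG    1000
#define MSGLEN  100

int
main(void)
{
    static char msg[MSGLEN];
    Bytef       out[4096];
    uLong       permsg_total = 0;
    uLong       stream_total = 0;
    z_stream    zs;
    int         i;

    memset(msg, 'x', MSGLEN);   /* 100 identical bytes */

    /* Per-message: a fresh dictionary for every message */
    for (i = 0; i < NMSG; i++)
    {
        uLongf  outlen = sizeof(out);

        compress(out, &outlen, (const Bytef *) msg, MSGLEN);
        permsg_total += outlen;
    }

    /* Streaming: one deflate stream shared by all messages */
    memset(&zs, 0, sizeof(zs));
    deflateInit(&zs, Z_DEFAULT_COMPRESSION);
    for (i = 0; i < NMSG; i++)
    {
        zs.next_in = (Bytef *) msg;
        zs.avail_in = MSGLEN;
        zs.next_out = out;
        zs.avail_out = sizeof(out);
        /*
         * Z_SYNC_FLUSH emits this message's data immediately, as a
         * protocol implementation must, but keeps the dictionary.
         */
        deflate(&zs, Z_SYNC_FLUSH);
        stream_total += sizeof(out) - zs.avail_out;
    }
    deflateEnd(&zs);

    printf("per-message: %lu bytes, streaming: %lu bytes for %d x %d-byte messages\n",
           permsg_total, stream_total, NMSG, MSGLEN);
    return 0;
}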
>
> Obviously, if there are just tiny messages, it might easily eliminate
> any benefits (and in fact it would add overhead). But I'd say we're way
> more interested in transferring large data sets (result sets, data for
> copy, etc.) and presumably those messages are much larger. So maybe we
> could compress just those, somehow?
Please notice that a COPY stream consists of individual messages, one 
for each record.
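A quick way to see this is a sketch like the following (the table name 
"t" is an arbitrary assumption), where each successful PQgetCopyData() 
call corresponds to one CopyData ('d') message:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("");   /* connection params from environment */
    PGresult   *res;
    char       *buf;
    int         len;
    long        nmsg = 0;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    res = PQexec(conn, "COPY t TO STDOUT");
    if (PQresultStatus(res) != PGRES_COPY_OUT)
        return 1;
    PQclear(res);

    /* Each successful call returns one row, i.e. one CopyData message */
    while ((len = PQgetCopyData(conn, &buf, 0)) > 0)
    {
        nmsg++;
        PQfreemem(buf);
    }

    printf("received %ld CopyData messages\n", nmsg);
    PQfinish(conn);
    return 0;
}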

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


