Re: libpq compression - Mailing list pgsql-hackers

From: Tomas Vondra
Subject: Re: libpq compression
Date:
Msg-id: 739641a6-f64d-0f2b-8483-34ab23e2dcd6@2ndquadrant.com
In response to: Re: libpq compression (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses: Re: libpq compression
List: pgsql-hackers
On 2/15/19 3:03 PM, Konstantin Knizhnik wrote:
> On 15.02.2019 15:42, Peter Eisentraut wrote:
>> On 2018-06-19 09:54, Konstantin Knizhnik wrote:
>>> The main drawback of streaming compression is that you cannot
>>> decompress some particular message without decompressing all
>>> previous messages.
>>
>> It seems this would have an adverse effect on protocol-aware
>> connection proxies: they would have to uncompress everything coming
>> in and recompress everything going out.
>>
>> The alternative of compressing each packet individually would work
>> much better: a connection proxy could peek into the packet header and
>> only uncompress the (few, small) packets that it needs for state and
>> routing.
>
> Individual compression of each message would defeat the whole idea of
> libpq compression. Messages are too small to compress efficiently one
> by one, so using a streaming compression algorithm is absolutely
> necessary here.

Hmmm, I see Peter was talking about "packets" while you're talking about
"messages". Are you talking about the same thing?

Anyway, I was going to write about the same thing - that per-message
compression would likely eliminate most of the benefits - but I'm
wondering whether that's actually true. That is, how much would the
compression ratio drop if we compressed individual messages?

Obviously, for tiny messages it might easily eliminate any benefit (and
in fact add overhead). But we're far more interested in transferring
large data sets (result sets, data for COPY, etc.), and presumably those
messages are much larger. So maybe we could compress just those, somehow?

> Concerning the possible problem with proxies, I do not think it is
> really a problem. A proxy is very rarely located somewhere in the
> "middle" between the client and the database server. It is usually
> launched either in the same network as the DBMS client (for example,
> if the client is an application server) or in the same network as the
> database server.
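The question above - how much does the ratio drop with per-message
compression - is easy to probe with a rough experiment. The sketch below
uses plain Python zlib as a stand-in for whatever codec libpq would use,
and an invented batch of small, similar messages (as a large result set
might produce): each message is compressed once on its own, and once
through a shared streaming compressor flushed at message boundaries.

```python
import zlib

# Invented stand-in for small protocol messages: many short, similar
# payloads, roughly like the rows of a result set.
messages = [f"row {i}: some repetitive payload".encode() for i in range(1000)]
raw_total = sum(len(m) for m in messages)

# Per-message compression: each message gets a fresh zlib stream, so the
# dictionary is rebuilt (and stream headers re-emitted) every time.
per_message = sum(len(zlib.compress(m)) for m in messages)

# Streaming compression: one compressor shared across all messages,
# sync-flushed at each boundary so every chunk is decodable in order.
comp = zlib.compressobj()
streaming = sum(
    len(comp.compress(m)) + len(comp.flush(zlib.Z_SYNC_FLUSH))
    for m in messages
)

ratio_per_message = per_message / raw_total
ratio_streaming = streaming / raw_total
```

With payloads this small the per-message variant barely shrinks the data
(or inflates it), while the shared stream keeps its back-reference window
across messages - which is the effect being argued about here.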
> In both cases there is not much sense in passing compressed traffic
> through the proxy. If the proxy and the DBMS server are located in the
> same network, then the proxy should perform decompression and send
> decompressed messages to the database server.

I don't think that's entirely true. It makes perfect sense to pass
compressed traffic in various situations - even in a local network the
network bandwidth matters quite a lot these days, because "local
network" may mean "same availability zone" or "same DC" etc.

That being said, I'm not sure it's a big deal / issue when the proxy has
to deal with compression. Either it's fine to forward decompressed data,
in which case the proxy performs just decompression, which requires much
less CPU. (It probably needs to compress data in the opposite direction,
but the traffic is usually quite asymmetric - much more data is sent in
one direction.) Or the data has to be recompressed, because that saves
enough network bandwidth. It's essentially a trade-off between CPU and
network bandwidth.

IMHO it would be nonsense to adopt per-message compression based merely
on the fact that it might be easier to handle in proxies. We need to
know whether we can get a reasonable compression ratio with that
approach, because if not, then it's useless that it's more
proxy-friendly.

Do the proxies actually need to recompress the data? Can't they just
decompress it to determine which messages are in the stream, and then
forward the original compressed stream? That would be much cheaper,
because decompression requires much less CPU. Although some proxies
(like connection pools) probably have to compress the connections
independently ...

> Thank you very much for noticing this compatibility problem between
> compression and protocol-aware connection proxies. I have written that
> the current compression implementation (zpq_stream.c) can be used not
> only for the libpq backend/frontend, but also for compressing any
> other streaming data.
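The decompress-to-peek-then-forward idea a few paragraphs above can be
sketched roughly as follows. This is illustrative only: zlib stands in
for the real codec, and the framing (a 1-byte type plus a 4-byte
big-endian payload length) is loosely modeled on libpq's message header
but simplified - the length here excludes the header itself. The proxy
decompresses solely to observe message boundaries and types, while the
bytes it forwards are the original compressed chunks, untouched.

```python
import zlib

def forward_with_peek(compressed_chunks, handle_message):
    """Hypothetical proxy loop: forward compressed bytes unchanged,
    decompressing only to track the message stream for state/routing."""
    decomp = zlib.decompressobj()
    buf = b""
    for chunk in compressed_chunks:
        yield chunk                       # forward the compressed bytes as-is
        buf += decomp.decompress(chunk)   # cheap: decompression only
        while len(buf) >= 5:              # simplified header: type + length
            mtype = buf[0:1]
            mlen = int.from_bytes(buf[1:5], "big")
            if len(buf) < 5 + mlen:
                break                     # wait for the rest of the message
            handle_message(mtype, buf[5:5 + mlen])
            buf = buf[5 + mlen:]

# Demo: two framed messages, compressed as one stream with a sync flush.
comp = zlib.compressobj()
msgs = [(b"D", b"data row"), (b"C", b"SELECT 1")]
raw = b"".join(t + len(p).to_bytes(4, "big") + p for t, p in msgs)
chunks = [comp.compress(raw) + comp.flush(zlib.Z_SYNC_FLUSH)]

seen = []
forwarded = b"".join(forward_with_peek(chunks, lambda t, p: seen.append((t, p))))
```

Note this only works for pass-through forwarding; a proxy that fans
messages out to multiple per-connection streams (a pool) would still
have to recompress each outgoing stream independently, as noted above.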
> But I could not imagine what other data sources might require
> compression. And a proxy is exactly such a case: it also needs to
> compress/decompress messages. It is one more argument for making the
> interface of zpq_stream as simple as possible and encapsulating all
> the inflating/deflating logic in this code. That can be achieved by
> passing arbitrary rx/tx functions to the zpq_create function.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
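As a footnote to the thread: the "arbitrary rx/tx functions passed to
zpq_create" idea quoted above might look roughly like this. The actual
zpq_stream is C; this Python sketch only illustrates the shape of the
interface, with zlib standing in for the codec and every name other than
rx/tx invented for the example.

```python
import zlib

class ZpqStream:
    """Hypothetical sketch of a zpq_stream-like wrapper: all inflate/
    deflate logic lives inside, and the caller supplies arbitrary
    rx/tx functions (socket, proxy buffer, file, ...)."""

    def __init__(self, tx, rx):
        self._tx = tx                     # tx(bytes) -> None
        self._rx = rx                     # rx() -> bytes (b"" on EOF)
        self._comp = zlib.compressobj()
        self._decomp = zlib.decompressobj()

    def write(self, data: bytes) -> None:
        # Compress and push through the caller's transmit function,
        # sync-flushing at the boundary so the peer can decode promptly.
        self._tx(self._comp.compress(data) +
                 self._comp.flush(zlib.Z_SYNC_FLUSH))

    def read(self) -> bytes:
        # Pull compressed bytes via the caller's receive function and
        # inflate them; buffering/looping is left out for brevity.
        return self._decomp.decompress(self._rx())

# Demo transport: an in-memory list standing in for a socket.
buf = []
sender = ZpqStream(tx=buf.append, rx=lambda: b"")
sender.write(b"hello world")
receiver = ZpqStream(tx=lambda data: None, rx=lambda: buf.pop(0))
roundtrip = receiver.read()
```

Because the stream object never touches a socket directly, the same code
path serves a libpq connection, a proxy, or any other byte transport -
which is the encapsulation argument made in the quoted paragraph.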