Re: libpq compression - Mailing list pgsql-hackers

From Daniil Zakhlystov
Subject Re: libpq compression
Date
Msg-id 6811D196-E2FB-40CC-B1C9-F19427FBF675@yandex-team.ru
In response to Re: libpq compression  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: libpq compression  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

> On Nov 24, 2020, at 11:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> So the time to talk about the
> general approach here is now, before anything gets committed, before
> the project has committed itself to any particular design. If we
> decide in that discussion that certain things can be left for the
> future, that's fine. If we've discussed how they could be added
> without breaking backward compatibility, even better. But we can't
> just skip over having that discussion.

> If the client requests compression and the server supports it, it
> should return a new SupportedCompressionTypes message following
> NegotiateProtocolMessage response. That should be a list of
> compression methods which the server understands. At this point, the
> client and the server each know what methods the other understands.
> Each should now feel free to select a compression method the other
> side understands, and to switch methods whenever desired, as long as
> they only select from methods the other side has said that they
> understand. The patch seems to think that the compression method has
> to be the same in both directions and that it can never change, but
> there's no real reason for that. Let each side start out uncompressed
> and then let it issue a new SetCompressionMethod protocol message to
> switch the compression method whenever it wants. After sending that
> message it begins using the new compression type. The other side
> doesn't have to agree. That way, you don't have to worry about
> synchronizing the two directions. Each side is just telling the other
> what it is choosing to do, from among the options the other side said it
> could understand.

I’ve read your suggestions about compression that is switchable on the fly and independent for each direction.

While the proposed protocol seems straightforward, the ability to switch the compression mode at an arbitrary moment
significantly complicates the implementation, which may lead to lower adoption of a really useful feature in
custom frontends/backends.

However, I don’t mean that we shouldn’t support switchable compression. I propose that we offer two
compression modes: permanent (which is what the current state of the patch implements) and switchable on the fly.
Permanent compression allows us to deliver a robust solution that is already present in some databases. Switchable
compression allows us to support more complex scenarios in cases where the frontend and backend really need it and can
afford the development effort to implement it.

I’ve made a draft of a protocol that may cover both of these compression modes. The protocol also supports
independent frontend and backend compression.

In the StartupPacket, the frontend will specify the following in the _pq_.compression parameter:

1. Supported compression modes in the order of preference.
For example: “permanent, switchable” means that the frontend supports both permanent and switchable modes and prefers to
use the permanent mode.

2. List of the compression algorithms which the frontend is able to decompress in the order of preference.
For example:
“zlib:1,3,5;zstd:7,8;uncompressed” means that the frontend is able to:
 - decompress zlib with compression levels 1, 3, or 5
 - decompress zstd with compression levels 7 or 8
 - “uncompressed” at the end means that the frontend agrees to receive uncompressed messages. If the
“uncompressed” algorithm is not specified, it means that compression is required.
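
As an illustration of the list format above, here is a minimal Python sketch of how a receiver might parse it. The
function names parse_algorithms and compression_required are hypothetical and not part of the patch; the sketch only
assumes the “name:level,level;name” syntax shown in the example.

```python
def parse_algorithms(spec):
    """Parse e.g. 'zlib:1,3,5;zstd:7,8;uncompressed' into an ordered
    list of (name, levels) pairs, preserving the order of preference."""
    result = []
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Everything before the first ':' is the algorithm name;
        # the rest (if any) is a comma-separated list of levels.
        name, _, levels = entry.partition(":")
        parsed_levels = [int(level) for level in levels.split(",")] if levels else []
        result.append((name.strip(), parsed_levels))
    return result


def compression_required(algorithms):
    """Compression is required when 'uncompressed' is not in the list."""
    return all(name != "uncompressed" for name, _ in algorithms)
```

With the example string above, parse_algorithms yields zlib, zstd, and uncompressed in that order, and
compression_required returns False because “uncompressed” is present.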

After receiving the StartupPacket message from the frontend, the backend will either ignore _pq_.compression as an
unknown parameter (if the backend predates November 2017) or respond with a CompressionAck message, which will
include:

1. Index of the chosen compression mode, or -1 if the backend doesn’t support any of the compression modes sent by the
frontend. For the startup packet from the previous example, it may be ‘0’ if the server chose permanent mode, ‘1’ if
switchable, or ‘-1’ if the server doesn’t support any of these.

2. List of the compression algorithms which the backend is able to decompress in the order of preference.
For example, “zstd:2,4;uncompressed;zlib:7” means that the backend is able to:
 - decompress zstd with compression levels 2 and 4
 - work in uncompressed mode
 - decompress zlib with compression level 7
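
A minimal Python sketch of how the backend might compute the mode index for CompressionAck. The function name
choose_mode_index is hypothetical, and picking the first frontend-preferred mode that the backend supports is an
assumption: the draft only says that the backend chooses and reports an index or -1.

```python
def choose_mode_index(frontend_modes, backend_modes):
    """Return the index (into the frontend's ordered list) of the first
    mode the backend also supports, or -1 if there is no common mode.

    Assumption: the backend honours the frontend's order of preference.
    """
    for index, mode in enumerate(frontend_modes):
        if mode in backend_modes:
            return index
    return -1


# The frontend sent "permanent, switchable" in the startup packet.
frontend_modes = [mode.strip() for mode in "permanent, switchable".split(",")]
```

For a backend that supports only switchable mode, choose_mode_index(frontend_modes, {"switchable"}) gives 1, matching
the ‘1’ case in the example above.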

After sending the CompressionAck message, the backend will also send a SetCompressionMessage with one of the
following:
 - Index of the chosen backend compression algorithm followed by the index of the chosen compression level. In this
case, the frontend should now use the chosen decompressor for incoming messages, and the backend should use the chosen
compressor for outgoing messages.
 - '-1', if the backend doesn’t support compression using any of the algorithms sent by the frontend. In this case,
the frontend must terminate the connection after receiving this message.
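
The selection step can be sketched in Python as follows. This is only an illustration under stated assumptions: the
name choose_compression is hypothetical, the indexes are taken to refer to the peer’s list of algorithms it can
decompress, and an entry without levels (such as “uncompressed”) is reported with level index 0. None of these details
are fixed by the draft.

```python
from typing import Dict, List, Optional, Tuple

Algorithms = List[Tuple[str, List[int]]]


def choose_compression(peer_can_decompress: Algorithms,
                       we_can_compress: Dict[str, List[int]]) -> Optional[Tuple[int, int]]:
    """Pick (algorithm_index, level_index) from the peer's ordered list.

    Returns None when nothing matches, which corresponds to sending '-1'
    in SetCompressionMessage.
    """
    for alg_index, (name, levels) in enumerate(peer_can_decompress):
        if name not in we_can_compress:
            continue
        if not levels:
            # Assumption: level-less entries (e.g. "uncompressed") use index 0.
            return (alg_index, 0)
        for level_index, level in enumerate(levels):
            if level in we_can_compress[name]:
                return (alg_index, level_index)
    return None
```

The same function could serve the frontend when it answers with its own SetCompressionMessage, with the roles of the
two lists swapped.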

After receiving the SetCompressionMessage from the backend, the frontend should also reply with a SetCompressionMessage
with one of the following:
 - Index of the chosen frontend compression algorithm followed by the index of the chosen compression level. In this
case, the backend should now use the chosen decompressor for incoming messages, and the frontend should use the chosen
compressor for outgoing messages.
 - '-1', if the frontend doesn’t support compression using any of the algorithms sent by the backend. In this case,
the frontend should terminate the connection after sending this message.

After that sequence of messages, the frontend and backend may continue the usual conversation. In the case of permanent
compression mode, further use of SetCompressionMessage is prohibited on both the frontend and backend sides.

Supported compression and decompression methods are configured using GUC parameters:

compress_algorithms = ‘...’ // default value is ‘uncompressed’
decompress_algorithms = ‘...’ // default value is ‘uncompressed’
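
To illustrate the difference between the two modes once the initial exchange is done, here is a small Python sketch of
per-direction state tracking. The CompressionSession and ProtocolError names are hypothetical; the sketch only encodes
the rule that permanent mode prohibits any SetCompressionMessage after the first one.

```python
class ProtocolError(Exception):
    """Raised when a message violates the negotiated compression mode."""


class CompressionSession:
    """Tracks the compression state of one direction of the connection."""

    def __init__(self, mode):
        if mode not in ("permanent", "switchable"):
            raise ValueError("unknown compression mode: %r" % (mode,))
        self.mode = mode
        self.handshake_done = False
        self.current = None  # (algorithm_index, level_index) once set

    def on_set_compression(self, algorithm_index, level_index):
        # In permanent mode only the initial SetCompressionMessage is allowed.
        if self.handshake_done and self.mode == "permanent":
            raise ProtocolError("SetCompressionMessage is prohibited in permanent mode")
        self.current = (algorithm_index, level_index)
        self.handshake_done = True
```

A switchable session accepts repeated calls to on_set_compression, while a permanent session raises ProtocolError on
the second call.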

Please let me know if anything in the protocol description is unclear, so I can clarify the things that I might
have missed. I would appreciate hearing your opinion on the proposed protocol.

Thanks,

Daniil Zakhlystov


