libpq compression (part 2) - Mailing list pgsql-hackers

From Daniil Zakhlystov
Subject libpq compression (part 2)
Date
Msg-id ABAA09C6-BB95-47A5-890D-90353533F9AC@yandex-team.ru
List pgsql-hackers
Hi!

I would like to present an updated version of the libpq protocol compression patch, originally
proposed by Konstantin Knizhnik in this thread:
https://www.postgresql.org/message-id/aad16e41-b3f9-e89d-fa57-fb4c694bec25@postgrespro.ru

The original thread has become huge, which makes it hard for newcomers to catch up, so I think it is
better to open a new thread with a summary of the current state of the patch.

Compression of libpq traffic is useful in:
1. COPY
2. Replication
3. Queries returning large result sets (for example, JSON) over slow connections.

The patch introduces three new protocol messages: CompressionAck, CompressedData, and
SetCompressionMethod.

Here is a brief overview of the compression initialization process:

1. A client requests compression by including the "compression" option in its connection
string. The value is either a boolean that enables or disables compression, or an explicit
comma-separated list of compression algorithms, each optionally with a compression level. The client
signals the request by sending the _pq_.compression startup packet parameter with that list of
algorithms and optional compression levels. If the server does not support compression, the backend
ignores the _pq_.compression parameter and does not send the CompressionAck message to the frontend.

2. The server receives the client's compression request and intersects the requested compression
algorithms with the allowed ones (controlled via the libpq_compression server config setting). If
the intersection is not empty, the server responds with CompressionAck containing the final list of
compression algorithms that can be used to compress libpq messages between the client and the
server. If the intersection is empty (the server does not accept any of the requested algorithms),
it replies with CompressionAck containing an empty list, and it is up to the client whether to
continue without compression or to report an error.

3. After sending the CompressionAck message, the server can send the SetCompressionMethod message to
set the current compression algorithm for server-to-client traffic. Likewise, after receiving the
CompressionAck message, the client can send the SetCompressionMethod message to set the current
compression algorithm for client-to-server traffic. Client-to-server and server-to-client
compression are independent of each other.
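
The intersection in step 2 can be sketched roughly as follows. This is only an illustration, not
code from the patch: the "name:level" spelling of an algorithm with a level, the function names,
and the choice to keep the client's order are all assumptions made for the example.

```python
def parse_compression_request(value):
    """Parse a list such as "zstd:5,zlib" into (algorithm, level) pairs.

    Assumption: a level is attached as "name:level"; the post does not
    spell out the exact syntax.
    """
    request = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, level = item.partition(":")
        request.append((name.strip(), int(level) if level else None))
    return request


def negotiate(client_request, server_allowed):
    """Return the algorithms the server would list in CompressionAck.

    An empty result corresponds to the CompressionAck with an empty list.
    Assumption: the client's ordering is preserved.
    """
    allowed = set(server_allowed)
    return [(name, level) for name, level in client_request if name in allowed]


requested = parse_compression_request("zstd:5,zlib,lz4")
accepted = negotiate(requested, ["zlib", "lz4"])  # hypothetical server setting
print(accepted)
```

With an empty result the client would then decide whether to continue uncompressed or to fail.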

To compress messages, streaming compression is used. Compressed bytes are wrapped into the
CompressedData protocol messages. One CompressedData message may contain multiple regular protocol
messages. CompressedData messages can be mixed with the regular uncompressed messages.
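
To illustrate what streaming compression means here, the following sketch uses Python's zlib
module. It is not the patch's code: the helper names and the row payloads are made up, and the
bytes returned by compressed_data_body() merely stand for the body of one CompressedData message.
Only the regular message framing (one type byte followed by a 4-byte length that counts itself)
is taken from the existing protocol.

```python
import struct
import zlib


def frame(msg_type, payload):
    """Frame a regular protocol message: type byte, int32 length, payload."""
    return msg_type + struct.pack("!I", len(payload) + 4) + payload


# One long-lived compressor per direction, and a matching decompressor
# on the receiving side.
compressor = zlib.compressobj()
decompressor = zlib.decompressobj()


def compressed_data_body(messages):
    """Compress several regular messages into one CompressedData body."""
    raw = b"".join(messages)
    # Z_SYNC_FLUSH emits everything consumed so far without ending the stream.
    return compressor.compress(raw) + compressor.flush(zlib.Z_SYNC_FLUSH)


# Placeholder payloads; real DataRow contents are structured differently.
rows = [frame(b"D", (b"row-%d " % i) * 20) for i in range(3)]

first = compressed_data_body(rows[:2])   # two messages in one body
second = compressed_data_body(rows[2:])  # continues the same stream

restored = decompressor.decompress(first) + decompressor.decompress(second)
print(restored == b"".join(rows))
```

The second body can only be decoded by a decompressor that has already seen the first one, which
is why the stream state has to outlive a single message.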

The compression context is retained across multiple CompressedData messages. Currently, only
CopyData, DataRow, and Query messages longer than 60 bytes are compressed.
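
The selection rule can be written as a small predicate. This is a sketch: the type bytes are the
regular protocol ones ('d' for CopyData, 'D' for DataRow, 'Q' for Query), while the function name
and the question of exactly which length is compared against 60 are assumptions.

```python
COMPRESSIBLE_TYPES = {b"d", b"D", b"Q"}  # CopyData, DataRow, Query
MIN_LENGTH = 60  # messages must be longer than this to be compressed


def should_compress(msg_type, length):
    """Decide whether a message goes through the compressor.

    Assumption: `length` is the message length as the sender sees it;
    the post does not say whether the header is included.
    """
    return msg_type in COMPRESSIBLE_TYPES and length > MIN_LENGTH


print(should_compress(b"D", 200))  # large DataRow
print(should_compress(b"D", 60))   # not longer than 60 bytes
print(should_compress(b"Z", 200))  # ReadyForQuery is not in the list
```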

If the client (or server) wants to switch the current compression method, it sends the
SetCompressionMethod message so that the receiving side can switch its decompressor.
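
On the receiving side, a switch could look roughly like the sketch below. It is an illustration
only: the Receiver class and its method names are hypothetical, bz2 merely stands in for a second
algorithm (LZ4 and ZSTD are left out to keep the example dependency-free), and starting a fresh
decompression context on every switch is an assumption, not something stated above.

```python
import bz2
import zlib

# Hypothetical registry of decompressor factories, keyed by algorithm name.
DECOMPRESSORS = {
    "zlib": zlib.decompressobj,
    "bz2": bz2.BZ2Decompressor,  # stand-in for LZ4/ZSTD in this sketch
}


class Receiver:
    def __init__(self):
        self.decompressor = None

    def on_set_compression_method(self, name):
        # Assumption: switching starts a fresh decompression context.
        self.decompressor = DECOMPRESSORS[name]()

    def on_compressed_data(self, body):
        return self.decompressor.decompress(body)


receiver = Receiver()

# The sender announces zlib, then sends compressed data.
receiver.on_set_compression_method("zlib")
z = zlib.compressobj()
part1 = receiver.on_compressed_data(
    z.compress(b"first batch") + z.flush(zlib.Z_SYNC_FLUSH)
)

# The sender switches methods; the receiver swaps its decompressor.
receiver.on_set_compression_method("bz2")
b = bz2.BZ2Compressor()
part2 = receiver.on_compressed_data(b.compress(b"second batch") + b.flush())

print(part1, part2)
```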

I've split the patch into two parts: the first contains the main patch with support for the ZLIB
and LZ4 compression algorithms; the second adds ZSTD support.

Thanks,

Daniil Zakhlystov

