Hello all,
I’ve finally read the whole thread (it was huge). It is extremely sad that this patch has been stuck without progress for such a long time. It seems that the main problem in the discussion is that everyone has their own view of which problems should be solved with this patch. Here are some of the positions (not all of them):
1. Add compression for networks with bad bandwidth (and keep the patch as simple and maintainable as possible) - the author’s position.
2. Don’t change the current network protocol and related code much.
3. Refactor the compression API (and network compression as well).
4. Solve cloud providers’ problems: trade CPU utilisation for network bandwidth on demand and vice versa.
All of these requirements have a different nature and sometimes conflict with each other. Without clearly formed requirements this patch will never be released.
Anyway, I have rebased it onto the current master branch, applied pgindent, tested it on macOS and fixed a macOS-specific problem with strcpy in build_compressors_list(): strcpy has undefined behaviour when the source and destination strings overlap (a minimal sketch follows the hunk below).
- *client_compressors = src = dst = strdup(value);
+ *client_compressors = src = strdup(value);
+ dst = strdup(value);
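To make the overlap problem concrete, here is a small self-contained sketch of what goes wrong and how an independent copy avoids it. The compressor names and the tokenising loop below are made up for illustration and are not the actual code of build_compressors_list(); the only point is the strcpy() overlap.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative only: compact a comma-separated list of compressor names,
 * keeping the ones we "support".  Not the patch's actual code.
 */
int
main(void)
{
	const char *value = "zstd,bogus,zlib";

	/*
	 * The old pattern was effectively
	 *     src = dst = strdup(value);
	 * so the strcpy() below copied within one buffer with overlapping
	 * source and destination, which is undefined behaviour (C99 7.21.2.3).
	 * The fix gives dst its own, independent copy of the string.
	 */
	char	   *src = strdup(value);
	char	   *dst = strdup(value);
	char	   *out = dst;

	for (char *tok = strtok(src, ","); tok != NULL; tok = strtok(NULL, ","))
	{
		if (strcmp(tok, "zstd") == 0 || strcmp(tok, "zlib") == 0)
		{
			if (out != dst)
				*out++ = ',';
			strcpy(out, tok);	/* safe: tok and out never overlap now */
			out += strlen(tok);
		}
	}
	*out = '\0';

	printf("%s\n", dst);		/* prints: zstd,zlib */

	free(src);
	free(dst);
	return 0;
}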
According to my very simple tests with randomly generated data, zstd gives about 3x compression (zlib has a slightly worse compression ratio and slightly higher CPU utilisation). That seems to be a normal ratio for any streaming data - Greenplum also uses zstd/zlib to compress append-optimised tables, and the compression ratio there is usually about 3-5x. Also, according to my Greenplum experience, the most commonly used zstd compression level is 1, while for zlib the level is usually in the range of 1-5. CPU usage and execution time were not affected much compared to uncompressed data (but my tests were very simple and should not be treated as reliable).
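For reference, the kind of quick check I have in mind looks roughly like the sketch below. This is not the exact test I ran: the data generator, buffer size, vocabulary and level 1 are arbitrary choices for illustration, and it assumes libzstd is installed (build with something like cc ratio.c -lzstd).

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zstd.h>

int
main(void)
{
	const size_t src_size = 8 * 1024 * 1024;	/* 8 MB of test data */
	char	   *src = calloc(1, src_size);

	/*
	 * Pseudo-random but compressible input: words drawn from a small
	 * vocabulary, roughly imitating textual row data.
	 */
	const char *words[] = {"select", "insert", "update", "tuple",
						   "backend", "protocol", "compression", "zstd"};
	size_t		nwords = sizeof(words) / sizeof(words[0]);

	srand(42);
	for (size_t i = 0; i + 16 < src_size; )
	{
		const char *w = words[rand() % nwords];
		size_t		len = strlen(w);

		memcpy(src + i, w, len);
		i += len;
		src[i++] = ' ';
	}

	size_t		bound = ZSTD_compressBound(src_size);
	char	   *dst = malloc(bound);

	/* Compress with zstd level 1 and report the resulting ratio. */
	size_t		csize = ZSTD_compress(dst, bound, src, src_size, 1);

	if (ZSTD_isError(csize))
	{
		fprintf(stderr, "zstd error: %s\n", ZSTD_getErrorName(csize));
		return 1;
	}
	printf("original: %zu, compressed: %zu, ratio: %.2fx\n",
		   src_size, csize, (double) src_size / csize);

	free(src);
	free(dst);
	return 0;
}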
Best regards,
Denis Smirnov | Developer
sd@arenadata.io
Arenadata | Godovikova 9-17, Moscow 129085 Russia