Re: libpq compression - Mailing list pgsql-hackers

From Denis Smirnov
Subject Re: libpq compression
Date
Msg-id CCAB1F57-A71D-4DEC-9A9C-F325B44BAC29@arenadata.io
In response to Re: libpq compression  (Daniil Zakhlystov <usernamedt@yandex-team.ru>)
Responses Re: libpq compression
List pgsql-hackers
Hello all,

I’ve finally read the whole thread (it was huge). It is extremely sad that this patch has hung without progress for
such a long time. It seems that the main problem in the discussion is that everyone has their own view of which
problems this patch should solve. Here are some of the positions (not all of them):

1. Add compression for networks with poor bandwidth (and keep the patch as simple and maintainable as possible) -
the author’s position.
2. Don’t change the current network protocol and related code much.
3. Refactor the compression API (and network compression as well).
4. Solve cloud providers’ problems: trade network bandwidth for CPU utilisation on demand, and vice versa.

All of these requirements are different in nature and sometimes conflict with each other. Without clearly formulated
requirements this patch will never be released.

Anyway, I have rebased it onto the current master branch, applied pgindent, tested it on macOS and fixed a
macOS-specific problem with strcpy in build_compressors_list(): its behaviour is undefined when the source and
destination strings overlap.
-    *client_compressors = src = dst = strdup(value);
+    *client_compressors = src = strdup(value);
+    dst = strdup(value);
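
For readers unfamiliar with the issue: the C standard leaves strcpy() undefined when the two strings overlap, whereas
memmove() is defined for overlapping regions. The sketch below shows that alternative on a comma-separated list that is
compacted in place. It is only an illustration under assumptions, not the patch's code: filter_list_in_place() and
is_supported() are hypothetical names, and the real build_compressors_list() may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "is this algorithm supported?" */
static bool
is_supported(const char *name, size_t len)
{
    return len == 4 &&
        (memcmp(name, "zstd", 4) == 0 || memcmp(name, "zlib", 4) == 0);
}

/*
 * Compact a comma-separated list in place, keeping only supported names.
 * src and dst point into the same buffer, so the regions may overlap;
 * memmove() is defined for that case, strcpy() is not.
 */
static void
filter_list_in_place(char *list)
{
    char       *src = list;
    char       *dst = list;

    assert(list != NULL);

    while (*src != '\0')
    {
        size_t      len = strcspn(src, ",");    /* length of this item */
        char       *next = src + len;

        if (*next == ',')
            next++;             /* skip the separator */

        if (len > 0 && is_supported(src, len))
        {
            if (dst != list)
                *dst++ = ',';   /* separator between kept items */
            memmove(dst, src, len);
            dst += len;
        }
        src = next;
    }
    *dst = '\0';
}
```

With a writable buffer holding "lz4,zstd,zlib", calling filter_list_in_place() on it leaves "zstd,zlib" in the same
buffer. Compared with the second strdup() in the diff above, this avoids an extra allocation, at the cost of
tracking item lengths explicitly.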

According to my very simple tests with randomly generated data, zstd gives about 3x compression (zlib has a slightly
worse compression ratio and slightly higher CPU utilisation). This seems to be a normal ratio for any streaming data:
Greenplum also uses zstd/zlib to compress append-optimised tables, and the compression ratio is usually about 3-5x.
Also, in my Greenplum experience, the most commonly used zstd compression level is 1, while for zlib it is usually in
the range 1-5. CPU usage and execution time were not affected much compared to uncompressed data (but my tests were
very simple and should not be treated as reliable).



Best regards,
Denis Smirnov | Developer
sd@arenadata.io 
Arenadata | Godovikova 9-17, Moscow 129085 Russia

