Re: zstd compression for pg_dump - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: zstd compression for pg_dump
Date
Msg-id E3868F55-750B-407A-8C15-6C790B5D4D77@yandex-team.ru
In response to Re: zstd compression for pg_dump  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: zstd compression for pg_dump  (Justin Pryzby <pryzby@telsasoft.com>)
Re: zstd compression for pg_dump  (Daniil Zakhlystov <usernamedt@yandex-team.ru>)
List pgsql-hackers

> On 4 Jan 2021, at 07:53, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> Note, there are currently several "compression" patches in CF app.  This patch
> seems to be independent of the others, but probably shouldn't be totally
> uncoordinated (like adding lz4 in one and zstd in another might be poor
> execution).
>
> https://commitfest.postgresql.org/31/2897/
> - Faster pglz compression
> https://commitfest.postgresql.org/31/2813/
> - custom compression methods for toast
> https://commitfest.postgresql.org/31/2773/
> - libpq compression

I think that's a downside of our development system: patch authors do not want to create dependencies on other patches.
I'd say that both lz4 and zstd should be supported in TOAST, FPIs, libpq, and pg_dump. As for pglz, I think we should
not proliferate it any further.
Lz4 and Zstd actually represent different tradeoffs. Basically, lz4 is so CPU-cheap that one should use it whenever
they write to disk or a network interface. Zstd represents an actual bandwidth/CPU tradeoff.
Also, none of the patchsets touches an important possibility: a preexisting dictionary could radically improve
compression of small data (even in pglz).

Some minor notes on patchset at this thread.

Libpq compression encountered some problems with memory consumption, which required some extra configuration effort.
Did you measure memory usage for this patchset?

[PATCH 03/20] Support multiple compression algs/levels/opts..
abtracts -> abstracts
enum CompressionAlgorithm actually represents the very same thing as in "Custom compression methods"

Daniil, is the definition of levels compatible with the libpq compression patch?
+typedef struct Compress {
+    CompressionAlgorithm    alg;
+    int            level;
+    /* Is a nondefault level set ?  This is useful since different compression
+     * methods have different "default" levels.  For now we assume the levels
+     * are all integer, though.
+    */
+    bool        level_set;
+} Compress;
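
To make the question about levels concrete, here is a minimal sketch of how the level_set flag could be resolved into an effective level. Only the layout of Compress follows the hunk above; the enum values and the helper effective_level() are hypothetical and not taken from the patch. The two default values are the ones zlib (Z_DEFAULT_COMPRESSION) and zstd (ZSTD_CLEVEL_DEFAULT) define.

```c
#include <stdbool.h>

/* Hypothetical enum values; the patch's actual names may differ. */
typedef enum
{
    COMPR_ALG_NONE,
    COMPR_ALG_LIBZ,
    COMPR_ALG_ZSTD
} CompressionAlgorithm;

/* Same layout as the struct in the quoted hunk. */
typedef struct Compress
{
    CompressionAlgorithm alg;
    int         level;
    bool        level_set;      /* did the user give a nondefault level? */
} Compress;

/*
 * Hypothetical helper: return the user's level if one was given,
 * otherwise the default level of the selected algorithm.
 */
static int
effective_level(const Compress *c)
{
    if (c->level_set)
        return c->level;

    switch (c->alg)
    {
        case COMPR_ALG_LIBZ:
            return -1;          /* Z_DEFAULT_COMPRESSION in zlib */
        case COMPR_ALG_ZSTD:
            return 3;           /* ZSTD_CLEVEL_DEFAULT in zstd */
        case COMPR_ALG_NONE:
            break;
    }
    return 0;                   /* no compression */
}
```

Keeping the flag separate from the integer avoids reserving a magic value such as 0 or -1 to mean "unset", since those values may be meaningful levels for some algorithms.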

[PATCH 04/20] struct compressLibs
I think this directive would be correct.
+// #ifdef HAVE_LIBZ?
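
A sketch of what such a guard could look like, assuming the table maps library names to file suffixes. The struct fields, the suffix strings, the lookup helper, and the HAVE_LIBZSTD macro name are hypothetical; HAVE_LIBZ is the macro that PostgreSQL's configure already defines when zlib is available.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; the patch's struct compressLibs may differ. */
struct compressLib
{
    const char *name;
    const char *suffix;
};

/* Only libraries that were compiled in get an entry. */
static const struct compressLib compress_libs[] = {
    {"none", ""},
#ifdef HAVE_LIBZ
    {"zlib", ".gz"},
#endif
#ifdef HAVE_LIBZSTD             /* assumed macro name */
    {"zstd", ".zst"},
#endif
};

/* Hypothetical lookup: return the entry for a suffix, or NULL if unsupported. */
static const struct compressLib *
find_lib_by_suffix(const char *suffix)
{
    size_t      n = sizeof(compress_libs) / sizeof(compress_libs[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(compress_libs[i].suffix, suffix) == 0)
            return &compress_libs[i];
    }
    return NULL;
}
```

With the guards in place, a build without zlib simply fails the lookup for ".gz" instead of referencing code that was never compiled.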

Here's a leftover comment:
// && errno == ENOENT)


[PATCH 06/20] pg_dump: zstd compression

I'd propose building with Zstd by default. It seems other patches do it this way. Though I think there are possible
downsides.


Thanks for working on this! We will have a very IO-efficient Postgres :)

Best regards, Andrey Borodin.

