Re: libpq compression (part 3) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: libpq compression (part 3)
Date
Msg-id CA+Tgmobuj67e8z=6AdnrStvOtD2QSeTCRthKpjgeNkwJTa+kLQ@mail.gmail.com
In response to Re: libpq compression (part 3)  (Jacob Burroughs <jburroughs@instructure.com>)
Responses Re: libpq compression (part 3)
List pgsql-hackers
On Wed, May 15, 2024 at 12:50 PM Jacob Burroughs
<jburroughs@instructure.com> wrote:
> I think I would agree with that.  That said, I don't think the client
> should be in the business of specifying what configuration of the
> compression algorithm the server should use.  The server administrator
> (or really most of the time, the compression library developer's
> defaults) gets to pick the compression/compute tradeoff for
> compression that runs on the server (which I would imagine would be
> the vast majority of it), and the client gets to pick those same
> parameters for any compression that runs on the client machine
> (probably indeed in practice only for large COPYs).  The *algorithm*
> needs to actually be communicated/negotiated since different
> client/server pairs may be built with support for different
> compression libraries, but I think it is reasonable to say that the
> side that actually has to do the computationally expensive part owns
> the configuration of that part too.  Maybe I'm missing a good reason
> that we want to allow clients to choose compression levels for the
> server though?

Well, I mean, I don't really know what the right answer is here, but
right now I can say pg_dump --compress=gzip to compress the dump with
gzip, or pg_dump --compress=gzip:9 to compress with gzip level 9. Now,
say that instead of compressing the output, I want to compress the
data sent to me over the connection. So I figure I should be able to
say pg_dump 'compress=gzip' or pg_dump 'compress=gzip:9'. I think you
want to let me do the first of those but not the second. But, to turn
your question on its head, what would be the reasoning behind such a
restriction?
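
As an illustration only, the two forms above ("gzip" and "gzip:9") could be
split into an algorithm and an optional level roughly as follows. This is a
minimal Python sketch, not code from the patch or from pg_dump; the names
CompressionSpec and parse_compression_spec are hypothetical, and it handles
only the short "algorithm[:integer-level]" form used in this message.

```python
from typing import NamedTuple, Optional


class CompressionSpec(NamedTuple):
    """Hypothetical container for a parsed compression request."""
    algorithm: str
    level: Optional[int]  # None means "use the algorithm's default level"


def parse_compression_spec(spec: str) -> CompressionSpec:
    """Split 'gzip' or 'gzip:9' into (algorithm, level).

    Assumption: only a non-negative integer level is accepted after the colon.
    """
    algorithm, sep, level_text = spec.partition(":")
    if not algorithm:
        raise ValueError(f"missing algorithm in {spec!r}")
    if not sep:
        # No colon at all: the client did not ask for a specific level.
        return CompressionSpec(algorithm, None)
    if not (level_text.isascii() and level_text.isdigit()):
        raise ValueError(f"invalid compression level in {spec!r}")
    return CompressionSpec(algorithm, int(level_text))


if __name__ == "__main__":
    print(parse_compression_spec("gzip"))
    print(parse_compression_spec("gzip:9"))
```

The point of the sketch is that the level is just an optional second field of
the same request, so accepting the first form while rejecting the second is a
policy decision rather than a parsing constraint.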

Note also the precedent of pg_basebackup. I can say pg_basebackup
--compress=server-gzip:9 to ask the server to compress the backup with
gzip at level 9. In that case, what I request from the server changes
the actual output that I get, which is not the case here. Even so, I
don't really understand what the justification would be for refusing
to let the client ask for a specific compression level.

And on the flip side, I also don't understand why the server would
want to mandate a certain compression level. If compression is very
expensive for a certain algorithm when the level is above some
threshold X, we could have a GUC to limit the maximum level that the
client can request. But, given that the gzip compression level
defaults to 6 in every other context, why would the administrator of a
particular server want to say, well, the default for my server is 3 or
9 or whatever?
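
To sketch what such a limit might look like: the server would apply the
usual default when the client names no level, and otherwise honor the
requested level up to an administrator-set ceiling. This is a hedged Python
illustration under stated assumptions. No such GUC exists today; MAX_LEVELS,
DEFAULT_LEVELS, and effective_level are hypothetical names, the ceiling of 7
is an arbitrary example value, and clamping (rather than rejecting the
request) is just one possible behavior.

```python
from typing import Dict, Optional

# gzip's default of 6 is the value mentioned above; everything else here is
# an assumption made for the sake of the example.
DEFAULT_LEVELS: Dict[str, int] = {"gzip": 6}

# Hypothetical administrator-set ceilings, standing in for a GUC that limits
# the maximum level a client may request. 7 is an arbitrary example.
MAX_LEVELS: Dict[str, int] = {"gzip": 7}


def effective_level(algorithm: str,
                    requested: Optional[int],
                    max_levels: Dict[str, int] = MAX_LEVELS) -> int:
    """Return the level the server would use for this client request."""
    # No level requested: fall back to the algorithm's usual default.
    level = DEFAULT_LEVELS[algorithm] if requested is None else requested
    cap = max_levels.get(algorithm)
    if cap is not None and level > cap:
        # The client asked for more than the administrator allows: clamp.
        return cap
    return level


if __name__ == "__main__":
    print(effective_level("gzip", None))  # client did not name a level
    print(effective_level("gzip", 9))     # above the example ceiling
    print(effective_level("gzip", 3))     # below the ceiling, used as-is
```

With this shape the administrator bounds the cost without choosing the level:
a client asking for "gzip" gets the default, and a client asking for "gzip:9"
gets at most the ceiling.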

(This is of course all presuming you want to use gzip at all, which
you probably don't, because gzip is crazy slow. Use lz4 or zstd! But
it makes the point.)

--
Robert Haas
EDB: http://www.enterprisedb.com


