Re: libpq compression (part 3) - Mailing list pgsql-hackers

From Jacob Burroughs
Subject Re: libpq compression (part 3)
Msg-id CACzsqT6mhei=xiNwQ2JbVGs44bPg+S5KDit1LmiBzdbo=4NSLg@mail.gmail.com
In response to Re: libpq compression (part 3)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: libpq compression (part 3)
Re: libpq compression (part 3)
List pgsql-hackers
On Thu, May 16, 2024 at 3:28 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> Well, I mean, I don't really know what the right answer is here, but
> right now I can say pg_dump --compress=gzip to compress the dump with
> gzip, or pg_dump --compress=gzip:9 to compress with gzip level 9. Now,
> say that instead of compressing the output, I want to compress the
> data sent to me over the connection. So I figure I should be able to
> say pg_dump 'compress=gzip' or pg_dump 'compress=gzip:9'. I think you
> want to let me do the first of those but not the second. But, to turn
> your question on its head, what would be the reasoning behind such a
> restriction?

My thinking was mostly that letting both parties control the
parameter seemed like a recipe for confusion and sadness, so the
choice that felt most natural to me was to let the sender control it.
I'm definitely open to switching that around, though.

> Note also the precedent of pg_basebackup. I can say pg_basebackup
> --compress=server-gzip:9 to ask the server to compress the backup with
> gzip at level 9. In that case, what I request from the server changes
> the actual output that I get, which is not the case here. Even so, I
> don't really understand what the justification would be for refusing
> to let the client ask for a specific compression level.
>
> And on the flip side, I also don't understand why the server would
> want to mandate a certain compression level. If compression is very
> expensive for a certain algorithm when the level is above some
> threshold X, we could have a GUC to limit the maximum level that the
> client can request. But, given that the gzip compression level
> defaults to 6 in every other context, why would the administrator of a
> particular server want to say, well, the default for my server is 3 or
> 9 or whatever?
>
> (This is of course all presuming you want to use gzip at all, which
> you probably don't, because gzip is crazy slow. Use lz4 or zstd! But
> it makes the point.)

New proposal, predicated on the assumption that if you enable
compression, you are OK with the client picking whatever level it
wants.  At least with the currently enabled algorithms, I don't think
any of them are so insane that they would knock over a server.  In
general, postgres servers are connected to by clients that the server
admin has some channel to talk to if they are doing something wasteful
(after all, they somehow had to get access to log in to the server in
the first place).  And a client can already do far worse things than
enable aggressive compression, simply by writing bad queries.

On the server side, we use slash-separated sets of options:

connection_compression=DEFAULT_VALUE_FOR_BOTH_DIRECTIONS/client_to_server=OVERRIDE_FOR_THIS_DIRECTION/server_to_client=OVERRIDE_FOR_THIS_DIRECTION

with each value being a semicolon-separated list of compression
algorithms.  On the client side, you can specify
compression=<same_specification_as_above>,
but there you can also specify compression options, which the server
will use if provided; otherwise it will fall back to defaults.
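
To make the format concrete, here is a rough Python sketch of how such
a string could be parsed and matched up. It is only an illustration
under several assumptions that are not part of the proposal: options
are attached to an algorithm with a colon (borrowed from the
gzip:9 syntax quoted above), the first slash-separated segment is
always the default for both directions, the client's list is treated
as a preference order, and the names parse_algorithms,
parse_compression_spec, and choose are hypothetical.

```python
DIRECTIONS = ("client_to_server", "server_to_client")


def parse_algorithms(value):
    """Split 'zstd:5;lz4' into [('zstd', '5'), ('lz4', None)].

    The ':' option separator is an assumption, not part of the proposal.
    """
    algorithms = []
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        name, sep, options = item.partition(":")
        algorithms.append((name, options if sep else None))
    return algorithms


def parse_compression_spec(spec):
    """Parse DEFAULT/client_to_server=.../server_to_client=... into a dict.

    Assumes the first segment is the default for both directions and
    every later segment is a direction override.
    """
    segments = spec.split("/")
    default = parse_algorithms(segments[0])
    per_direction = {d: list(default) for d in DIRECTIONS}
    for segment in segments[1:]:
        key, sep, value = segment.partition("=")
        if not sep or key not in DIRECTIONS:
            raise ValueError(f"unknown direction override: {segment!r}")
        per_direction[key] = parse_algorithms(value)
    return per_direction


def choose(server_allowed, client_requested):
    """Pick the first client-requested algorithm the server allows.

    The client's options (if any) are kept; None means "use defaults".
    Treating the client list as a preference order is an assumption.
    """
    allowed = {name for name, _ in server_allowed}
    for name, options in client_requested:
        if name in allowed:
            return name, options
    return None


if __name__ == "__main__":
    server = parse_compression_spec("zstd;lz4/server_to_client=lz4")
    client = parse_compression_spec("gzip:9;zstd:5;lz4")
    for direction in DIRECTIONS:
        print(direction, choose(server[direction], client[direction]))
```

In this sketch the server only lists algorithm names, and whatever
options the client attached to the chosen algorithm are carried
through unchanged, which mirrors the "server will use if provided"
behavior described above.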

If we think we need to, we could let the server specify defaults for
server-side compression.  My overall thought, though, is that having an
excessive number of knobs increases the surface area for testing and
bugs while also increasing potential user confusion, and that allowing
configuration on *both* sides doesn't seem sufficiently useful to be
worth the added complexity.

--
Jacob Burroughs | Staff Software Engineer
E: jburroughs@instructure.com


