Re: libpq compression (part 3) - Mailing list pgsql-hackers
From | Jacob Burroughs |
---|---|
Subject | Re: libpq compression (part 3) |
Date | |
Msg-id | CACzsqT6mhei=xiNwQ2JbVGs44bPg+S5KDit1LmiBzdbo=4NSLg@mail.gmail.com |
In response to | Re: libpq compression (part 3) (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: libpq compression (part 3)
Re: libpq compression (part 3) |
List | pgsql-hackers |
On Thu, May 16, 2024 at 3:28 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> Well, I mean, I don't really know what the right answer is here, but
> right now I can say pg_dump --compress=gzip to compress the dump with
> gzip, or pg_dump --compress=gzip:9 to compress with gzip level 9. Now,
> say that instead of compressing the output, I want to compress the
> data sent to me over the connection. So I figure I should be able to
> say pg_dump 'compress=gzip' or pg_dump 'compress=gzip:9'. I think you
> want to let me do the first of those but not the second. But, to turn
> your question on its head, what would be the reasoning behind such a
> restriction?

My thinking was that letting both parties control the parameter seemed
like a recipe for confusion and sadness, so the choice that felt most
natural to me was to let the sender control it. I'm definitely open to
changing that the other way around.

> Note also the precedent of pg_basebackup. I can say pg_basebackup
> --compress=server-gzip:9 to ask the server to compress the backup with
> gzip at level 9. In that case, what I request from the server changes
> the actual output that I get, which is not the case here. Even so, I
> don't really understand what the justification would be for refusing
> to let the client ask for a specific compression level.
>
> And on the flip side, I also don't understand why the server would
> want to mandate a certain compression level. If compression is very
> expensive for a certain algorithm when the level is above some
> threshold X, we could have a GUC to limit the maximum level that the
> client can request. But, given that the gzip compression level
> defaults to 6 in every other context, why would the administrator of a
> particular server want to say, well, the default for my server is 3 or
> 9 or whatever?
>
> (This is of course all presuming you want to use gzip at all, which
> you probably don't, because gzip is crazy slow. Use lz4 or zstd! But
> it makes the point.)

New proposal, predicated on the assumption that if you enable
compression you are ok with the client picking whatever level they
want. At least with the currently enabled algorithms, I don't think any
of them are so expensive that they would knock over a server. In
general, postgres servers are connected to by clients whose owners the
server admin has some channel to talk to if they are doing something
wasteful (after all, they somehow had to get access to log in to the
server in the first place). And a client can already do far worse
things by writing bad queries than by enabling aggressive compression.

On the server side, we use slash-separated sets of options:

connection_compression=DEFAULT_VALUE_FOR_BOTH_DIRECTIONS/client_to_server=OVERRIDE_FOR_THIS_DIRECTION/server_to_client=OVERRIDE_FOR_THIS_DIRECTION

with the values being semicolon-separated compression algorithms.

On the client side, you can specify
compression=<same_specification_as_above>, and there you can also
attach compression options to each algorithm. The server will use those
options if provided, and otherwise it will fall back to defaults.

If we think we need to, we could let the server specify defaults for
server-side compression. My overall thought, though, is that having an
excessive number of knobs increases the surface area for testing and
bugs while also increasing potential user confusion, and that allowing
configuration on *both* sides doesn't seem sufficiently useful to be
worth adding that complexity.

--
Jacob Burroughs | Staff Software Engineer
E: jburroughs@instructure.com
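As a rough illustration of the slash-separated specification proposed in this message, the following Python sketch parses a value such as `zstd;lz4/server_to_client=gzip:9` into a per-direction list of algorithms. It is not taken from any patch. The function and class names are hypothetical, and the `name:level` form for per-algorithm options is an assumption borrowed from the `gzip:9` syntax quoted earlier in the thread; the message itself does not pin down the exact grammar.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Direction keys as named in the proposal.
DIRECTIONS = ("client_to_server", "server_to_client")


@dataclass(frozen=True)
class Algorithm:
    """One entry of a semicolon-separated algorithm list (hypothetical type)."""
    name: str
    level: Optional[int] = None  # None means "use the algorithm's default"


def parse_algorithms(value: str) -> List[Algorithm]:
    """Parse a list such as 'zstd:3;lz4'.

    Assumption: options are written as 'name:level', as in 'gzip:9'.
    """
    algorithms = []
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        name, sep, level = item.partition(":")
        algorithms.append(Algorithm(name, int(level) if sep else None))
    return algorithms


def parse_compression_spec(spec: str) -> Dict[str, List[Algorithm]]:
    """Parse 'DEFAULT/client_to_server=OVERRIDE/server_to_client=OVERRIDE'.

    The first segment applies to both directions; later segments
    replace the list for the direction they name.
    """
    default, *overrides = spec.split("/")
    default_algorithms = parse_algorithms(default)
    result = {direction: list(default_algorithms) for direction in DIRECTIONS}
    for override in overrides:
        key, sep, value = override.partition("=")
        if not sep or key not in DIRECTIONS:
            raise ValueError(f"unknown direction override: {override!r}")
        result[key] = parse_algorithms(value)
    return result
```

For example, `parse_compression_spec("zstd;lz4/server_to_client=gzip:9")` keeps `zstd` and `lz4` for the client-to-server direction and uses only `gzip` at level 9 for server-to-client. Under the proposal, a server-side value would carry algorithm names only, while a client-side value could also carry levels.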