Fwd: libpq compression (part 3) - Mailing list pgsql-hackers

From Jacob Burroughs
Subject Fwd: libpq compression (part 3)
Date
Msg-id CACzsqT4-E4nYvS7fErmEsOcPVZTuGq4_y9C9iW7g=bwFfS3jgg@mail.gmail.com
Whole thread Raw
In response to libpq compression (part 3)  (Jacob Burroughs <jburroughs@instructure.com>)
Responses Re: libpq compression (part 3)
List pgsql-hackers
On Tue, May 14, 2024 at 3:24 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> IMHO, that's a HUGE improvement. But:
>
> * I would probably change is the name "libpq_compression", because
> even though we have src/backend/libpq, we typically use libpq to refer
> to the client library, not the server's implementation of the wire
> protocol. I think we could call it connection_encryption or
> wire_protocol_encryption or something like that, but I'm not a huge
> fan of libpq_compression.
>
I think connection_compression would seem like a good name to me.

> * I would use commas, not semicolons, to separate items in a list,
> i.e. lzma,gzip not lzma;gzip. I think that convention is nearly
> universal in PostgreSQL, but feel free to point out counterexamples if
> you were modelling this on something.
>
> * libpq_compression=lzma:client_to_server=off;gzip:server_to_client=off
> reads strangely to me. How about making it so that the syntax is like
> this:
>
>
libpq_compression=DEFAULT_VALUE_FOR_BOTH_DIRECTIONS:client_to_server=OVERRIDE_FOR_THIS_DIRECTION:servert_to_client=OVERRIDE_FOR_THIS_DIRECTION
>
> With all components being optional. So this example could be written
> in any of these ways:
>
> libpq_compression=lzma;server_to_client=gzip
> libpq_compression=gzip;client_to_server=lzma
> libpq_compression=server_to_client=gzip;client_to_server=lzma
> libpq_compression=client_to_server=lzma;client_to_server=gzip
>
> And if I wrote libpq_compression=server_to_client=gzip that would mean
> send data to the client using gzip and in the other direction use
> whatever the default is.

The reason for both the semicolons and for not doing this is related
to using the same specification structure as here:
https://www.postgresql.org/docs/current/app-pgbasebackup.html
(specifically the --compress argument). Reusing that specification
requires that we use commas to separate the flags for a compression
method, and therefore left me with semicolons as the leftover
separator character. I think we could go with something like your
proposal, and in a lot of ways I like it, but there's still the
possibility of e.g.
`libpq_compression=client_to_server=zstd:level=10,long=true,gzip;client_to_server=gzip`
and I'm not quite sure how to make the separator characters work
coherently if we need to treat `zstd:level=10,long=true` as a unit.
Alternatively, we could have `connection_compression`,
`connection_compression_server_to_client`, and
`connection_compression_client_to_server` as three separate GUCs (and
on the client side `compression`, `compression_server_to_client`, and
`compression_client_to_server` as three separate connection
parameters), where we would treat `connection_compression` as a
default that could be overridden by an explicit
client_to_server/server_to_client.  That creates the slightly funky
case where if you specify all three then the base one ends up unused
because the two more specific ones are being used instead, but that
isn't necessarily terrible.  On the server side we *could* go with
just the server_to_client and client_to_server ones, but I think we
want it to be easy to use this feature in the simple case with a
single libpq parameter.

--
Jacob Burroughs | Staff Software Engineer
E: jburroughs@instructure.com



pgsql-hackers by date:

Previous
From: Melanie Plageman
Date:
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Next
From: Cary Huang
Date:
Subject: Re: Support tid range scan in parallel?