Re: libpq compression (part 3) - Mailing list pgsql-hackers

From Jacob Burroughs
Subject Re: libpq compression (part 3)
Msg-id CACzsqT5hYVUesYW7_zGB75LAj4f2BHT5GKiprJdCnhSJ=f8srQ@mail.gmail.com
In response to Re: libpq compression (part 3)  (Jacob Champion <jacob.champion@enterprisedb.com>)
Responses Re: libpq compression (part 3)
List pgsql-hackers
On Mon, May 20, 2024 at 2:42 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
>
> I mean... you said it, not me. I'm trying not to rain on the parade
> too much, because compression is clearly very valuable. But it makes
> me really uncomfortable that we're reintroducing the compression
> oracle (especially over the authentication exchange, which is
> generally more secret than the rest of the traffic).

As currently implemented, the compression only applies to
CopyData/DataRow/Query messages, none of which should be involved in
authentication, unless I've really missed something in my
understanding.
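
To make the scoping concrete, here is a minimal sketch of a per-message-type filter. It is not the patch's code: `COMPRESSIBLE_TYPES` and `should_compress` are hypothetical names for illustration, and only the type bytes themselves come from the frontend/backend protocol documentation.

```python
# Hypothetical sketch: decide per message type whether to compress.
# Type bytes are the documented protocol codes; the helper names are made up.
COPY_DATA = b"d"       # CopyData (either direction)
DATA_ROW = b"D"        # DataRow (backend -> frontend)
QUERY = b"Q"           # Query (frontend -> backend)
AUTH_REQUEST = b"R"    # Authentication* messages (backend -> frontend)
AUTH_RESPONSE = b"p"   # PasswordMessage / SASL responses (frontend -> backend)

# Only the three message types named above are eligible for compression.
COMPRESSIBLE_TYPES = frozenset({COPY_DATA, DATA_ROW, QUERY})


def should_compress(msg_type: bytes) -> bool:
    """Return True if a message with this type byte may be compressed."""
    return msg_type in COMPRESSIBLE_TYPES
```

With an allowlist like this, authentication traffic is sent uncompressed by default, and any new message type would have to be opted in explicitly.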

> Right, I think it's reasonable to let a sufficiently
> determined/informed user lift the guardrails, but first we have to
> choose to put guardrails in place... and then we have to somehow
> sufficiently inform the users when it's okay to lift them.

My thought would be that compression should be opt-in on the client
side, with documentation around the potential security pitfalls. (I
could be convinced it should be opt-in on the server side, but overall
I think opt-in on the client side generally protects against footguns
without excessively getting in the way. And if an attacker controls the
client, they can just get the information they want directly; they
don't need compression side channels to get it.)

> But for SQL, where's the dividing line between attacker-chosen and
> attacker-sought? To me, it seems like only the user knows; the server
> has no clue. I think that puts us "lower" in Alyssa's model than HTTP
> is.
>
> As Andrey points out, there was prior work done that started to take
> this into account. I haven't reviewed it to see how good it is -- and
> I think there are probably many use cases in which queries and tables
> contain both private and attacker-controlled information -- but if we
> agree that they have to be separated, then the strategy can at least
> be improved upon.

Within SQL-level things, I don't think we can reasonably differentiate
between private and attacker-controlled information at the
libpq/server level.  We can reasonably differentiate between message
types that *definitely* are private and ones that could have
either/both data in them, but that's not nearly as useful.  I think
not compressing auth-related packets plus giving a mechanism to reset
the compression stream for clients (plus guidance on the tradeoffs
involved in turning on compression) is about as good as we can get.
That said, I *think* it is reasonable for the feature to be
reviewed/committed without the reset functionality, as long as the
compressed data already has the mechanism built in (as it does) to
signal when a decompressor should restart its streaming.  The actual
signaling protocol mechanism and the necessary libpq API can happen in
follow-on work.
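
As a rough illustration of why a stream reset helps, the sketch below uses Python's zlib as a stand-in. It is not the patch's mechanism or wire format; `SECRET`, `second_chunk_size`, and `roundtrip` are hypothetical names, and `Z_FULL_FLUSH` is simply zlib's way of discarding compression history mid-stream. Without a reset, the compressed size of a later message shrinks when it repeats earlier data, which is the oracle; after a reset, that back-reference is no longer possible, while a single decompressor can still read straight through.

```python
import zlib

# Hypothetical private value that an earlier message carried on the same stream.
SECRET = b"session_token=4f9a1c7e2b8d4450a6c3"


def second_chunk_size(first: bytes, second: bytes, reset: bool) -> int:
    """Compress two messages on one stream; return the second one's wire size."""
    comp = zlib.compressobj()
    # Z_FULL_FLUSH drops the history window; Z_SYNC_FLUSH keeps it.
    mode = zlib.Z_FULL_FLUSH if reset else zlib.Z_SYNC_FLUSH
    comp.compress(first)
    comp.flush(mode)
    return len(comp.compress(second) + comp.flush(zlib.Z_SYNC_FLUSH))


def roundtrip(first: bytes, second: bytes) -> bytes:
    """Show that one decompressor reads through a mid-stream reset unchanged."""
    comp = zlib.compressobj()
    wire = comp.compress(first) + comp.flush(zlib.Z_FULL_FLUSH)
    wire += comp.compress(second) + comp.flush(zlib.Z_SYNC_FLUSH)
    return zlib.decompressobj().decompress(wire)


if __name__ == "__main__":
    right = SECRET
    wrong = b"session_token=zqxjkvwpyhmgtrlnbuod"  # same length, wrong value
    for reset in (False, True):
        print("reset" if reset else "no reset",
              second_chunk_size(SECRET, right, reset),
              second_chunk_size(SECRET, wrong, reset))
```

The cost of resetting is a worse compression ratio for whatever follows, which is part of the tradeoff that the guidance for clients would need to cover.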


--
Jacob Burroughs | Staff Software Engineer
E: jburroughs@instructure.com


