Re: libpq compression (part 3) - Mailing list pgsql-hackers
From | Jelte Fennema-Nio |
---|---|
Subject | Re: libpq compression (part 3) |
Date | |
Msg-id | CAGECzQTMEry3HOqn_ajOZU4BnwAV1L67iSL9+2NQpwyFxm12tA@mail.gmail.com |
In response to | Re: libpq compression (part 3) (Jacob Champion <jacob.champion@enterprisedb.com>) |
Responses | Re: libpq compression (part 3) |
List | pgsql-hackers |
On Mon, 20 May 2024 at 21:42, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
> As Andrey points out, there was prior work done that started to take
> this into account. I haven't reviewed it to see how good it is -- and
> I think there are probably many use cases in which queries and tables
> contain both private and attacker-controlled information -- but if we
> agree that they have to be separated, then the strategy can at least
> be improved upon.

To help get everyone on the same page I wanted to list all the security concerns in one place:

1. Triggering excessive CPU usage before authentication, by asking for very high compression levels
2. Triggering excessive memory/CPU usage before authentication, by a client sending a zipbomb
3. Triggering excessive CPU usage after authentication, by asking for a very high compression level
4. Triggering excessive memory/CPU usage after authentication due to zipbombs (i.e. a small amount of data extracting to lots of data)
5. CRIME style leakage of information about encrypted data

1 & 2 can easily be solved by not allowing any authentication packets to be compressed. This also has benefits for 5.

3 & 4 are less of a concern than 1 & 2 imho. Once authenticated a client deserves some level of trust. But having knobs to limit impact definitely seems useful.

3 can be solved in two ways afaict:
a. Allow the server to choose the maximum compression level for each compression method (using some GUC), and downgrade the level transparently when a higher level is requested
b. Don't allow the client to choose the compression level that the server uses.

I'd prefer option a.

4 would require some safety limits on the amount of data that a (small) compressed message can be decompressed to, and stopping decompression of that message once that limit is hit. What that limit should be seems hard to choose though. A few ideas:
a. The size of the message reported by the uncompressed header. This would mean that at most 4GB will be decompressed, since the maximum message length is 4GB (limited by the 32-bit message length field)
b. Allow servers to specify a maximum client decompressed message length lower than this 4GB, e.g. messages of more than 100MB of uncompressed size should not be allowed.

I think 5 is the most complicated to deal with, especially as it depends on the actual usage to know what is safe. I believe we should let users have the freedom to make their own security tradeoffs, but we should protect them against some of the most glaring issues (especially ones that benefit little from compression anyway). As already shown by Andrey, sending LDAP passwords in a compressed way seems extremely dangerous. So I think we should disallow compressing any authentication related packets. To reduce similar risks further we can choose to compress only the message types that we expect to benefit most from compression. IMHO those are the following (marked with (B)ackend or (F)rontend to show who sends them):
- Query (F)
- Parse (F)
- Describe (F)
- Bind (F)
- RowDescription (B)
- DataRow (B)
- CopyData (B/F)

Then I think we should let users choose how they want to compress and where they want their compression stream to restart. Something like this:
a. compression_restart=query: Restart the stream after every query. Recommended if queries across the same connection are triggered by different end-users. I think this would be a sane default.
b. compression_restart=message: Restart the stream for every message. Recommended if the amount of correlation between rows of the same query is a security concern.
c. compression_restart=manual: Don't restart the stream automatically, but only when the client user calls a specific function. Recommended only if the user can make trade-offs, or if no encryption is used anyway.
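
To make options 3a and 4b a bit more concrete, here is a rough sketch in Python using the standard zlib module. It is only an illustration of the idea, not a proposal for the actual implementation: the constants and the helpers clamp_level() and bounded_decompress() are hypothetical names, the values are placeholders, and in the server the limits would presumably come from GUCs.

```python
import zlib

# Hypothetical server-side limits; the names and values are illustrative only.
MAX_COMPRESSION_LEVEL = 3
MAX_DECOMPRESSED_BYTES = 100 * 1024 * 1024  # the 100MB example from idea 4b


def clamp_level(requested_level, max_level=MAX_COMPRESSION_LEVEL):
    """Option 3a: transparently downgrade a level that is higher than allowed."""
    return min(requested_level, max_level)


def bounded_decompress(payload, limit=MAX_DECOMPRESSED_BYTES):
    """Decompress one message, giving up once more than `limit` bytes come out."""
    decompressor = zlib.decompressobj()
    # Ask for at most limit + 1 bytes, so that going over the limit is
    # detectable without ever inflating the whole (possibly huge) message.
    out = decompressor.decompress(payload, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed message exceeds the configured limit")
    if not decompressor.eof:
        raise ValueError("compressed message is truncated")
    return out


if __name__ == "__main__":
    # A client asking for level 9 is silently served at the server maximum.
    level = clamp_level(9)
    message = zlib.compress(b"SELECT 1;", level)
    print(bounded_decompress(message))

    # A small payload that inflates to 1MB is rejected under a 1000 byte limit.
    bomb = zlib.compress(b"\0" * 1_000_000)
    try:
        bounded_decompress(bomb, limit=1000)
    except ValueError as exc:
        print("rejected:", exc)
```

The important property is that decompression stops as soon as the limit is crossed, so the memory used for one message is bounded by the limit rather than by whatever the compressed payload expands to.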