Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Stephen Frost |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | 20190708183944.GB29202@tamriel.snowman.net |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
List | pgsql-hackers |
Greetings,

* Bruce Momjian (bruce@momjian.us) wrote:
> On Mon, Jul 8, 2019 at 11:47:33AM -0400, Stephen Frost wrote:
> > * Bruce Momjian (bruce@momjian.us) wrote:
> > > On Mon, Jul 8, 2019 at 11:18:01AM -0400, Joe Conway wrote:
> > > > On 7/8/19 10:19 AM, Bruce Momjian wrote:
> > > > > When people are asking for multiple keys (not just for key rotation),
> > > > > they are asking to have multiple keys that can be supplied by users only
> > > > > when they need to access the data. Yes, the keys are always in the
> > > > > database, but the feature request is that they are only unlocked when the
> > > > > user needs to access the data. Obviously, that will not work for
> > > > > autovacuum when the encryption is at the block level.
> > > >
> > > > > If the key is always unlocked, there is questionable security value of
> > > > > having multiple keys, beyond key rotation.
> > > >
> > > > That is not true. Having multiple keys also allows you to reduce the
> > > > amount of data encrypted with a single key, which is desirable because:
> > > >
> > > > 1. It makes cryptanalysis more difficult
> > > > 2. Puts less data at risk if someone gets "lucky" in doing brute force
> > >
> > > What systems use multiple keys like that? I know of no website that
> > > does that. Your arguments seem hypothetical. What is your goal here?
> >
> > Not sure what the reference to 'website' is here, but one doesn't get
> > certificates for TLS/SSL usage that aren't time-bounded, and when it
> > comes to the actual on-the-wire encryption that's used, that's a
> > symmetric key that's generated on-the-fly for every connection.
> >
> > Wouldn't the fact that they generate a different key for every
> > connection be a pretty clear indication that it's a good idea to use
> > multiple keys and not use the same key over and over..?
> >
> > Of course, we can discuss if what websites do with over-the-wire
> > encryption is sensible to compare to what we want to do in PG for
> > data-at-rest, but then we shouldn't be talking about what websites do,
> > it'd make more sense to look at other data-at-rest encryption systems
> > and consider what they're doing.
>
> (I talked to Joe on chat for clarity.) In modern TLS, the certificate is
> used only for authentication, and Diffie–Hellman is used for key
> exchange:
>
> https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange

Right, and the key that's figured out for each connection is at least
specific to the server AND client keys/certificates, thus meaning that
they're changed at least as frequently as those change (and clients end
up creating ones on the fly randomly if they don't have one, iirc).

> So, the question is whether you can pass so much data in TLS that using
> the same key for the entire session is a security issue. TLS originally
> had key renegotiation, but that was removed in TLS 1.3:
>
> https://www.cloudinsidr.com/content/known-attack-vectors-against-tls-implementation-vulnerabilities/
> To mitigate these types of attacks, TLS 1.3 disallows renegotiation.

It was removed due to attacks targeting the renegotiation, not because
doing re-keying by itself was a bad idea, or because using multiple keys
was seen as a bad idea.

> Of course, a database is going to process even more data so if the
> amount of data encrypted is a problem, we might have a problem too in
> using a single key. This is not related to whether we use one key for
> the entire cluster or multiple keys per tablespace --- the problem is
> the same. I guess we could create 1024 keys and use the bottom bits of
> the block number to decide what key to use. However, that still only
> pushes the goalposts farther.

All of this is about pushing the goalposts farther away, as I see it.
There are going to be trade-offs here and there isn't going to be any
"one right answer" when it comes to this space. That's why I'm inclined
to argue that we should try to come up with a relatively *good* solution
that doesn't create a huge amount of work for us, and then build on
that. To that end, leveraging metadata that we already have outside of
the catalogs (databases, tablespaces, potentially other information that
we store, essentially, in the filesystem metadata already) to decide on
what key to use, and how many we can support, strikes me as a good
initial target.

> Anyway, I will research the reasonable data size that can be secured
> with a single key via AES. I will look at how PGP encrypts large files
> too.

This seems unlikely to lead to a definitive result, but it would be
interesting to hear if there have been studies around that and what
their conclusions were.

When it comes to concerns about autovacuum or other system processes,
those don't have any direct user connections or interactions, so having
them be more privileged and having access to more is reasonable.
Ideally, all of this would leverage a vaulting system or other mechanism
which manages access to the keys and allows their usage to be limited.
That's been generally accepted as a good way to bridge the gap between
having to ask users every time for a key and having keys stored
long-term in memory.

Having *only* the keys for the data which the currently connected user
is allowed to access would certainly be a great initial capability, even
if system processes (including potentially WAL replay) have to have
access to all of the keys.

And yes, shared buffers being unencrypted and accessible by every
backend continues to be an issue; it'd be great to improve on that
situation too.
I don't think having everything encrypted in shared buffers is likely
the solution; rather, segregating it might make more sense, again along
similar lines to keys and using metadata that's outside of the catalogs,
which has been discussed previously, though I don't think anyone's
actively working on it.

Thanks,

Stephen
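
To make the two key-selection ideas from this message concrete (picking one of 1024 keys from the bottom bits of the block number, and choosing a key from metadata that lives outside the catalogs, such as the tablespace and database), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the use of HMAC-SHA-256 as a stand-in key-derivation step, the label string, and the example OIDs are all hypothetical and are not taken from any PostgreSQL patch. A real design would more likely use a vetted KDF and a key management service.

```python
import hashlib
import hmac

NUM_KEYS = 1024  # the "1024 keys" figure from the quoted message


def key_slot_for_block(block_number: int, num_keys: int = NUM_KEYS) -> int:
    """Pick a key slot from the bottom bits of the block number."""
    # The bit mask below only works when num_keys is a power of two.
    if num_keys <= 0 or num_keys & (num_keys - 1):
        raise ValueError("num_keys must be a power of two")
    return block_number & (num_keys - 1)


def derive_key(master_key: bytes, tablespace_oid: int,
               database_oid: int, slot: int = 0) -> bytes:
    """Derive a 32-byte key from a master key plus filesystem-level metadata.

    HMAC-SHA-256 is used here only as a simple illustrative derivation step
    (an assumption of this sketch, not a recommendation).
    """
    # Hypothetical label format: binds the key to tablespace, database, slot.
    info = b"tde-sketch|%d|%d|%d" % (tablespace_oid, database_oid, slot)
    return hmac.new(master_key, info, hashlib.sha256).digest()


if __name__ == "__main__":
    master = bytes(32)  # placeholder master key, for demonstration only
    slot = key_slot_for_block(1025)
    key = derive_key(master, tablespace_oid=1663, database_oid=16384, slot=slot)
    print(slot, key.hex())
```

With this layout, system processes that hold the master key can re-derive any key on demand, while a vault could hand a backend only the derived keys for the tablespaces or databases its user may access. As the message notes, either scheme only moves the goalposts: it reduces the amount of data under each key without removing the limit.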