Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS) - Mailing list pgsql-hackers

From: Bruce Momjian
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Msg-id: 20190708211811.sio5o36zxhps7snx@momjian.us
In response to: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)  (Stephen Frost <sfrost@snowman.net>)
Responses: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)  (Stephen Frost <sfrost@snowman.net>)
List: pgsql-hackers
On Mon, Jul  8, 2019 at 02:39:44PM -0400, Stephen Frost wrote:
> > > Of course, we can discuss if what websites do with over-the-wire
> > > encryption is sensible to compare to what we want to do in PG for
> > > data-at-rest, but then we shouldn't be talking about what websites do,
> > > it'd make more sense to look at other data-at-rest encryption systems
> > > and consider what they're doing.
> > 
> > (I talked to Joe on chat for clarity.)  In modern TLS, the certificate is
> > used only for authentication, and Diffie–Hellman is used for key
> > exchange:
> > 
> >     https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange
> 
> Right, and the key that's figured out for each connection is at least
> specific to the server AND client keys/certificates, thus meaning that
> they're changed at least as frequently as those change (and clients end
> up creating ones on the fly randomly if they don't have one, iirc).
> 
> > So, the question is whether you can pass so much data in TLS that using
> > the same key for the entire session is a security issue.  TLS originally
> > had key renegotiation, but that was removed in TLS 1.3:
> > 
> >     https://www.cloudinsidr.com/content/known-attack-vectors-against-tls-implementation-vulnerabilities/
> >     To mitigate these types of attacks, TLS 1.3 disallows renegotiation.
> 
> It was removed due to attacks targeting the renegotiation, not because
> doing re-keying by itself was a bad idea, or because using multiple keys
> was seen as a bad idea.

Well, if it was a necessary feature, I assume TLS 1.3 would have found
a way to make it secure, no?  Certainly they are not shipping TLS 1.3
with a known weakness.

> > Of course, a database is going to process even more data so if the
> > amount of data encrypted is a problem, we might have a problem too in
> > using a single key.  This is not related to whether we use one key for
> > the entire cluster or multiple keys per tablespace --- the problem is
> > the same.  I guess we could create 1024 keys and use the bottom bits of
> > the block number to decide what key to use.  However, that still only
> > pushes the goalposts farther.
> 
> All of this is about pushing the goalposts farther away, as I see it.
> There's going to be trade-offs here and there isn't going to be any "one
> right answer" when it comes to this space.  That's why I'm inclined to
> argue that we should try to come up with a relatively *good* solution
> that doesn't create a huge amount of work for us, and then build on
> that.  To that end, leveraging metadata that we already have outside of
> the catalogs (databases, tablespaces, potentially other information that
> we store, essentially, in the filesystem metadata already) to decide on
> what key to use, and how many we can support, strikes me as a good
> initial target.

Yes, we will need that for a usable nonce that we don't need to store in
the blocks and WAL files.
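To make that concrete, here is a rough sketch (in the spirit of the
above, not a worked-out design) of deriving both a key number and a
per-page nonce purely from metadata we already have; the field choices
and the 1024-key split are hypothetical:

    /*
     * Hypothetical sketch only: build a 16-byte IV for an 8kB relation
     * page from metadata that is already implicit in the file path and
     * block offset, so the nonce never has to be stored in the page.
     */
    #include <stdint.h>
    #include <string.h>

    static void
    build_page_iv(uint8_t iv[16], uint32_t spc_oid, uint32_t db_oid,
                  uint32_t rel_number, uint32_t block_num)
    {
        uint32_t    parts[4] = {spc_oid, db_oid, rel_number, block_num};

        memcpy(iv, parts, sizeof(parts));   /* unique per page */
    }

    /*
     * Hypothetical: pick one of 1024 keys from the low bits of the
     * block number, as floated earlier in the thread.
     */
    static int
    key_index_for_block(uint32_t block_num)
    {
        return (int) (block_num & 1023);
    }

Every input above comes from the filesystem layout and block offset, so
nothing new has to be written into the 8kB page or the WAL.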

> > Anyway, I will research the reasonable data size that can be secured
> > with a single key via AES.  I will look at how PGP encrypts large files
> > too.
> 
> This seems unlikely to lead to a definitive result, but it would be
> interesting to hear if there have been studies around that and what
> their conclusions were.

I found this:

    https://crypto.stackexchange.com/questions/44113/what-is-a-safe-maximum-message-size-limit-when-encrypting-files-to-disk-with-aes
    https://crypto.stackexchange.com/questions/20333/encryption-of-big-files-in-java-with-aes-gcm/20340#20340

The numbers listed are:

    Maximum Encrypted Plaintext Size:  68GB
    Maximum Processed Additional Authenticated Data: 2 x 10^18 bytes

The 68GB value is "the maximum bits that can be processed with a single
key/IV(nonce) pair."  We would encrypt only 8k of data with each
key/nonce pair, one nonce per 8k page, so we are far below that limit.
If we assume a unique nonce per page, a single key can cover on the
order of 10^32 bytes of table data before a nonce must repeat.

For the WAL we would probably use a different nonce for each 16MB
segment, so we would be OK there too: that gives us on the order of
10^36 bytes before the segment number causes the nonce to repeat.
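
For reference, the arithmetic behind those two figures, assuming a
standard 96-bit AES-GCM nonce (an assumption; we have not settled on a
mode or nonce size), is roughly:

    2^96 nonces                        ~= 7.9 x 10^28
    7.9 x 10^28 * 8kB per heap page    ~= 6.5 x 10^32 bytes
    7.9 x 10^28 * 16MB per WAL segment ~= 1.3 x 10^36 bytes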

> When it comes to concerns about autovacuum or other system processes,
> those don't have any direct user connections or interactions, so having
> them be more privileged and having access to more is reasonable.

Well, I am trying to understand the value of having some keys accessible
by some parts of the system, and some not.  I am unclear what security
value that has.

> Ideally, all of this would leverage a vaulting system or other mechanism
> which manages access to the keys and allows their usage to be limited.
> That's been generally accepted as a good way to bridge the gap between
> having to ask users every time for a key and having keys stored
> long-term in memory.  Having *only* the keys for the data which the
> currently connected user is allowed to access would certainly be a great
> initial capability, even if system processes (including potentially WAL
> replay) have to have access to all of the keys.  And yes, shared buffers
> being unencrypted and accessible by every backend continues to be an
> issue- it'd be great to improve on that situation too.  I don't think
> having everything encrypted in shared buffers is likely the solution,
> rather, segregating it up might make more sense, again, along similar
> lines to keys and using metadata that's outside of the catalogs, which
> has been discussed previously, though I don't think anyone's actively
> working on it.

What is this trying to protect against?  Without a clear case, I don't
see what that complexity is buying us.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


