Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Stephen Frost |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | 20190808223142.GI16436@tamriel.snowman.net |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
List | pgsql-hackers |
Greetings,

* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
> On Thu, Aug 08, 2019 at 03:07:59PM -0400, Stephen Frost wrote:
> >* Bruce Momjian (bruce@momjian.us) wrote:
> >>On Tue, Jul 9, 2019 at 11:09:01AM -0400, Bruce Momjian wrote:
> >>> On Tue, Jul 9, 2019 at 10:59:12AM -0400, Stephen Frost wrote:
> >>> > * Bruce Momjian (bruce@momjian.us) wrote:
> >>> > I agree that all of that isn't necessary for an initial implementation,
> >>> > I was rather trying to lay out how we could improve on this in the
> >>> > future and why having the keying done at a tablespace level makes sense
> >>> > initially because we can then potentially move forward with further
> >>> > segregation to improve the situation. I do believe it's also useful in
> >>> > its own right, to be clear, just not as nice since a compromised backend
> >>> > could still get access to data in shared buffers that it really
> >>> > shouldn't be able to, even broadly, see.
> >>>
> >>> I think TDE is a feature of questionable value at best and the idea that
> >>> we would fundamentally change the internals of Postgres to add more
> >>> features to it seems very unlikely. I realize we have to discuss it so
> >>> we don't block reasonable future feature development.
> >>
> >>I have a new crazy idea. I know we concluded that allowing multiple
> >>independent keys, e.g., per user, per table, didn't make sense since
> >>they have to be unlocked all the time, e.g., for crash recovery and
> >>vacuum freeze.
> >
> >I'm a bit confused as I never agreed that made any sense and I continue
> >to feel that it doesn't make sense to have one key for everything.
> >
> >Crash recovery doesn't happen "all the time" and neither does vacuum
> >freeze, and autovacuum processes are independent of individual client
> >backends- we don't need to (and shouldn't) have the keys in shared
> >memory.
>
> Don't people do physical replication / HA pretty much all the time?

Strictly speaking, that isn't actually crash recovery, it's physical
replication / HA. While those are certainly nice to have, there's no
guarantee that they're required or that you'd want to have the same keys
for them. Conceptually, at least, you could have WAL with one key that
both sides know and then different keys for the actual data files, if we
go with the approach where the WAL is encrypted with one key and then
otherwise is plaintext.

> >>However, that assumes that all heap/index pages are encrypted, and all
> >>of WAL. What if we encrypted only the user-data part of the page, i.e.,
> >>tuple data. We left xmin/xmax unencrypted, and only stored the
> >>encrypted part of that data in WAL, and didn't encrypt any more of WAL.
> >
> >This is pretty much what Alvaro was suggesting a while ago, isn't it..?
> >Have just the user data be encrypted in the table and in the WAL stream.
>
> It's also moving us much closer to pgcrypto-style encryption ...

Yes, it is, and there are good parts and bad parts to that, to be sure.

> >>That might allow crash recovery and the freeze part of VACUUM FREEZE to
> >>work. (I don't think we could vacuum since we couldn't read the index
> >>pages to find the matching rows since the index values would be encrypted
> >>too. We might be able to not encrypt the tid in the index tuple.)
> >
> >Why do we need the indexed values to vacuum the index..? We don't
> >today, as I recall. We would need the tids though, yes.
>
> Well, we also do collect statistics on the data, for example. But even
> if we assume we wouldn't do that for encrypted indexes (which seems like
> a pretty bad idea to me), you'd probably end up leaking information
> about ordering of the values. Which is generally a pretty serious
> information leak, AFAICS.

I agree entirely that order information would be bad to leak- but this
is all new ground here and we haven't actually sorted out what such a
partially encrypted btree would look like. We don't actually have to
have the down-links in the tree be unencrypted to allow vacuuming of
leaf pages, after all.

> >>Is this something to consider in version one of this feature? Probably
> >>not, but later? Never? Would the information leakage be too great,
> >>particularly from indexes?
> >
> >What would be leaking from the indexes..? That an encrypted blob in the
> >index pointed to a given tid? Wouldn't someone be able to see that same
> >information by looking directly at the relation too?
>
> Ordering of values, for example. Depending on how exactly the data is
> encrypted we might also be leaking information about which values are
> equal, etc. It also seems quite a bit more expensive to use such index.

Using an encrypted index isn't going to be free. It's not clear that
this would be much more expensive than if the entire index is encrypted,
or that people would actually be unhappy about such an additional
expense if it meant that they could have vacuum run without the keys.

Thanks,

Stephen
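
To make the index point concrete, here is a minimal Python sketch of the two sides of that trade-off. Everything in it is an assumption for illustration: the leaf-entry layout (an opaque encrypted blob plus a plaintext tid), the names IndexEntry, toy_encrypt and bulk_delete, and the key string are all hypothetical, and the HMAC is only a stand-in for some deterministic encryption scheme, not real encryption and not anything proposed in the thread. It shows that removing dead entries needs only the tids, and that a deterministic scheme would still let an observer see which indexed values are equal.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical leaf entry: encrypted value, plaintext tid."""
    key_blob: bytes  # indexed value, opaque to anyone without the key
    tid: tuple       # (block number, offset), left unencrypted


def toy_encrypt(key: bytes, value: bytes) -> bytes:
    """Stand-in for deterministic encryption. NOT real encryption:
    an HMAC cannot be decrypted, it only mimics 'same input, same blob'."""
    return hmac.new(key, value, hashlib.sha256).digest()


def bulk_delete(leaf_entries, dead_tids):
    """Drop entries pointing at dead heap tuples.
    Only the tid is inspected; key_blob is never read."""
    return [e for e in leaf_entries if e.tid not in dead_tids]


key = b"hypothetical-table-key"  # known to backends, not to vacuum
leaf = [
    IndexEntry(toy_encrypt(key, b"alice"), (0, 1)),
    IndexEntry(toy_encrypt(key, b"bob"), (0, 2)),
    IndexEntry(toy_encrypt(key, b"alice"), (1, 1)),
]

# Vacuum side: no key is used here, only the dead tids found in the heap.
remaining = bulk_delete(leaf, dead_tids={(0, 2)})

# Leak side: without the key, an observer can still tell that the first
# and third entries hold the same value, because their blobs match.
equal_values_visible = leaf[0].key_blob == leaf[2].key_blob
```

A randomized scheme (a fresh nonce per entry) would hide the equality, but then the blobs could not be compared or ordered without the key either, which is the open question about what a partially encrypted btree would look like.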