Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | 20190706160514.b67q4f7abcxfdahk@momjian.us Whole thread Raw |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
|
List | pgsql-hackers |
On Fri, Jul 5, 2019 at 10:24:39PM +0200, Tomas Vondra wrote: > I agree this is a pretty crucial challenge, and those requirements seem > in direct conflict. Users use encryption to protect privacy of the data, > but we need access to some of the data to implement some of the > important tasks of a RDBMS. > > And it's not just about things like recovery or replication. How do you > do ANALYZE on encrypted data? Sure, if a user runs it in a session that > has the right key, that's fine. But what about autovacuum/autoanalyze? There might be a way to defer ANALYZE and autovacuum/autoanalyze, but what about VACUUM FREEZE? We can defer that too, but not the clog truncation that is eventually the product of the freeze. What about referential integrity constraints that need to check primary keys in the encrypted tables? I also don't see a way of delaying that, and if you can't do referential integrity into the encrypted tables, it reduces the value of having encrypted data in the same database rather than in another database or cluster? I still feel we have not clearly described what the options are: 1. Encrypt everything 2. Encrypt only some tables (for performance reasons), and use only one key, or use multiple keys to allow for key rotation. All keys are always unlocked. 3. Encrypt only some tables with different keys, and the keys are not always unlocked. As Tomas already stated, using tablespaces to distinguish encrypted from non-encrypted tables doesn't make sense since the file system used for the storage is immaterial to the encryptions status. An easier way would be to just add a bit to WAL that would indicate if the rest of the WAL record is encrypted, though I doubt the performance boost is worth the complexity. I see the attraction of #3, but operationally it is unclear how we can decouple data that is not always accessible, as outlined above. We could probably work around the WAL issues, but it is going to be much more overhead. It is also unclear how the user supplies the keys --- are they done at boot time, and if so, how are later keys unlocked, or does the client provide it? If the client provides it, isn't it better to do client-side encryption, or have the client use pgcrypto with some key management around it like pgcryptokey? This presentation shows how to use triggers to implement transparent encryption at the column level: https://momjian.us/main/writings/crypto_hw_use.pdf#page=77 Structurally, I understand the desire to push key control out to users in #3, but it then becomes very complex to construct a system where the data is tightly coupled in a Postgres cluster. The pgcrypto method above works because it decouples row control, like xmin/xmax and WAL replay, which is not encrypted, with the row payload, which is encrypted. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
pgsql-hackers by date: