Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Msg-id 20190705202439.tw7tgwld4fnhrovi@development
In response to Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Fri, Jul 05, 2019 at 03:38:28PM -0400, Bruce Momjian wrote:
>On Sun, Jun 16, 2019 at 03:57:46PM -0400, Stephen Frost wrote:
>> Greetings,
>>
>> * Bruce Momjian (bruce@momjian.us) wrote:
>> > On Sun, Jun 16, 2019 at 12:42:55PM -0400, Joe Conway wrote:
>> > > On 6/16/19 9:45 AM, Bruce Momjian wrote:
>> > > > On Sun, Jun 16, 2019 at 07:07:20AM -0400, Joe Conway wrote:
>> > > >> In any case it doesn't address my first point, which is limiting the
>> > > >> volume encrypted with the same key. Another valid reason is you might
>> > > >> have data at varying sensitivity levels and prefer different keys be
>> > > >> used for each level.
>> > > >
>> > > > That seems quite complex.
>> > >
>> > > How? It is no more complex than encrypting at the tablespace level
>> > > already gives you - in that case you get this property for free if you
>> > > care to use it.
>> >
>> > All keys used to encrypt WAL data must be unlocked at all times or crash
>> > recovery, PITR, and replication will stop when it hits a locked key.
>> > Given that, how much value is there in allowing a key per tablespace?
>>
>> There's a few different things to discuss here, admittedly, but I don't
>> think it means that there's no value in having a key per tablespace.
>>
>> Ideally, a given backend would only need, and only have access to, the
>> keys for the tablespaces that it is allowed to operate on.  I realize
>> that's a bit farther than what we're talking about today, but hopefully
>> not too much to be able to consider.
>
>What people really want with more-granular-than-cluster encryption is
>the ability to supply their passphrase key _when_ they want to access
>their data, and then leave and be sure their data is secure from
>decryption.  That will not be possible since the WAL will be encrypted
>and any replay of it will need their passphrase key to unlock it, or the
>entire system will be unrecoverable.
>
>This is a fundamental issue, and will eventually doom any more granular
>encryption approach, unless we want to use the same key for all
>encrypted tablespaces, create separate WALs for each tablespace, or say
>recovery of some tablespaces will fail.  I doubt any of those will be
>acceptable.
>

I agree this is a pretty crucial challenge, and those requirements seem
to be in direct conflict. Users use encryption to protect the privacy of
their data, but we need access to some of that data to implement some of
the important tasks of an RDBMS.

And it's not just about things like recovery or replication. How do you
do ANALYZE on encrypted data? Sure, if a user runs it in a session that
has the right key, that's fine. But what about autovacuum/autoanalyze?

I suspect the issue here is that we're trying to retrofit a solution for
data-at-rest encryption to something that seems closer to protecting
data during execution.

Which is a worthwhile goal, of course, but perhaps we're trying to use
the wrong tool to achieve it? To paraphrase the hammer/nail saying: "If
all you know is block encryption, everything looks like a block."


What if the granular encryption (not the "whole cluster with a single
key") case does not encrypt whole blocks, but just tuple data? Would
that allow at least the most critical WAL use cases (recovery, physical
replication) to work without having to know all the encryption keys?

Of course, that would be much less efficient than plain block
encryption, but that may be the "natural cost" of the feature.

It would not solve e.g. logical replication or ANALYZE, which both
require access to the plaintext data, though.
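
To make the tuple-level idea a bit more concrete, here is a minimal sketch
in Python. It is not PostgreSQL code, and everything in it is an assumption
made for illustration: the WalRecord layout, the helper names, and above all
the toy cipher (SHA-256 used as a keystream, which is not secure and merely
stands in for a real cipher). The only point is that replay can place opaque
tuple payloads without holding any key, while anything that needs the values
themselves (ANALYZE, logical decoding) does need the key.

```python
import hashlib
from dataclasses import dataclass


def toy_xor_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in cipher: XOR with a SHA-256 counter keystream.

    NOT secure, illustration only. XOR is symmetric, so the same call
    both encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big"))
        stream.extend(block.digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass
class WalRecord:
    # Plaintext header: enough for replay to know where the tuple goes.
    page_no: int
    slot: int
    # Opaque payload: tuple data encrypted with a per-tablespace key.
    payload: bytes


def slot_nonce(page_no: int, slot: int) -> bytes:
    # Hypothetical nonce derivation; a real design must also avoid
    # nonce reuse when the same slot is rewritten.
    return page_no.to_bytes(4, "big") + slot.to_bytes(4, "big")


def make_record(key: bytes, page_no: int, slot: int, tuple_data: bytes) -> WalRecord:
    payload = toy_xor_cipher(key, slot_nonce(page_no, slot), tuple_data)
    return WalRecord(page_no, slot, payload)


def replay(wal: list, pages: dict) -> None:
    """Recovery stand-in: applies records without touching any key."""
    for rec in wal:
        pages.setdefault(rec.page_no, {})[rec.slot] = rec.payload


def read_tuple(key: bytes, pages: dict, page_no: int, slot: int) -> bytes:
    """Anything that needs the plaintext values must hold the key."""
    return toy_xor_cipher(key, slot_nonce(page_no, slot), pages[page_no][slot])
```

In this sketch replay() never sees a key, which corresponds to the recovery
and physical replication cases. read_tuple() is the part a keyless background
worker such as autoanalyze could not perform.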


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 


