Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Msg-id 20190808222100.hbe2t32rhhs6sk6g@development
In response to Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Stephen Frost <sfrost@snowman.net>)
Responses Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
List pgsql-hackers
On Thu, Aug 08, 2019 at 03:07:59PM -0400, Stephen Frost wrote:
>Greetings,
>
>* Bruce Momjian (bruce@momjian.us) wrote:
>> On Tue, Jul  9, 2019 at 11:09:01AM -0400, Bruce Momjian wrote:
>> > On Tue, Jul  9, 2019 at 10:59:12AM -0400, Stephen Frost wrote:
>> > > * Bruce Momjian (bruce@momjian.us) wrote:
>> > > I agree that all of that isn't necessary for an initial implementation,
>> > > I was rather trying to lay out how we could improve on this in the
>> > > future and why having the keying done at a tablespace level makes sense
>> > > initially because we can then potentially move forward with further
>> > > segregation to improve the situation.  I do believe it's also useful in
>> > > its own right, to be clear, just not as nice since a compromised backend
>> > > could still get access to data in shared buffers that it really
>> > > shouldn't be able to, even broadly, see.
>> >
>> > I think TDE is a feature of questionable value at best and the idea that
>> > we would fundamentally change the internals of Postgres to add more
>> > features to it seems very unlikely.  I realize we have to discuss it so
>> > we don't block reasonable future feature development.
>>
>> I have a new crazy idea.  I know we concluded that allowing multiple
>> independent keys, e.g., per user, per table, didn't make sense since
>> they have to be unlocked all the time, e.g., for crash recovery and
>> vacuum freeze.
>
>I'm a bit confused as I never agreed that made any sense and I continue
>to feel that it doesn't make sense to have one key for everything.
>
>Crash recovery doesn't happen "all the time" and neither does vacuum
>freeze, and autovacuum processes are independent of individual client
>backends- we don't need to (and shouldn't) have the keys in shared
>memory.
>

Don't people do physical replication / HA pretty much all the time?


>> However, that assumes that all heap/index pages are encrypted, and all
>> of WAL.  What if we encrypted only the user-data part of the page, i.e.,
>> tuple data.  We left xmin/xmax unencrypted, and only stored the
>> encrypted part of that data in WAL, and didn't encrypt any more of WAL.
>
>This is pretty much what Alvaro was suggesting a while ago, isn't it..?
>Have just the user data be encrypted in the table and in the WAL stream.
>

It's also moving us much closer to pgcrypto-style encryption ...

>> That might allow crash recovery and the freeze part of VACUUM FREEZE to
>> work.  (I don't think we could vacuum since we couldn't read the index
>> pages to find the matching rows since the index values would be encrypted
>> too.  We might be able to not encrypt the tid in the index tuple.)
>
>Why do we need the indexed values to vacuum the index..?  We don't
>today, as I recall.  We would need the tids though, yes.
>

Well, we also do collect statistics on the data, for example. But even
if we assume we wouldn't do that for encrypted indexes (which seems like
a pretty bad idea to me), you'd probably end up leaking information
about ordering of the values. Which is generally a pretty serious
information leak, AFAICS.
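
To make the ordering leak concrete, here is a minimal Python sketch. It
assumes a btree-like leaf level that keeps its entries in plaintext key
order while storing only an opaque blob plus a readable tid. The key, the
opaque() helper and the sample rows are hypothetical, and the keyed hash is
only a stand-in for real encryption.

```python
import hashlib
import hmac

KEY = b"hypothetical-demo-key"  # assumption: illustrative key only


def opaque(value: int) -> str:
    """Stand-in for an encrypted index key: a keyed hash, NOT real encryption."""
    return hmac.new(KEY, str(value).encode(), hashlib.sha256).hexdigest()[:12]


def build_index(rows):
    """Model a btree leaf level: entries stay in plaintext key order,
    the key is replaced by an opaque blob, and the tid is left readable."""
    ordered = sorted(rows, key=lambda row: row[1])
    return [(opaque(value), tid) for tid, value in ordered]


def ranks_seen_by_attacker(index):
    """Without the key, the position of each entry still gives the tid's rank."""
    return {tid: rank for rank, (_blob, tid) in enumerate(index)}


# Hypothetical rows: (tid, salary)
rows = [((0, 1), 52000), ((0, 2), 31000), ((0, 3), 98000)]
index = build_index(rows)
print(ranks_seen_by_attacker(index))
```

The blobs themselves reveal nothing here, yet the reader of the index file
learns that tid (0,2) holds the smallest value and tid (0,3) the largest.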

>> Is this something to consider in version one of this feature?  Probably
>> not, but later?  Never?  Would the information leakage be too great,
>> particularly from indexes?
>
>What would be leaking from the indexes..?  That an encrypted blob in the
>index pointed to a given tid?  Wouldn't someone be able to see that same
>information by looking directly at the relation too?
>

Ordering of values, for example. Depending on how exactly the data is
encrypted we might also be leaking information about which values are
equal, etc. It also seems quite a bit more expensive to use such an index.
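
A small Python sketch of the equality leak, under the assumption that
values are encrypted deterministically (same plaintext and key always give
the same blob). The helpers and sample values are hypothetical, and the
keyed hash with an optional random nonce is a stand-in for a real cipher,
not a proposal for how Postgres would do it.

```python
import hashlib
import hmac
import os
from collections import Counter

KEY = b"hypothetical-demo-key"  # assumption: illustrative key only


def deterministic_blob(value: str) -> bytes:
    """Same plaintext always maps to the same blob (stand-in, not real encryption)."""
    return hmac.new(KEY, value.encode(), hashlib.sha256).digest()


def randomized_blob(value: str) -> bytes:
    """A fresh random nonce per value makes equal plaintexts look different."""
    nonce = os.urandom(16)
    return nonce + hmac.new(KEY, nonce + value.encode(), hashlib.sha256).digest()


def largest_group(blobs) -> int:
    """Size of the biggest set of identical blobs an observer can see."""
    return max(Counter(blobs).values())


values = ["flu", "flu", "cancer", "flu"]  # hypothetical column contents
det = [deterministic_blob(v) for v in values]
rnd = [randomized_blob(v) for v in values]

print(largest_group(det))  # three rows visibly share one value
print(largest_group(rnd))  # no visible duplicates
```

The randomized variant hides the duplicates, but then the stored blobs can
no longer be compared for equality without decrypting them, which is one
way such an index could become more expensive to use.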


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 


