Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS) - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Msg-id CAD21AoAPQGgVPAg2HOy856qzWEjOTQFiajucAtveyq8xdnzDDg@mail.gmail.com
In response to Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Sat, Aug 10, 2019 at 12:18 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On Fri, Aug 09, 2019 at 11:51:23PM +0900, Masahiko Sawada wrote:
> >On Fri, Aug 9, 2019 at 10:25 AM Bruce Momjian <bruce@momjian.us> wrote:
> >>
> >> On Thu, Aug  8, 2019 at 06:31:42PM -0400, Stephen Frost wrote:
> >> > > >Crash recovery doesn't happen "all the time" and neither does vacuum
> >> > > >freeze, and autovacuum processes are independent of individual client
> >> > > >backends- we don't need to (and shouldn't) have the keys in shared
> >> > > >memory.
> >> > >
> >> > > Don't people do physical replication / HA pretty much all the time?
> >> >
> >> > Strictly speaking, that isn't actually crash recovery, it's physical
> >> > replication / HA, and while those are certainly nice to have it's no
> >> > guarantee that they're required or that you'd want to have the same keys
> >> > for them- conceptually, at least, you could have WAL with one key that
> >> > both sides know and then different keys for the actual data files, if we
> >> > go with the approach where the WAL is encrypted with one key and then
> >> > otherwise is plaintext.
> >>
> >> Uh, yes, you could have two encryption keys in the data directory, one
> >> for heap/indexes, one for WAL, both unlocked with the same passphrase,
> >> but what would be the value in that?
> >>
> >> > > >>That might allow crash recovery and the freeze part of VACUUM FREEZE to
> >> > > >>work.  (I don't think we could vacuum since we couldn't read the index
> >> > > >>pages to find the matching rows since the index values would be encrypted
> >> > > >>too.  We might be able to not encrypt the tid in the index tuple.)
> >> > > >
> >> > > >Why do we need the indexed values to vacuum the index..?  We don't
> >> > > >today, as I recall.  We would need the tids though, yes.
> >> > >
> >> > > Well, we also do collect statistics on the data, for example. But even
> >> > > if we assume we wouldn't do that for encrypted indexes (which seems like
> >> > > a pretty bad idea to me), you'd probably end up leaking information
> >> > > about ordering of the values. Which is generally a pretty serious
> >> > > information leak, AFAICS.
> >> >
> >> > I agree entirely that order information would be bad to leak- but this
> >> > is all new ground here and we haven't actually sorted out what such a
> >> > partially encrypted btree would look like.  We don't actually have to
> >> > have the down-links in the tree be unencrypted to allow vacuuming of
> >> > leaf pages, after all.
> >>
> >> Agreed, but I think we kind of know that the value in cluster-wide
> >> encryption is different from multi-key encryption --- both have their
> >> value, but right now cluster-wide is the easiest and simplest, and
> >> probably meets more user needs than multi-key encryption.  If others
> >> want to start scoping out what multi-key encryption would look like, we
> >> can discuss it.  I personally would like to focus on cluster-wide
> >> encryption for PG 13.
> >
> >I agree that cluster-wide is simpler, but I'm not sure that it
> >meets real needs from users. One example is re-encryption; when
> >key leakage happens, with cluster-wide encryption we end up
> >re-encrypting the whole database regardless of the amount of
> >user-sensitive data in the database. I think it's a big constraint
> >for users because it's common that the amount of data, such as master
> >tables, that needs to be encrypted doesn't account for a large portion
> >of the database. That's one reason why I think finer-granularity
> >encryption, such as table/tablespace level, is required.
> >
>
> TBH I think it's mostly pointless to design for key leakage.
>
> My understanding is that all this work is motivated by the assumption that
> Bob can obtain access to the data directory (say, a backup of it). So if
> he also manages to get access to the encryption key, we probably have to
> assume he already has access to current snapshot of the data directory,
> which means any re-encryption is pretty futile.
>
> What we can (and should) optimize for is key rotation, but as that only
> changes the master key and not the actual encryption keys, the overhead is
> pretty low.
>
> We can of course support "forced" re-encryption, but I think it's
> acceptable if that's fairly expensive as long as it can be throttled and
> executed in the background (kinda similar to the patch to enable checksums
> in the background).

I'm not sure that we can ignore the risk of MDEK leakage. Once the MDEK
has leaked, for whatever reason, all that is left for the attacker is to
steal the data. A user who realizes that the MDEK has leaked will have to
re-encrypt the data. Even if the data has already been stolen, the user
will want to re-encrypt it to protect against further attacks. KEK
rotation is futile in this case, because it only re-wraps the MDEK and
leaves the data encrypted with the leaked key.
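
To make the difference between KEK rotation and re-encryption concrete, here is a minimal sketch. It assumes a simplified two-tier layout in which a KEK wraps the MDEK and the MDEK encrypts data pages directly. The names toy_cipher, rotate_kek and reencrypt are hypothetical, and the cipher is a toy XOR keystream built from SHA-256, not the cipher a real implementation would use.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Toy stand-in for a real cipher (NOT secure); applying it twice
    with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def rotate_kek(wrapped_mdek: bytes, old_kek: bytes, new_kek: bytes) -> bytes:
    """KEK rotation: unwrap the MDEK and wrap it again; data is untouched."""
    return toy_cipher(new_kek, toy_cipher(old_kek, wrapped_mdek))


def reencrypt(page: bytes, old_mdek: bytes, new_mdek: bytes) -> bytes:
    """Re-encryption: every data page must be rewritten under a new MDEK."""
    return toy_cipher(new_mdek, toy_cipher(old_mdek, page))


plaintext = b"sensitive row data"
kek, mdek = secrets.token_bytes(32), secrets.token_bytes(32)
stored_page = toy_cipher(mdek, plaintext)
wrapped_mdek = toy_cipher(kek, mdek)

# The MDEK leaks. Rotating the KEK changes only the wrapped key on disk.
new_kek = secrets.token_bytes(32)
wrapped_mdek = rotate_kek(wrapped_mdek, kek, new_kek)
readable_after_rotation = toy_cipher(mdek, stored_page) == plaintext

# Only re-encrypting the page under a fresh MDEK locks the leaked key out.
new_mdek = secrets.token_bytes(32)
stored_page = reencrypt(stored_page, mdek, new_mdek)
readable_after_reencrypt = toy_cipher(mdek, stored_page) == plaintext
```

In this sketch the leaked MDEK still decrypts the stored page after the KEK has been rotated, and stops working only once the page itself has been rewritten under a new MDEK. The cost of that rewrite grows with the amount of encrypted data, which is where the granularity of encryption matters.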

>
> >And in terms of feature development, would we implement
> >fine-granularity encryption in the future even if the first step is
> >cluster-wide encryption? And would both TDEs encrypt the same kinds of
> >database objects (i.e. only tables, indexes and WAL)? If so, how
> >would users choose between them depending on the case?
> >
> >I imagined the case where we had cluster-wide encryption as the
> >first TDE feature. We would enable TDE at initdb time by specifying a
> >command-line parameter for TDE. Then TDE is enabled cluster-wide
> >and all tables/indexes and WAL are automatically encrypted. Then, if
> >we want to implement finer-granularity encryption, how can we
> >make users use it? WAL encryption and table/index encryption are
> >enabled at the same time, but we want to enable encryption for
> >particular tables/indexes after initdb. If cluster-wide encryption
> >is something like a shortcut for encrypting all tables/indexes, I
> >personally think that implementing the finer-granularity one first
> >and then using it to achieve the coarser granularity would be
> >easier.
> >
>
> Not sure, but I'd expect it to be the other way around, i.e. the more
> granular encryption being more complicated. One reason is that with
> cluster-wide you can just assume everything is encrypted and handle it the
> same way, while with fine-grained encryption you need to check whether each
> individual object is encrypted, maybe handle it in different ways, etc.
>
> But that's just my guess, really.
>

I was talking about the case where we want to implement both
features (i.e., cluster-wide for encrypting everything and
table/tablespace-level for finer-granularity encryption). If we want
only one of them, cluster-wide is easier, as you mentioned.
But if we want both, I think that implementing finer-granularity
encryption first and using it to achieve coarse-granularity
encryption would be easier.
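
As a rough illustration of that ordering, the sketch below treats the per-relation setting as the primitive and cluster-wide encryption as a single flag that overrides it. The EncryptionCatalog class and its fields are hypothetical and exist only for this example; they do not correspond to any existing PostgreSQL catalog or patch.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class EncryptionCatalog:
    """Hypothetical record of which relations are encrypted."""

    # Cluster-wide mode: a shortcut that marks every relation as encrypted.
    cluster_wide: bool = False
    # Fine-grained mode: per-relation settings, e.g. chosen after initdb.
    per_relation: Dict[str, bool] = field(default_factory=dict)

    def is_encrypted(self, relation: str) -> bool:
        """Decide per relation; cluster-wide simply answers yes for all."""
        if self.cluster_wide:
            return True
        return self.per_relation.get(relation, False)


# Fine-grained use: only the sensitive table is encrypted.
fine = EncryptionCatalog(per_relation={"accounts": True})

# Cluster-wide use: the same lookup path, with the shortcut flag set.
coarse = EncryptionCatalog(cluster_wide=True)
```

With this shape the code paths that read and write pages ask one question per relation in both modes, so the cluster-wide feature adds no separate mechanism.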

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


