Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | CAD21AoAoBRO3E9RFqkmz7ieoBoUR0az9UG16mXotcSC28juUAg@mail.gmail.com |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) (Stephen Frost <sfrost@snowman.net>) |
Responses | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
List | pgsql-hackers |
On Fri, Aug 23, 2019 at 11:35 PM Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Bruce Momjian (bruce@momjian.us) wrote:
> > On Fri, Aug 23, 2019 at 07:45:22AM -0400, Stephen Frost wrote:
> > > Having listed out the feature set of each of the other major databases
> > > when it comes to TDE is exactly how we objectively look at what is being
> > > done in the industry, and that then gives us an understanding of what
> > > users (and auditors) coming from other platforms will expect.
> > >
> > > I entirely agree that we shouldn't just copy N feature from X other
> > > database system unless we feel that's the best approach, but when every
> > > other database system out there has capability Y for the general feature
> > > X that we're thinking about implementing, we should be questioning an
> > > approach which doesn't include that.
> >
> > Agreed. The features of other databases are a clear source for what we
> > should consider and run through the useful/reasonable filter.
>
> Following on from that- when other databases don't have something that
> we're thinking about implementing, maybe we should be contemplating if
> it really makes sense as a requirement for us.
>
> Specifically in this case- I went back and tried to figure out what
> other database systems have an "encrypt EVERYTHING" option. I didn't
> have much luck finding one though. So I think we need to ask ourselves-
> the "check box" that we're trying to check off with TDE, do the other
> database system check that box? If so, then it looks like the "check
> box" isn't actually "encrypt EVERYTHING", it's more along the lines of
> "make sure all regular user data is encrypted automatically" or some
> such, and that's a very different requirement, which seems to be
> answered by the other systems by having a KMS + tablespace/database
> level encryption.
> We certainly shouldn't be putting a lot of effort
> into building something that is either overkill or won't be interesting
> to users due to limitations like "have to take the entire cluster
> offline to re-key it".
>
> Now, that KMS has to be encrypted using a master key, of course, and we
> have to make sure that it is able to survive across a crash, and it'd
> sure be nice if it was indexed. One option for such a KMS would be
> something entirely external (which could potentially just be another PG
> database or something) but it'd be nice if we had something built-in.
> We might also want it to be replicated (or maybe we don't, as was
> discussed on the call, to allow for a replica to use an independent set
> of keys- of course that leads to issues with pg_rewind and such though).

I think most users would expect a physical standby server to use the
same keys as the primary, at least for the master key. Otherwise they
would have to switch to a different key at every failover. The same
goes for WAL encryption keys: it's common for restore_command to fetch
(e.g. with scp) archived WAL files produced by the primary, so the
standby needs to use the same keys, or at least know them. In logical
replication, on the other hand, I think we would send unencrypted data
and encrypt it on the subscriber, which is initialized as a separate
database cluster, so the two sides can use different keys.

> Anything built-in does seem like it'd be a fair bit of work to get it to
> address those requirements, but that does seem to be what the other
> database systems have done. Unfortunately, their documentation doesn't
> seem to really say exactly what they've done to address that.

I guess that this depends on the number of encryption keys we use. If
we have encryption keys per tablespace or per database, the number of
keys would be at most several dozen or several hundred.
In that case it's enough to keep them in a flat file on disk and load
them into a hash table in shared memory; we would not need a complex
mechanism. OTOH if we have a key per table, we would need to consider
indexes and buffering, as the keys might not fit in memory.

> A couple random ideas that probably won't work, but I'll put them out
> there for others to shoot down-
>
> Some kind of 2-phase WAL pass, where we do WAL replay for the
> non-encrypted bits first (which would include the KMS) and then go back
> and WAL replay the encrypted stuff. Seems terrible.
>
> An independent WAL for the KMS only. Ugh, do we need another walwriter
> then? and buffers, and lots of other stuff.
>
> Some kind of flat-file based approach with a temp file and renaming of
> files using durable_rename(), like what we used to do with
> pg_shadow/authid, and now do with replorigin_checkpoint and such?

The PoC patch I created does that for the keyring file. On key
rotation, the corresponding WAL record contains all the re-encrypted
keys along with the master key identifier, and recovery restores the
keyring file from it. One good point of this approach is that external
tools and the startup process can read the file easily: it doesn't
require backend code such as the system cache or heap access functions.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
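
For illustration only, the flat-file idea discussed above (write a
temp file, rename it over the old one, then load the keys into an
in-memory table) could be sketched as follows. This is not code from
the PoC patch: the record layout, the function names write_keyring and
load_keyring, and the use of a Python dict in place of a shared-memory
hash table are all assumptions made for the example, and the
fsync/rename sequence only approximates what durable_rename() is for.
It assumes a POSIX system.

```python
import os
import struct
import tempfile

# Hypothetical record layout (not the PoC patch's format):
# 4-byte tablespace OID, 2-byte key length, then the wrapped key bytes.
_HEADER = struct.Struct("<IH")


def write_keyring(path, keys):
    """Atomically replace the keyring file with `keys` (OID -> wrapped key)."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".keyring.tmp.")
    try:
        with os.fdopen(fd, "wb") as f:
            for oid, wrapped_key in sorted(keys.items()):
                f.write(_HEADER.pack(oid, len(wrapped_key)))
                f.write(wrapped_key)
            f.flush()
            os.fsync(f.fileno())  # contents are on disk before the rename
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def load_keyring(path):
    """Read the whole keyring file into a dict (stand-in for a hash table)."""
    keys = {}
    with open(path, "rb") as f:
        data = f.read()
    offset = 0
    while offset < len(data):
        oid, length = _HEADER.unpack_from(data, offset)
        offset += _HEADER.size
        keys[oid] = data[offset:offset + length]
        offset += length
    return keys
```

With only a few hundred per-tablespace keys the whole file fits
comfortably in memory, and because a reader needs nothing beyond the
file format, an external tool could parse it without any backend
infrastructure.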