Re: Key management with tests - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Key management with tests |
Date | |
Msg-id | CA+TgmobUS2Y2OXRuQGs_i1ReJO2fJmLKm=sZ1+aGRqc42oPYPQ@mail.gmail.com Whole thread Raw |
In response to | Re: Key management with tests (Tom Kincaid <tomjohnkincaid@gmail.com>) |
Responses |
Re: Key management with tests
|
List | pgsql-hackers |
On Mon, Jan 18, 2021 at 2:00 PM Tom Kincaid <tomjohnkincaid@gmail.com> wrote: > Some of the missing things you mention above are about the design of > TDE feature in general. However, this patch is about Key Management > which is going part of the larger TDE feature. So it feels as though > there is the need for a general design document about the overall > vision / approach for TDE and a specific design doc. for Key > Management. Is it appropriate to include both of those in the same > patch? To me, it wouldn't make sense to commit a full README for a TDE feature that we don't have yet with a key management patch, but the way that they'll interact with each other has to be clear. The doc/database-encryption.sgml file that Bruce included in the patch is a decent start on explaining the design, though I think it needs more work and more details, perhaps including some of the things Andres mentioned. To be honest, after reading over that SGML documentation a bit, I'm somewhat skeptical about whether it really makes sense to think about committing the key management part separately. It seems to have no use independent of the main feature, and it in fact embeds very specific details of how the main feature is expected to work. For example, the documentation says that key #0 will be used for data files, and key #1 for WAL. There seems to be no suggestion that the key management portion of this can be used to manage encryption keys generally for whatever purposes someone might have; it's all about the needs of a particular TDE implementation. Typically, we would not commit something like that separately, or only once the main patch was done, with the two commits occurring in a relatively short time period. Otherwise, as Bruce already noted, we can end up with something that is documented and visible to users but doesn't actually work yet. Some more specific comments on data-encryption.sgml: * The documentation explains that the purpose of having a WAL key separate from the data file key is so that the data file keys can "eventually" be rotated. It's not clear whether this means that we might eventually have that feature or that we might eventually be able to rotate, after failing over. If this kind of thing is possible, we'll eventually need documentation on how to do it. * The reasons for use a DEK and a KEK are not explained. I realize it's not an uncommon practice and that other systems do it, but I think a few sentences of explanation wouldn't be a bad idea. Even if we are supposing that hackers who want to have input into this feature have to be knowledgeable about cryptography, I don't think we can reasonably suppose that for users. * "For example" is at one point followed by a period rather than a colon or comma. * In the "Internals" subsection, the last sentence doesn't seem to be grammatical. I wonder if it's missing the word "or"'. * The part about integrity-checking keys on startup isn't clear. It makes it sound like we still have a copy of the KEK lying around someplace against which we can compare, which I assume is not the case since it would be really insecure. * I think it's going to be pretty important that we can easily switch to other cryptographic algorithms as they are discovered, so I don't like the fact that this is tied specifically to AES. (In fact, kmgr_utils.h makes it sound like we're specifically locked into AES256, but that contradicts the documentation, so I guess there's some clarification needed here about what exactly KMGR_CLUSTER_KEY_LEN is doing.) As far as possible we should try to make this generic, like supporting any cipher that SSL has which has property X. It seems relatively inevitable that every currently popular cryptographic algorithm will at some point in the future be judged weak and worthless, just as has already happened with MD5 and some variants of SHA, both of which used to be considered state of the art. It seems equally inevitable that new and stronger algorithms will continued to be devised, and we'll want to adopt those easily. I'm not sure to what extent this a serious flaw in the patch and to what extent it's just a matter of tweaking the wording of some things, but I think this is actually an extremely critical design point where we had better be certain we've got it right. Few things would be sadder than to get a TDE feature and then have to rip it out again because it couldn't be upgraded to work with newer crypto algorithms with reasonable effort. Notes on other parts of the documentation: * The documentation for initdb -K doesn't list the valid values of the parameter, only the default. Probably we should be specifying an algorithm here and not just a bit count. Otherwise, like I say above, what happens when AES gives way to something else? It'd be easy to say -K BFT256 instead of -K AES256, but if AES is assumed and it's no longer what we want them we have problems. This kind of thing probably needs to be cleaned up in a bunch of places. * I don't see the point of saying "a passphrase or PIN." We don't need to document that your passphrase might happen to only contain digits. * pg_alterckey's description of "repair" is hard to understand. It doesn't really explain why or how this would be necessary, and it begs the question of why we'd ever leave things in a state that requires repair. This is sketched out in code comments elsewhere, but I think at least some of it needs to be explained in the documentation as well. (Incidentally, I don't think the comments at the top of recover_failure will survive a visit from pgindent, though I might be wrong about that.) * The changes to config.sgml say "Sample script" instead of "Sample scripts". * I don't think that the documentation of %R is very clear, or adequate for someone to make effective use of it. If I wanted to use %R, how would I ensure that a value is available? * The changes to allfiles.sgml add pg_alterckey.sgml in the wrong place and include an incorrect whitespace change. * It's odd that "pg_alterckey" describes itself as "technically" changing the KEK. Isn't that just what it does, not a technicality? I imagine we'll ultimately need a way to change a DEK as well, because otherwise the use of a separate key for the WAL wouldn't accomplish the intended goal. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: