Re: Proposed patch for key management - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: Proposed patch for key management |
Date | |
Msg-id | alpine.DEB.2.22.394.2101012009090.3573723@pseudo Whole thread Raw |
In response to | Re: Proposed patch for key management (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: Proposed patch for key management
Re: Proposed patch for key management |
List | pgsql-hackers |
Hello Stephen, >> I'm unsure about what is the "common use case". ISTM that having the >> postgres process holding the master key looks like a "no" for me in any use >> case with actual security concern: as the database must be remotely >> accessible, a reasonable security assumption is that a pg account could be >> compromised, and the "master key" (if any, that is just one particular >> cryptographic design) should not be accessible in that case. The first >> barrier would be pg admin account. With external, the options are that the >> key is hold by another id, so the host must be compromised as well, and >> could even be managed remotely on another host (I agree that this later >> option would be fragile because of the implied network connection, but it >> would be a tradeoff depending on the security concerns, but it pretty much >> correspond to the kerberos model). > > No, this isn't anything like the Kerberos model and there's no case > where the PG account won't have access to the DEK (data encryption key) > during normal operation (except with the possibility of off-loading to a > hardware crypto device which, while interesting, is definitely something > that's of very limited utility and which could be added on as a > capability later anyway. This is also well understood by those who are > interested in this capability and it's perfectly acceptable that PG will > have, in memory, the DEK. I'm sorry that I'm not very clear. My ranting is having a KEK "master key" in pg memory. I think I'm fine at this point with having DEKs available on the host to code/decode files. What I meant about kerberos is that the setup I was describing was making the database dependent on a remote host, which is somehow what keroberos does. > The keys which are stored on disk with the proposed patch are encrypted > using a KEK which is external to PG- that's all part of the existing > patch and design. Yep. My point is that the patch hardwires a cryptographic design with many assumptions, whereas I think it should design an API compatible with a larger class of designs, and possibly implement one with KEK and so on, I'm fine with that. >>> Adding a pre-defined system will not prevent future work on a >>> completely external option. >> >> Yes and no. >> >> Having two side-by-side cluster-encryption scheme in core, the first that >> could be implemented on top of the second without much effort, would look >> pretty weird. Moreover, required changes in core to allow an API are the >> very same as the one in this patch. The other concern I have with doing the >> cleaning work afterwards is that the API would be somehow in the middle of >> the existing functions if it has not been thought of before. > > This has been considered and the functions which are proposed are > intentionally structured to allow for multiple implementations already, Does it? This is unclear to me from looking at the code, and the limited documentation does not point to that. I can see that the "kek provider" and the "random provider" can be changed, but the overall cryptographic design seems hardwired. > so it's unclear to me what the argument here is. The argument is that professional cryptophers do wrong designs, and I do not see why you would do different. So instead of hardwiring choice that you think are THE good ones, ISTM that pg should design a reasonably flexible API, and also provide an implementation, obviously. > Further, we haven't even gotten to actual cluster encryption yet- this > is all just the key management side of things, With hardwired choices about 1 KEK and 2 DEK that are debatable, see below. > which is absolutely a necessary and important part but argueing that it > doesn't address the cluster encryption approach in core simply doesn't > make any sense as that's not a part of this patch. > > Let's have that discussion when we actually get to the point of talking > about cluster encryption. > >>> I know archive_command is completely external, but that is much simpler >>> than what would need to be done for key management, key rotation, and key >>> verification. >> >> I'm not sure of the design of the key rotation stuff as I recall it from the >> patches I looked at, but the API design looks like the more pressing issue. >> First, the mere existence of an "master key" is a cryptographic choice which >> is debatable, because it creates issues if one must be able to change it, >> hence the whole "key rotation" stuff. ISTM that what core needs to know is >> that one particular key is changed by a new one and the underlying file is >> rewritten. It should not need to know anything about master keys and key >> derivation at all. It should need to know that the user wants to change file >> keys, though. > > No, it's not debatable that a KEK is needed, I disagree entirely. ISTM that there is cryptographic design which suits your needs and you seem to decide that it is the only possible way to do it It seems that you know. As a researcher, I know that I do not know:-) As a software engineer, I'm trying to suggest enabling more with the patch, including not hardwiring a three-key management scheme. I'm advocating that you pg not decide for others what is best for them, but rather allow multiple choices and implementations through an API. Note that I'm not claiming that the 1 KEK + 2 DEK + AES… design is bad. I'm claiming that it is just one possible design, quite peculiar though, and that pg should not hardwire it, and it does not need to anyway. The documentation included in the key management patch states: "Key zero is the key used to encrypt database heap and index files which are stored in the file system, plus temporary files created during database operation." Read as is, it suggests that the same key is used to encrypt many files. From a cryptographic point of view this looks like a bad idea. The mode used is unclear. If this is a stream cypher generated in counter mode, it could be a very bad idea. Hopefully this is not what is intended, but the included documentation is frankly misleading in that case. IMHO there should be one key per file. Also, the CTR mode should be avoided if possible because it has quite special properties, unless these properties are absolutely necessary. From what I see in the patch, the DEKs cannot be changed, only the KEK which encodes de DEK. I cannot say I'm thrilled by such a property. If the KEK is lost, then probably the DEKs are lost as well. Changing the KEK does not change anything about it, so that the attacker would be able to retrieve the contents of later/previous state of the database if they can retrieve a previous/later encoded version, even if the KEK is changed every day. This is a bad property which somehow provides a false sense of security. I'm not sure that it makes much sense to change a KEK without changing the associated DEK, security-wise. Anyway, I'd suggest to provide a script to re-encrypt a cluster, possibly while offline, with different DEKs. From an audit perspective, this could be a requirement. I'd expect a very detailed README or documentation somewhere, but there are none in the committed/uncommitted patch. In https://wiki.postgresql.org/wiki/Transparent_Data_Encryption, I see a lot of minute cryptographic design decision which are debatable, and a few which seem to be still opened, whereas other things are already in the process of being committed. I'm annoyed by the decision to reuse the same key for a lot of encryption, even if there are discussions about changing the iv in some way. If you have some clever way to build an iv, then please please change the key as well? ISTM that pg at the core level should (only) directly provide: (1) a per-file encryption scheme, with loadable (hook-replaceable functions??) to manage pages, maybe: encrypt(page_id, *key, *clear_page, *encrypted_page); decrypt(page_id, *key, *encrypted_page, *clear_page); What encrypt/decrypt does is beyond pg core stuff. Ok, a reasonable implementation should be provided, obviously, possibly in core. There may be some block-size issues if not full pages are encrypted, so maybe the interface needs to be a little more subtle. (2) offer a key management scheme interface, to manage *per-file* keys, possibly loadable (hook replaceable?). If all files have the same key, which is stored in a directory and encoded with a KEK, this is just one (debatable) implementation choice. For that, ISTM that what is needed at this level is: get_key(file_id (relative name? oid? 8 or 16 bytes something?)); Maybe that should be enough to create a new key implicitely, or maybe it should be a different interface, eg get_new_key(file_id), not sure. There is the question of the key length, which may for instance include an iv. Maybe an information seeking function should be provided, to know about sizes or whatever, not sure. Maybe there should/could be a parameter to manage key versions, not sure. The internal key infrastructure would interact with this/these functions when needed, and store the key in the file management structure. All this should be a black box to core pg, that is the point of an API. This is what I'd have expected to see in the "postgres core key management" stuff, not a hardwired 2-key store with a KEK. (3) ISTM that the key management interface should be external, or at least it should be possible to make it external easily. I do not think that there is a significant performance issue because keys are needed once, and once loaded they are there. A simple way to do that is a separate process with a basic protocol on stdin/stdout to implement "get_key", which is basically already half implemented in the patch for retrieving the KEK. > Perhaps we can have support for an external key store in the future for > the DEKs, but we really need to have our own key management and the > ability to reasonably rotate keys (which is what the KEK provides). I think that the KEK/2-DEK store is a peculiar design which does not *need* to be hardwired deeply in core. > Ideally, we'll get to a point where we can even have multiple DEKs and > deal with rotating them too, but that, if anything, point even stronger > to the need to have a key management system and KEK as we'll be dealing > with that many more keys. > >>> I will say that if the community feels external-only should be the only >>> option, I will stop working on this feature because I feel the result >>> would be too fragile to be reliable, >> >> I'm do not see why it would be the case. I'm just arguing to have key >> management in a separate, possibly suid something-else, process, which given >> the security concerns which dictates the feature looks like a must have, or >> at least must be possible. From a line count point of view, it should be a >> small addition to the current code. > > All of this hand-waving really isn't helping. I'm sorry that you feel that. I'm really trying to help improve the approach and design with my expertise. As I said in another mail, I'm no one, you can always chose to ignore me. > If it's a small addition to the current code then it'd be fantastic if > you'd propose a specific patch which adds what you're suggesting. My point is to discuss an API. Once there is some agreement the possible API outline, then implementations can be provided, but if people do not want an API in the first place, I'm not sure I can see the point of sending patches. > I don't think either Bruce or I would have any issue with others helping > out on this effort, but let's be clear- we need something that *is* part > of core PG, even if we have an ability to have other parts exist outside > of PG. Yep. I have not suggested that PG should not implement the API, on the contrary it definitely should. -- Fabien.
pgsql-hackers by date: