Re: pgsql: Add pg_alterckey utility to change the cluster key - Mailing list pgsql-committers
From | Fabien COELHO |
---|---|
Subject | Re: pgsql: Add pg_alterckey utility to change the cluster key |
Date | |
Msg-id | alpine.DEB.2.22.394.2012280907150.2094581@pseudo |
In response to | pgsql: Add pg_alterckey utility to change the cluster key (Bruce Momjian <bruce@momjian.us>) |
List | pgsql-committers |
Hello Bruce,

I put the thread back on hackers.

>> The first two keys are stored in pg_cryptokeys/ in the data directory,
>> while the third one is retrieved using a GUC for validation at server
>> startup for the other two.
>>
>> Do we necessarily have to store the first level keys within the data
>> directory? I guess that this choice has been made for performance, but
>> is that really something that a user would want all the time? AES256
>> is the only option available for the data keys. What if somebody wants
>> to roll in their own encryption?
>
> To clarify, we encrypt the data keys using AES256, but the data keys
> themselves can be 128, 192, or 256 bits.
>
>> Companies can have many requirements in terms of accepting the use of
>> one option or another.
>
> I think ultimately we will need three commands to control the keys.
> First, there is the cluster_key_command, which we have now. Second, I
> think we will need an optional command which returns random bytes ---
> this would allow users to get random bytes from a different source than
> that used by the server code.
>
> Third, we will probably need a command that returns the data encryption
> keys directly, either heap/index or WAL keys, probably based on key
> number --- you pass the key number you want, and the command returns the
> data key. There would not be a cluster key in this case, but the
> command could still prompt the user for perhaps a password to the KMS
> server. It could not be used if any of the previous two commands are
> used. I assume an HMAC would still be stored in the pg_cryptokeys
> directory to check that the right key has been returned.

Yep, my point is that it should be possible to have the whole key management outside of postgres.
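To make the quoted idea of a stored HMAC concrete, here is a minimal Python sketch of how a server could check that an external command returned the expected data key without keeping the key itself. The thread does not say what the HMAC would be computed over; the message format, the function names `key_check_value` and `is_expected_key`, and the use of HMAC-SHA256 are all assumptions made for illustration.

```python
import hashlib
import hmac
import os


def key_check_value(data_key: bytes, key_number: int) -> bytes:
    """Return an HMAC-SHA256 check value keyed with the data key itself.

    The message (a fixed label plus the key number) is a hypothetical
    choice; the thread only says that "an HMAC" would be stored.
    """
    message = b"key-check:" + str(key_number).encode("ascii")
    return hmac.new(data_key, message, hashlib.sha256).digest()


def is_expected_key(candidate: bytes, key_number: int, stored_check: bytes) -> bool:
    """Compare in constant time against the check value stored on disk."""
    return hmac.compare_digest(key_check_value(candidate, key_number), stored_check)


# Setup time: os.urandom stands in for the external key command here.
data_key = os.urandom(32)
stored = key_check_value(data_key, 0)   # only this value would be persisted

# Startup time: the command returns a key, and the server validates it.
assert is_expected_key(data_key, 0, stored)
assert not is_expected_key(os.urandom(32), 0, stored)
```

Only the check value needs to live in the data directory in this sketch; a wrong key, or the right key presented under the wrong key number, fails the comparison.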
This said, postgres should provide a reasonable default implementation, obviously, preferably by using the provided mechanism (*NOT* a direct internal implementation and a possible switch to something else, IMHO, because then it would not be tested for whether it provides the right level of usability). I agree that keys need to be identified.

I somewhat disagree with the naming of the script and the implied usage. ISTM that there could be an external interface:

 - to initialize something. It may start a suid process, it may connect to a remote host, it may ask for a master password, who knows?

     /path/to/init --options arguments…

   The init process would return something which would be reused later on, e.g. an authentication token, or maybe a path to a socket for communication, or a file which contains something, or even a master/cluster key, but not necessarily. It may be anything. How it is passed to the next process/connection is an open question. Maybe on its stdin?

 - to start a process (?) which provides keys, either created (new) or existing (get), and possibly destroys them (or not?). The init process result should/could be passed somehow to this process, which may be suid something else. Another option would be to rely on some IPC mechanism. I'm not sure what the best choice is.

   ISTM that this process/connection could/should be persistent, with a simplistic text or binary based client/server interface. What this process/connection does is beyond postgres. In my mind, it could implement getting random data as well. I'd suggest that under no circumstances should the postgres process create cryptographic keys, although it should probably name them, with some predefined length limit.

     /path/to/run --options arguments…

Then there should be a postgres internal interface to store the results for local processing, retrieve them when needed, and so on, ok.

ISTM that there should also be an internal interface to load the cryptographic primitives.
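As an illustration of the "simplistic text based client/server interface" described above, here is a toy Python sketch of the key-providing side. Nothing in it comes from an actual patch: the command words (`new`, `get`, `destroy`, `random`), the `OK`/`ERR` replies, the class name `KeyProvider`, and the limits are all hypothetical, and a real provider would sit behind a socket or pipe rather than a method call.

```python
import secrets

MAX_NAME_LEN = 64            # assumed limit on key names chosen by postgres
MAX_RANDOM_BYTES = 1024      # assumed cap for one "random" request
KEY_BITS = (128, 192, 256)   # data key sizes mentioned in the thread


def _to_int(text: str):
    """Parse a short decimal string, or return None if it is not one."""
    return int(text) if text.isdecimal() and len(text) <= 6 else None


class KeyProvider:
    """In-memory stand-in for an external key management process."""

    def __init__(self) -> None:
        self._keys = {}   # key name -> key bytes, never stored by postgres

    def handle(self, line: str) -> str:
        """Process one request line and return one reply line."""
        parts = line.split()
        if not parts:
            return "ERR syntax"
        cmd, args = parts[0], parts[1:]
        if cmd == "new" and len(args) == 2:
            name, bits = args[0], _to_int(args[1])
            if len(name) > MAX_NAME_LEN:
                return "ERR name-too-long"
            if name in self._keys:
                return "ERR exists"
            if bits not in KEY_BITS:
                return "ERR bad-size"
            self._keys[name] = secrets.token_bytes(bits // 8)
            return "OK " + self._keys[name].hex()
        if cmd == "get" and len(args) == 1:
            key = self._keys.get(args[0])
            return "ERR unknown" if key is None else "OK " + key.hex()
        if cmd == "destroy" and len(args) == 1:
            removed = self._keys.pop(args[0], None)
            return "OK" if removed is not None else "ERR unknown"
        if cmd == "random" and len(args) == 1:
            count = _to_int(args[0])
            if count is None or count > MAX_RANDOM_BYTES:
                return "ERR bad-size"
            return "OK " + secrets.token_hex(count)
        return "ERR syntax"


provider = KeyProvider()
reply = provider.handle("new heap-0 256")
assert provider.handle("get heap-0") == reply
```

In this sketch the caller only ever supplies key names, which matches the idea that postgres names keys but never creates them.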
Possibly a so/dll would do, or maybe just an extension mechanism which would provide the necessary functions, but this raises the issue of bootstrapping, so maybe not so great an idea. The functions should probably be able to implement a counter mode, so that actual keys depend on the page position in the file, but what it really does is not postgres' concern.

A cryptographic concern for me is whether it would be possible to have authentication/integrity checks associated with each page. This means having the ability to reserve some space somewhere, possibly 8-16 bytes, in a page. Different algorithms could have different space requirements.

The same interface should be used by other back-end commands (pg_upgrade, whatever).

Somehow, the design should be abstract, without implying much, so that very different interfaces could be provided in terms of whether there exists a master key, how keys are derived, what key sizes are, what algorithms are used, and so on. Postgres itself should not store keys, only key identifiers.

I'm wondering whether replication should be able to work without some/all keys, so that a streaming replication could be implemented without the remote host being fully aware of the cryptographic keys.

Another functional point is to allow changing the underlying key for a file, and to discuss how this could work with the interface, as I noted that it was a desired feature. I'd suggest that maybe this should be based on changing the name of the "key", so that the external key management would not need to know about it. How to achieve that as a transaction is an open question. Maybe it should be a change outside of postgres, which modifies files at the cluster level with the database stopped.

> I thought we should implement the first command, because it will
> probably be the most common and easiest to use, and then see what people
> want added.
I somewhat disagree: I think that pg should provide from the start the full generic interface, *and* a reasonable implementation, which is what the current proposal does, fine with me. A simplistic test-oriented interface could be implemented in a scripting language.

I think that great care must be put upfront in the overall design, so that it can be reused later on by people with pretty different requirements (in terms of auditors, legal constraints, functions, whatever). I would like to avoid providing a half-baked design which suits some use-cases but cannot be used for others, because of key design choices. From a number of lines of code point of view, it may not change much, really, this is more about design and putting functionalities in the right places.

Now I intend to give some time to review patches with this in mind. Maybe I'll have some time at the end of the next CF, or the next.

-- 
Fabien.
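To illustrate the per-page authentication point raised in this message (reserving 8-16 bytes in each page for an integrity check), here is a minimal Python sketch. It only authenticates a page and does not encrypt it. The layout (tag stored in the last 16 bytes), the truncated HMAC-SHA256 tag, the `file_id`/`page_no` binding, and the names `seal_page` and `open_page` are assumptions for illustration; 8192 bytes is the default PostgreSQL block size.

```python
import hashlib
import hmac

PAGE_SIZE = 8192   # default PostgreSQL block size
TAG_SIZE = 16      # reserved space per page (assumed; the message says 8-16 bytes)


def _tag(key: bytes, file_id: int, page_no: int, payload: bytes) -> bytes:
    """Truncated HMAC-SHA256 over the page position and the payload."""
    position = file_id.to_bytes(8, "big") + page_no.to_bytes(8, "big")
    return hmac.new(key, position + payload, hashlib.sha256).digest()[:TAG_SIZE]


def seal_page(key: bytes, file_id: int, page_no: int, payload: bytes) -> bytes:
    """Append the authentication tag in the reserved space at the page end."""
    if len(payload) != PAGE_SIZE - TAG_SIZE:
        raise ValueError("payload must leave exactly TAG_SIZE bytes free")
    return payload + _tag(key, file_id, page_no, payload)


def open_page(key: bytes, file_id: int, page_no: int, page: bytes) -> bytes:
    """Verify the tag and return the payload, or raise ValueError."""
    if len(page) != PAGE_SIZE:
        raise ValueError("unexpected page size")
    payload, tag = page[:-TAG_SIZE], page[-TAG_SIZE:]
    if not hmac.compare_digest(tag, _tag(key, file_id, page_no, payload)):
        raise ValueError("page authentication failed")
    return payload


key = bytes(range(32))                       # fixed demo key, not a real secret
page = seal_page(key, 1, 7, bytes(PAGE_SIZE - TAG_SIZE))
assert open_page(key, 1, 7, page) == bytes(PAGE_SIZE - TAG_SIZE)
```

Because the position is part of the authenticated data, a valid page copied to a different page number or file would fail verification in this sketch, which is one reason the reserved space matters beyond detecting bit flips.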