Re: Proposed patch for key managment - Mailing list pgsql-hackers
From           | Alastair Turner
Subject        | Re: Proposed patch for key managment
Date           |
Msg-id         | CAC0GmyzJNXRZ5MQJYq24EiUarP+xN9Q19_cknkHmmYU=CTzHrQ@mail.gmail.com
In response to | Re: Proposed patch for key managment (Stephen Frost <sfrost@snowman.net>)
Responses      | Re: Proposed patch for key managment
List           | pgsql-hackers
Hi Stephen

On Fri, 18 Dec 2020 at 21:36, Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings Alastair,
>
> * Alastair Turner (minion@decodable.me) wrote:
> > On Wed, 16 Dec 2020 at 22:43, Stephen Frost <sfrost@snowman.net> wrote:
...
> > passphrase key wrapper, the secret store and the cloud/HW KMS.
> >
> > Since the examples expand the purpose of cluster_passphrase_command,
> > let's call it cluster_key_challenge_command in the examples.
>
> These ideas don't seem too bad but I'd think we'd pass which key we want
> on the command line using a %i or something like that, rather than using
> stdin, unless there's some reason that would be an issue..?  Certainly
> the CLI utilities I've seen tend to expect the name of the secret that
> you're asking for to be passed on the command line.

Agreed. I was working on the assumption that the calling process (initdb
or pg_ctl) would have access to the challenge material and be passing it
to the utility, so putting it on a command line would allow it to leak.
If the utility is accessing the challenge material directly, then just an
indicator of which key to work with would be a lot simpler, true.

> > > Starting with the passphrase key wrapper, since it's what's in place now.
> > >
> > > - cluster_key_challenge_command = 'password_key_wrapper %q'
>
> I do tend to like this idea of having what
> cluster_key_challenge_command, or whatever we call it, expects be an
> actual key, and have the command that is run be a separate command that
> takes the passphrase and runs the KDF (key derivation function) on it,
> when a passphrase is what the user wishes to use.
>
> That generally makes more sense to me than having the key derivation
> effort built into the backend, which I have a hard time seeing any
> particular reason for, as long as we're already calling out to some
> external utility of some kind to get the key.
>
...
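The "separate command that runs the KDF" idea can be sketched as a small stand-alone program. This is a minimal illustration, not the patch's implementation: the function name, the choice of PBKDF2-HMAC-SHA256, the iteration count, and the use of a cluster identifier as salt are all assumptions for the sketch.

```python
# Sketch of a hypothetical passphrase wrapper: it runs the KDF outside
# the backend and emits the derived key on stdout, so PostgreSQL only
# ever sees an actual key. All parameters here are illustrative.
import binascii
import hashlib

def derive_key(passphrase: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key from a passphrase via PBKDF2 (assumed scheme)."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt,
                               iterations, dklen=32)

# A stable per-cluster salt could come from the system identifier
# (hypothetical value below).
salt = b"PG_SYSTEM_ID-main"
key = derive_key("correct horse battery staple", salt)
print(binascii.hexlify(key).decode())  # key on stdout, as discussed above
```

Keeping this as an external command means the backend's interface stays "give me a key", and whether that key came from a passphrase, a vault, or hardware is the wrapper's concern.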
> With the thought of trying to keep it a reasonably simple interface, I
> had a thought along these lines:
>
> - Separate command that runs the KDF, this is simple enough as it's
>   really just a hash, and it returns the key on stdout.
> - initdb time option that says if we're going to have PG manage the
>   sub-keys, or not.
> - cluster_key_command defined as "a command that is passed the ID of
>   the key, or keys, required for the cluster to start"
>
> initdb --pg-subkeys
> - Calls cluster_key_command once with "$PG_SYSTEM_ID-main" or similar
>   and expects the main key to be provided on stdout.  Everything else
>   is then more-or-less as is today: PG generates DEK sub-keys for data
>   and WAL and then encrypts them with the main key and stores them.
>
> As long as the enveloped keys file exists on the filesystem, when PG
> starts, it'll call the cluster_key_command and will expect the 'main'
> key to be provided and it'll then decrypt and verify the DEK sub-keys,
> very similar to today.
>
> In this scenario, cluster_key_command might be set to call a command
> which accepts a passphrase and runs a KDF on it, or it might be set to
> call out to an external vaulting system or to a Yubikey, etc.
>
> initdb --no-pg-subkeys
> - Calls cluster_key_command for each of the sub-keys that "pg-subkeys"
>   would normally generate itself, passing "$PG_SYSTEM_ID-keyid" for
>   each (eg: $PG_SYSTEM_ID-data, $PG_SYSTEM_ID-wal), and getting back
>   the keys on stdout to use.
>
> When PG starts, it sees that the enveloped keys file doesn't exist and
> does the same as initdb did: calls cluster_key_command multiple times to
> get the keys which are needed.  We'd want to make sure to have a way
> early on that checks that the data DEK provided actually decrypts the
> cluster, and bail out otherwise, before actually writing any data with
> that key.
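The --no-pg-subkeys flow above, where cluster_key_command is invoked once per sub-key ID, can be sketched as follows. The HMAC-based derivation from a single master secret is purely illustrative (a real command would fetch each key from a vault, HSM, or similar), and the system identifier value is hypothetical.

```python
# Sketch: PostgreSQL calls cluster_key_command once per sub-key,
# passing an ID such as "<system_id>-data" or "<system_id>-wal", and
# reads the corresponding key from stdout. The derivation below is an
# illustrative stand-in for an external secret store.
import hashlib
import hmac

MASTER_SECRET = b"example-master-secret"  # stand-in for an external vault

def cluster_key_for(key_id: str) -> bytes:
    """Return a distinct, stable 32-byte key for the requested ID."""
    return hmac.new(MASTER_SECRET, key_id.encode(), hashlib.sha256).digest()

system_id = "7021379183941541138"  # hypothetical cluster system identifier
for suffix in ("data", "wal"):
    key = cluster_key_for(f"{system_id}-{suffix}")
    print(f"{system_id}-{suffix}: {key.hex()}")
```

The point of the ID argument is that one configured command can serve every sub-key request, which is what makes independent WAL and data keys practical.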
> I'll note though that this approach would actually allow you to change
> the WAL key, if you wanted to, which could certainly be nice (naturally
> there would be various caveats about doing so and that replicas would
> have to also be switched to the new key, etc, but that all seems
> reasonably solvable).  Having a stand-alone utility that could do that
> for the --pg-subkeys case would be useful too (and just generally
> decrypt it for viewing/backup/replacement/etc).
>
> Down the line this might even allow us to do an online re-keying, at
> least once the challenges around enabling online data checksums are
> sorted out, but I don't think that's something to worry about today.
> Still, it seems like this potentially provides a pretty clean interface
> for that to happen eventually.
>

Changing the WAL key independently of the local storage key is the big
operational advantage I see of managing the two keys separately. It lays
the basis for a full key rotation story. Rotating WAL keys during
switchover between a pair, with the new key applicable from a certain
timeline number, may be close enough to online to satisfy most
requirements. Not something to worry about today, as you say.

> Right, I think we can agree on this aspect and I've chatted with Bruce
> about it some also.  When a direct key can be provided, we should use
> that, and not run it through a KDF.  This seems like a particularly
> important case that we should care about even at this early stage.
>

Thanks for following it up.

Thanks,
Alastair
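The "check that the data DEK provided actually decrypts the cluster, and bail out otherwise" step could be done by storing a keyed fingerprint at initdb time and verifying any later-supplied key against it before use. This is one possible mechanism, sketched here as an assumption; it is not the scheme from the patch under discussion.

```python
# Sketch: verify a candidate data key against a fingerprint stored at
# initdb time, before writing anything with it. The fingerprint scheme
# (HMAC over a fixed label) is an illustrative assumption.
import hashlib
import hmac
import os

def fingerprint(key: bytes) -> bytes:
    # HMAC the key over a fixed label, so the stored value does not
    # reveal the key itself.
    return hmac.new(key, b"PG-DEK-verification", hashlib.sha256).digest()

def key_matches(candidate: bytes, stored_fp: bytes) -> bool:
    # Constant-time comparison to avoid leaking match position.
    return hmac.compare_digest(fingerprint(candidate), stored_fp)

data_key = os.urandom(32)
stored = fingerprint(data_key)       # persisted when the cluster is created
assert key_matches(data_key, stored)            # correct key accepted
assert not key_matches(os.urandom(32), stored)  # wrong key rejected: bail out
```

Failing fast here matters most in the --no-pg-subkeys case, where there is no enveloped keys file whose decryption would otherwise catch a wrong key.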