Re: Proposed patch for key managment - Mailing list pgsql-hackers

From Alastair Turner
Subject Re: Proposed patch for key managment
Date
Msg-id CAC0GmyyVHstcMefpbCqXC25_Xg4V=Dy-q1ps1a06j9xUOL0SzQ@mail.gmail.com
In response to Re: Proposed patch for key managment  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Proposed patch for key managment
List pgsql-hackers
On Wed, 16 Dec 2020 at 22:43, Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
...
>
> If I'm following, you're suggesting something like:
>
> cluster_passphrase_command = 'aws get %q'
>
> and then '%q' gets replaced with "Please provide the WAL DEK: ", or
> something like that?  Prompting the user for each key?  Not sure how
> well that'd work if we want to automate everything though.
>
> If you have specific ideas, it'd really be helpful to give examples of
> what you're thinking.

I can think of three specific ideas off the top of my head: the
passphrase key wrapper, the secret store and the cloud/HW KMS.

Since the examples expand the purpose of cluster_passphrase_command,
let's call it cluster_key_challenge_command in the examples.

Starting with the passphrase key wrapper, since it's what's in place now.

 - cluster_key_challenge_command = 'password_key_wrapper %q'
 - Supplied on stdin to the process is the wrapped DEK (either a
combined key for db files and WAL or one for each, on separate calls)
 - %q is "Please provide WAL key wrapper password" or just "...provide
key wrapper password"
 - Returned on stdout is the unwrapped DEK

For a secret store

 - cluster_key_challenge_command = 'vault_key_fetch'
 - Supplied on stdin to the process is the secret's identifier (pg_dek_xxUUIDxx)
 - Returned on stdout is the DEK, which may never have been wrapped,
depending on implementation
 - Access control to the secret store is managed through the challenge
command's own config, certs, HBA, ...
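
A minimal sketch of the secret store flow above, assuming the command
reads the identifier from stdin and writes the DEK hex-encoded to
stdout. The hex encoding, the in-memory FAKE_STORE, the identifier
"pg_dek_example" and the function names are all placeholders of mine
and not part of the patch; a real vault_key_fetch would call the
store's own client where the dictionary lookup is.

```python
import io

# Stand-in for a real secret store; the identifier and key bytes are made up.
FAKE_STORE = {
    "pg_dek_example": bytes(range(32)),
}


def fetch_dek(secret_id, store):
    """Return the DEK stored under secret_id, or raise KeyError."""
    return store[secret_id]


def main(stdin, stdout, store=FAKE_STORE):
    """Read the secret's identifier from stdin, write the hex DEK to stdout.

    Returns a process exit status: 0 on success, 1 if the secret is unknown.
    """
    secret_id = stdin.read().strip()
    try:
        dek = fetch_dek(secret_id, store)
    except KeyError:
        return 1  # nothing is written to stdout, so the caller gets no key
    stdout.write(dek.hex() + "\n")
    return 0
```

As a script this would end with sys.exit(main(sys.stdin, sys.stdout)),
so that a failed lookup reaches the server as a non-zero exit status.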

For an HSM or cloud KMS

 - cluster_key_challenge_command = 'gcp_kms_key_fetch'
 - Supplied on stdin to the process is the wrapped DEK (individual
or combined)
 - Returned on stdout is the DEK (individual or combined)
 - Access control to the KMS is managed through the challenge
command's own config, certs, HBA, ...

The secret store and HSM/KMS options may be promptless, and therefore
amenable to automation, depending on the setup of those clients.
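
All three examples share one calling convention: substitute %q, feed
something on stdin, read the DEK from stdout. Here is a rough sketch
of the calling side, to show that the server need not know which kind
of command it is talking to. The function name, the timeout and the
use of a shell are my assumptions, not what the patch does.

```python
import shlex
import subprocess


def run_key_challenge(command_template, prompt, stdin_payload, timeout=60):
    """Run the challenge command and return the bytes it writes to stdout.

    stdin_payload is the wrapped DEK or the secret's identifier, depending
    on which kind of command is configured.
    """
    # Quote the prompt so spaces and shell metacharacters stay one argument.
    command = command_template.replace("%q", shlex.quote(prompt))
    result = subprocess.run(
        command,
        shell=True,
        input=stdin_payload,
        stdout=subprocess.PIPE,  # stderr is left alone so prompts stay visible
        timeout=timeout,
        check=True,  # a non-zero exit status raises CalledProcessError
    )
    return result.stdout
```

A promptless command simply has no %q in its template, in which case
the substitution is a no-op.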

>
...
>
> > > ...That
> > > avoids the complication of having to have an API that somehow provides
> > > more than one key, while also using the primary DEK key as-is from the
> > > key management service and the KEK never being seen on the system where
> > > PG is running.
> >
> > Other than calling out (and therefore potentially prompting) twice,
> > what do you see as the complications of having more than one key?
>
> Mainly just a concern about the API and about what happens if, say, we
> decide that we want to have another sub-key, for example.  If we're
> handling them then there's really no issue- we just add another key in,
> but if that's not happening then it's going to mean changes for
> administrators.  If there's a good justification for it, then perhaps
> that's alright, but hand waving at what the issue is doesn't really
> help.
>

Sorry, I wasn't trying to hand wave it away. For automated
interactions, like big iron HSMs or cloud KMSs, the difference between
2 operations and 10 operations to start a DB server won't matter. For
an admin/operator having to type 10 passwords or get 10 clean
thumbprint scans, it would be horrible. My underlying question was: is
that toil the only problem to be solved, or is there another reason to
get into key combination, key splitting and the related issues, which
are less documented and less well understood than key wrapping?

>
...
> >
> > I'd describe what the current patch does as using YubiKey to encrypt
> > and then decrypt an intermediate secret, which is then used to
> > generate/derive a KEK, which is then used to unwrap the stored,
> > wrapped DEK.
>
> This seems like a crux of at least one concern- that the current patch
> is deriving the actual KEK from the passphrase and not just taking the
> provided value (at least, that's what it looks like from a *very* quick
> look into it), and that's the part that I was suggesting that we might
> add an option for- to indicate if the cluster passphrase command is
> actually returning a passphrase which should be used to derive a key, or
> if it's returning a key directly to be used.  That does seem to be a
> material difference to me and one which we should care about.
>

Yes. Caring about that is the reason I've been making a nuisance of myself.
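
To make the difference concrete, here is a sketch of the two
behaviours behind such an option. The returns_key flag, the hex
encoding of a directly returned key and the 256-bit KEK size are
assumptions for illustration, and PBKDF2 is used only because it is in
the Python standard library; I am not claiming it is the derivation
the current patch uses.

```python
import hashlib

KEK_BYTES = 32  # assumed 256-bit KEK


def kek_from_command_output(output, returns_key, salt, iterations=600_000):
    """Turn the command's stdout into a KEK.

    returns_key=True:  output is the hex-encoded key itself, used as-is.
    returns_key=False: output is a passphrase, stretched into a key.
    """
    text = output.strip()
    if returns_key:
        kek = bytes.fromhex(text.decode("ascii"))
        if len(kek) != KEK_BYTES:
            raise ValueError("expected a %d-byte key" % KEK_BYTES)
        return kek
    # Passphrase mode: the key never exists outside this process.
    return hashlib.pbkdf2_hmac("sha256", text, salt, iterations, dklen=KEK_BYTES)
```

In the first mode the bytes from the command are the KEK; in the
second they only seed it, so a key held in an HSM or KMS could never
be used directly.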

> > > There's an entirely independent discussion to be had about figuring out
> > > a way to actually off-load *entirely* the encryption/decryption of data
> > > to the linux crypto system or hardware devices, but unless someone can
> > > step up and write code for those today, I'd suggest that we table that
> > > effort until after we get this initial capability of having TDE with PG
> > > doing all of the encryption/decryption.
> >
> > I'm hopeful that the work on abstracting OpenSSL, nsstls, etc is going
> > to help in this direction.
>
> Yes, I agree with that general idea but it's a 'next step' kind of
> thing, not something we need to try and bake in today.
>

Agreed.

Thanks
Alastair


