Re: Key management with tests - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Key management with tests
Date
Msg-id 20210118232119.GB29673@momjian.us
Whole thread Raw
In response to Re: Key management with tests  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Jan 18, 2021 at 04:38:47PM -0500, Robert Haas wrote:
> To me, it wouldn't make sense to commit a full README for a TDE
> feature that we don't have yet with a key management patch, but the
> way that they'll interact with each other has to be clear. The
> doc/database-encryption.sgml file that Bruce included in the patch is
> a decent start on explaining the design, though I think it needs more
> work and more details, perhaps including some of the things Andres
> mentioned.

Sure.

> To be honest, after reading over that SGML documentation a bit, I'm
> somewhat skeptical about whether it really makes sense to think about
> committing the key management part separately. It seems to have no use
> independent of the main feature, and it in fact embeds very specific

For usefulness, it does enable passphrase prompting for the TLS private
key.

> details of how the main feature is expected to work. For example, the
> documentation says that key #0 will be used for data files, and key #1
> for WAL. There seems to be no suggestion that the key management
> portion of this can be used to manage encryption keys generally for
> whatever purposes someone might have; it's all about the needs of a
> particular TDE implementation. Typically, we would not commit

We originally were going to have SQL-level keys, but many felt they
weren't useful.

> something like that separately, or only once the main patch was done,
> with the two commits occurring in a relatively short time period.
> Otherwise, as Bruce already noted, we can end up with something that
> is documented and visible to users but doesn't actually work yet.

Yep, that is the risk.

> Some more specific comments on data-encryption.sgml:
> 
> * The documentation explains that the purpose of having a WAL key
> separate from the data file key is so that the data file keys can
> "eventually" be rotated. It's not clear whether this means that we
> might eventually have that feature or that we might eventually be able
> to rotate, after failing over. If this kind of thing is possible,
> we'll eventually need documentation on how to do it.

I have clarified that saying "future release".

> * The reasons for use a DEK and a KEK are not explained. I realize
> it's not an uncommon practice and that other systems do it, but I
> think a few sentences of explanation wouldn't be a bad idea. Even if
> we are supposing that hackers who want to have input into this feature
> have to be knowledgeable about cryptography, I don't think we can
> reasonably suppose that for users.

I added a little about that in the docs.

> * "For example" is at one point followed by a period rather than a
> colon or comma.

Fixed.

> * In the "Internals" subsection, the last sentence doesn't seem to be
> grammatical. I wonder if it's missing the word "or"'.

Fixed.

> * The part about integrity-checking keys on startup isn't clear. It
> makes it sound like we still have a copy of the KEK lying around
> someplace against which we can compare, which I assume is not the case
> since it would be really insecure.

I rewored that entire section.  See if it is better now.

> * I think it's going to be pretty important that we can easily switch
> to other cryptographic algorithms as they are discovered, so I don't
> like the fact that this is tied specifically to AES. (In fact,
> kmgr_utils.h makes it sound like we're specifically locked into
> AES256, but that contradicts the documentation, so I guess there's
> some clarification needed here about what exactly KMGR_CLUSTER_KEY_LEN
> is doing.) As far as possible we should try to make this generic, like
> supporting any cipher that SSL has which has property X. It seems
> relatively inevitable that every currently popular cryptographic
> algorithm will at some point in the future be judged weak and
> worthless, just as has already happened with MD5 and some variants of
> SHA, both of which used to be considered state of the art. It seems
> equally inevitable that new and stronger algorithms will continued to
> be devised, and we'll want to adopt those easily.

That is a nifty idea.  Right now I just pass the integer length around,
and store it in pg_control, but if we define macros, we can easily
abstract this and easily allow for new methods.  If others like that, I
will start on it now.

> I'm not sure to what extent this a serious flaw in the patch and to
> what extent it's just a matter of tweaking the wording of some things,
> but I think this is actually an extremely critical design point where
> we had better be certain we've got it right. Few things would be
> sadder than to get a TDE feature and then have to rip it out again
> because it couldn't be upgraded to work with newer crypto algorithms
> with reasonable effort.

Yep.

> Notes on other parts of the documentation:
> 
> * The documentation for initdb -K doesn't list the valid values of the
> parameter, only the default. Probably we should be specifying an

Fixed.

> algorithm here and not just a bit count. Otherwise, like I say above,
> what happens when AES gives way to something else? It'd be easy to say
> -K BFT256 instead of -K AES256, but if AES is assumed and it's no
> longer what we want them we have problems. This kind of thing probably
> needs to be cleaned up in a bunch of places.

Again, I can do that if people like it.

> * I don't see the point of saying "a passphrase or PIN." We don't need
> to document that your passphrase might happen to only contain digits.

Well, PIN is what the Yubikey and PIV devices call it, so I thought I
should give specific examples of inputs.

> * pg_alterckey's description of "repair" is hard to understand. It
> doesn't really explain why or how this would be necessary, and it begs
> the question of why we'd ever leave things in a state that requires
> repair. This is sketched out in code comments elsewhere, but I think
> at least some of it needs to be explained in the documentation as
> well. (Incidentally, I don't think the comments at the top of
> recover_failure will survive a visit from pgindent, though I might be
> wrong about that.)

Fixed with rewording.  Better?

> * The changes to config.sgml say "Sample script" instead of "Sample scripts".

Fixed.

> * I don't think that the documentation of %R is very clear, or
> adequate for someone to make effective use of it. If I wanted to use
> %R, how would I ensure that a value is available?

Fixed, use -R on server start.

> * The changes to allfiles.sgml add pg_alterckey.sgml in the wrong
> place and include an incorrect whitespace change.

Uh, the whitespace change was to align the column.  I will review and
push that separately.

> * It's odd that "pg_alterckey" describes itself as "technically"
> changing the KEK. Isn't that just what it does, not a technicality? I
> imagine we'll ultimately need a way to change a DEK as well, because
> otherwise the use of a separate key for the WAL wouldn't accomplish
> the intended goal.

"technically" removed.  I kind of wanted to say "in detail" or something
like that, but removing the word is fine.  Change-only patch attached so
you can see the changes more easily.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee


Attachment

pgsql-hackers by date:

Previous
From: japin
Date:
Subject: Re: NOT VALID for Unique Indexes
Next
From: David Fetter
Date:
Subject: Re: popcount