Re: Key management with tests - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Key management with tests
Date
Msg-id CA+TgmoaiKL6q4mAXtJWjAOt-NGyF39_q08ugRk9HXdvsoprPnQ@mail.gmail.com
Whole thread Raw
In response to Re: Key management with tests  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Key management with tests  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Tue, Jan 26, 2021 at 11:15 AM Bruce Momjian <bruce@momjian.us> wrote:
> This version fixes OpenSSL detection and improves docs for initdb
> interactions.

Hi,

I'm wondering whether you've considered storing all the keys in one
file instead of a file per key. The reason I ask is that it seems to
me that the key rotation procedure would be a lot simpler if it were
all in one file. You could just create a temporary file and atomically
rename it over the existing file. If you see a temporary file you're
always free to remove it. This is a lot simpler than what you have
right now. The "repair" concept pretty much goes away completely,
which seems nice. Granted I don't know exactly how to store multiple
keys in one file, but I bet there's some way to do it.

The way in which you are posting these patches is quite unlike what
most people do when posting patches to this list. You seem to have
generated a bunch of patches using 'git format-patch' but then
concatenated them all together in a single file. It would be helpful
if you could do this more like the way that is now standard on this
list. Not only that, but the patches don't have meaningful commit
messages in them, and don't seem to be meaningfully split for easy
review. They just say things like 'crypto squash commit'. Compare this
to for example what I did on the "cleaning up a few CLOG-related
things" thread where the commits appear in a logical sequence, and
each one has a meaningful commit message. Or here's an example from
someone else --
http://postgr.es/m/be72abfa-e62e-eb81-4e70-1b57fe6dc9e2@amazon.com --
and note the inclusion of authorship information in the commit
messages, so that the source of the code can be easily understood.

The README in src/backend/crypto does not explain how the scripts in
that directory are intended to be used. If I want to use AWS Secrets
Manager with this feature, I can see that I should use
ckey_aws.sh.sample as a basis for that integration, but I don't know
what I should do with the script because the README says nothing about
it. I am frankly pretty doubtful about the idea of shipping a bunch of
/bin/sh scripts as a best practice; for one thing, that's totally
unusable on Windows, and it also means that this is dependent on
/bin/sh existing and having the behavior we expect and on all the
other executables in these scripts as well. But, at the very least,
there needs to be a clearer explanation of how the scripts are
intended to be used, which parts people are supposed to modify, what
arguments they're going to get called with, and things like that.

The comments in cipher.c and cipher_openssl.c could be improved to
explain that they are alternatives to each other. Perhaps the former
could be renamed to something like cipher_failure.c or cipher_noimpl.c
for clarity.

I believe that a StaticAssertStmt could be used to check the length of
the encryption_methods[] array, so that if someone changes
NUM_ENCRYPTION_METHODS without updating the array, compilation fails.
See UserAuthName[] for an example of how to do this.

You seem to have omitted to update the documentation with the names of
the new wait events that you added.

In process_postgres_switches(), when there's a multi-line comment
followed by a single line of actual code, I prefer to include braces
around the whole thing. There might be some disagreement on what is
best here, though.

What are the consequences of the placement of the code in
PostgresMain() for processes other than user backends and walsenders?
I think that the way you have it, background workers would not have
access to keys, nor auxiliary processes like the checkpointer ... at
least in the EXEC_BACKEND case. In the non-EXEC_BACKEND case you have
the postmaster doing it, so then I'm not sure why it has to be redone
for every backend. Won't they just inherit the data from the
postmaster? Has this code been meaningfully tested on Windows? How do
we know that it works? Maybe we need to think about adding some
asserts that guarantee that any process that attempts to access a
buffer has the key manager initialized; I bet such assertions would
fail at least on Windows as the code looks now.

I don't think it makes sense to think about committing this to v14. I
believe it only makes sense if we have a TDE patch that is relatively
close to committable that can be used with it. I also don't think that
this patch is in good enough shape to commit yet in terms of where
it's at in terms of quality; I think it needs more review first,
hopefully including review from people who can comment intelligently
specifically on the cryptography aspects of it. However, the
challenges don't seem insurmountable. There's also still some question
in my mind about whether the design choices here (one KEK, 2 DEKs, one
for data and one for WAL) have enough consensus. I don't have a
considered opinion on that, partly because I'm not quite sure what the
reasonable alternatives are, but it seems that other people had some
questions about it, IIUC.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: strange error reporting
Next
From: Justin Pryzby
Date:
Subject: Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal