Re: Moving forward with TDE [PATCH v3] - Mailing list pgsql-hackers

From David Christensen
Subject Re: Moving forward with TDE [PATCH v3]
Date
Msg-id CAOxo6XJY-F3mh=epBHLSyb9befM6B0PhgxCs5KO9X3C99YAQhQ@mail.gmail.com
Whole thread Raw
In response to Re: Moving forward with TDE [PATCH v3]  (Andres Freund <andres@anarazel.de>)
Responses Re: Moving forward with TDE
List pgsql-hackers
On Tue, Nov 7, 2023 at 5:49 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2023-11-06 09:56:37 -0500, Stephen Frost wrote:
> * Andres Freund (andres@anarazel.de) wrote:
> > I still am quite quite unconvinced that using the LSN as a nonce is a good
> > design decision.
>
> This is a really important part of the overall path to moving this
> forward, so I wanted to jump to it and have a specific discussion
> around this.  I agree that there are downsides to using the LSN, some of
> which we could possibly address (eg: include the timeline ID in the IV),
> but others that would be harder to deal with.

> The question then is- what's the alternative?
>
> One approach would be to change the page format to include space for an
> explicit nonce.  I don't see the community accepting such a new page
> format as the only format we support though as that would mean no
> pg_upgrade support along with wasted space if TDE isn't being used.

Right.

Hmm, if we /were/ to introduce some sort of page format change, Couldn't that be a use case for modifying the pd_version field?  Could v4 pages be read in and written out as v5 pages with different interpretations?
 
> Ideally, we'd instead be able to support multiple page formats where
> users could decide when they create their cluster what features they
> want- and luckily we even have such an effort underway with patches
> posted for review [1].

I think there are some details wrong with that patch - IMO the existing macros
should just continue to work as-is and instead the places that want the more
narrow definition should be moved to the new macros and it changes places that
should continue to use compile time constants - but it doesn't seem like a
fundamentally bad idea to me.  I certainly like it much better than making the
page size runtime configurable.

There had been some discussion about this WRT renaming macros and the like (constants, etc)—I think a new pass eliminating the variable blocksize pieces and seeing if we can minimize churn here is worthwhile, will take a look and see what the minimally-viable set of changes is here.
 
(I'll try to reply with the above points to [1])


> Certainly, with the base page-special-feature patch, we could have an option
> for users to choose that they want a better nonce than the LSN, or we could
> bundle that assumption in with, say, the authenticated-encryption feature
> (if you want authenticated encryption, then you want more from the
> encryption system than the basics, and therefore we presume you also want a
> better nonce than the LSN).

I don't think we should support using the LSN as a nonce if we have an
alternative. The cost and complexity overhead is just not worth it.  Yes,
it'll be harder for users to migrate to encryption, but adding complexity
elsewhere in the system to get an inferior result isn't worth it.

From my read, XTS (which I'd see as inferior to authenticated encryption, but better than some other options) could use LSN as an IV without leakage concerns, perhaps mixing in the BlockNumber as well.  If we are going to allow multiple encryption types, I think we may need to consider that needs for IVs may be different, so this may need to be something that is selectable per encryption type.  

I am unclear how much of a requirement this is, but seems like having a design supporting this to be pluggable—even if a static lookup table internally for encryption type, block length, IV source, etc—seems the most future proof if we had to retire an encryption method or prevent creation of specific methods, say.
 
> Another approach would be a separate fork, but that then has a number of
> downsides too- every write has to touch that too, and a single page of
> nonces would cover a pretty large number of pages also.

Yea, the costs of doing so is nontrivial. If you were trying to implement
encryption on the smgr level - which I doubt you should but am not certain
about - my suggestion would be to interleave pages with metadata like nonces
and AEAD with the data pages. I.e. one metadata page would be followed by
  (BLCKSZ - SizeOfPageHeaderData) / (sizeof(nonce) + sizeof(AEAD))
pages containing actual relation data.  That way you still get decent locality
during scans and writes.

Hmm, this is actually an interesting idea, I will think about this a bit.
 
Relation forks were a mistake, we shouldn't use them in more places.


I think it'd be much better if we also encrypted forks, rather than just the
main fork...

I believe the existing code should just work by modifying the PageNeedsToBeEncrypted macro; I will test that and see if anything blows up.

David

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg_upgrade and logical replication
Next
From: Michael Paquier
Date:
Subject: Re: Remove MSVC scripts from the tree