Re: [PATCHES] Post-special page storage TDE support - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: [PATCHES] Post-special page storage TDE support
Date
Msg-id CAOuzzgoiXPTCLmm8O716ek4D3aaeA_HeSbBDRMpaqU=2Z-+=jQ@mail.gmail.com
Whole thread Raw
In response to Re: [PATCHES] Post-special page storage TDE support  (David Christensen <david.christensen@crunchydata.com>)
List pgsql-hackers
Greetings,

On Wed, Nov 8, 2023 at 20:55 David Christensen <david.christensen@crunchydata.com> wrote:
On Wed, Nov 8, 2023 at 8:04 AM Stephen Frost <sfrost@snowman.net> wrote:
* Andres Freund (andres@anarazel.de) wrote:
> On 2023-05-09 17:08:26 -0500, David Christensen wrote:
> > From 965309ea3517fa734c4bc89c144e2031cdf6c0c3 Mon Sep 17 00:00:00 2001
> > From: David Christensen <david@pgguru.net>
> > Date: Tue, 9 May 2023 16:56:15 -0500
> > Subject: [PATCH v4 1/3] Add reserved_page_space to Page structure
> >
> > This space is reserved for extended data on the Page structure which will be ultimately used for
> > encrypted data, extended checksums, and potentially other things.  This data appears at the end of
> > the Page, after any `pd_special` area, and will be calculated at runtime based on specific
> > ControlFile features.
> >
> > No effort is made to ensure this is backwards-compatible with existing clusters for `pg_upgrade`, as
> > we will require logical replication to move data into a cluster with
> > different settings here.
>
> The first part of the last paragraph makes it sound like pg_upgrade won't be
> supported across this commit, rather than just between different settings...

Yeah, that's vague, but you picked up on what I meant.
 
> I think as a whole this is not an insane idea. A few comments:

Thanks for all the feedback!

> - Why is it worth sacrificing space on every page to indicate which features
>   were enabled?  I think there'd need to be some convincing reasons for
>   introducing such overhead.

In conversations with folks (my memory specifically is a discussion with
Peter G, added to CC, and my apologies to Peter if I'm misremembering)
there was a pretty strong push that a page should be able to 'stand
alone' and not depend on something else (eg: pg_control, or whatever) to
provide info needed be able to interpret the page.  For my part, I don't
have a particularly strong feeling on that, but that's what lead to this
design.

Unsurprisingly, I agree that it's useful to keep these features on the page itself; from a forensic standpoint this seems much easier to interpret what is happening here, as well it would allow you to have different features on a given page or type of page depending on need.  The initial patch utilizes pg_control to store the cluster page features, but there's no reason it couldn't be dependent on fork/page type or stored in pg_tablespace to utilize different features.

When it comes to authenticated encryption, it’s also the case that it’s unclear what value the checksum field has, if any…  it’s certainly not directly needed as a checksum, as the auth tag is much better for the purpose of seeing if the page has been changed in some way. It’s also not big enough to serve as an auth tag per NIST guidelines regarding the size of the authenticated data vs. the size of the tag.  Using it to indicate what features are enabled on the page seems pretty useful, as David notes.

Thanks,

Stephen

pgsql-hackers by date:

Previous
From: Crisp Lee
Date:
Subject: Re: make pg_ctl more friendly
Next
From: Peter Smith
Date:
Subject: Re: pg_upgrade and logical replication