Re: [PATCHES] Post-special page storage TDE support - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PATCHES] Post-special page storage TDE support
Date
Msg-id 20231108002011.2c7amddaul7dhkbd@awork3.anarazel.de
In response to Re: [PATCHES] Post-special page storage TDE support  (David Christensen <david.christensen@crunchydata.com>)
Responses Re: [PATCHES] Post-special page storage TDE support  (Stephen Frost <sfrost@snowman.net>)
Re: [PATCHES] Post-special page storage TDE support  (David Christensen <david.christensen@crunchydata.com>)
Re: [PATCHES] Post-special page storage TDE support  (David Christensen <david.christensen@crunchydata.com>)
List pgsql-hackers
Hi,

On 2023-05-09 17:08:26 -0500, David Christensen wrote:
> From 965309ea3517fa734c4bc89c144e2031cdf6c0c3 Mon Sep 17 00:00:00 2001
> From: David Christensen <david@pgguru.net>
> Date: Tue, 9 May 2023 16:56:15 -0500
> Subject: [PATCH v4 1/3] Add reserved_page_space to Page structure
>
> This space is reserved for extended data on the Page structure which will be ultimately used for
> encrypted data, extended checksums, and potentially other things.  This data appears at the end of
> the Page, after any `pd_special` area, and will be calculated at runtime based on specific
> ControlFile features.
>
> No effort is made to ensure this is backwards-compatible with existing clusters for `pg_upgrade`, as
> we will require logical replication to move data into a cluster with
> different settings here.

The first part of the last paragraph makes it sound like pg_upgrade won't be
supported across this commit, rather than just between different settings...

I think as a whole this is not an insane idea. A few comments:

- IMO the patch touches many places it shouldn't need to touch, because of
  essentially renaming a lot of existing macro names to *Limit,
  necessitating modifying a lot of users. I think instead the few places that
  care about the runtime limit should be modified.

  As-is the patch would cause a lot of fallout in extensions that just do
  things like defining an on-stack array of Datums or such - even though all
  they'd need is to change the define to the *Limit one.

  Even leaving extensions aside, it just makes reviewing (and I'm sure
  maintaining) the patch very tedious.


- I'm a bit worried about how the extra special page will be managed - if
  there are multiple features that want to use it, who gets to put their data
  at what offset?

  After writing this I saw that 0002 tries to address this - but I don't like
  the design. It introduces runtime overhead that seems likely to be visible.


- Checking for features using PageGetFeatureOffset() seems the wrong design to
  me - instead of a branch for some feature being disabled, perfectly
  predictable for the CPU, we need to do an external function call every time
  to figure out that yes, checksums are *still* disabled.


- Recomputing offsets every time in PageGetFeatureOffset() seems too
  expensive. The offsets can't change while running, as PageGetFeatureOffset()
  doesn't have enough information to distinguish between different kinds of relations
  - so why do we need to recompute offsets on every single page?  I'd instead
  add a distinct offset variable for each feature.


- Modifying every single PageInit() call doesn't make sense to me. That'll
  just create a lot of breakage for - as far as I can tell - no win.


- Why is it worth sacrificing space on every page to indicate which features
  were enabled?  I think there'd need to be some convincing reasons for
  introducing such overhead.

- Is it really useful to encode the set of features enabled in a cluster with
  a bitmask? That pretty much precludes utilizing extra page space in
  extensions. We could instead just have an extra cluster-wide file that
  defines a mapping of offset to feature.
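
To make the last point more concrete, here is a minimal sketch of an
offset-to-feature mapping that is keyed by name rather than by a fixed bit
position, so an extension could claim space too. Everything in it
(PageSpaceMap, page_space_register(), page_space_lookup(), the entry limit)
is hypothetical and does not exist in the patch or in PostgreSQL. Reading and
writing the cluster-wide file is left out; only the in-memory table is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAGE_SPACE_ENTRIES 8	/* hypothetical limit */
#define PAGE_SPACE_NAME_LEN 32		/* hypothetical, includes the NUL */

/* One user of the reserved page space (all names are hypothetical). */
typedef struct
{
	char		name[PAGE_SPACE_NAME_LEN];
	uint16_t	offset;			/* start within the reserved area */
	uint16_t	length;			/* bytes claimed */
} PageSpaceEntry;

/* In-memory form of the assumed cluster-wide mapping file. */
typedef struct
{
	int			nentries;
	uint16_t	total;			/* bytes reserved so far */
	PageSpaceEntry entries[MAX_PAGE_SPACE_ENTRIES];
} PageSpaceMap;

/* Find an entry by name, or return NULL if nobody registered it. */
const PageSpaceEntry *
page_space_lookup(const PageSpaceMap *map, const char *name)
{
	assert(map != NULL && name != NULL);
	for (int i = 0; i < map->nentries; i++)
	{
		if (strcmp(map->entries[i].name, name) == 0)
			return &map->entries[i];
	}
	return NULL;
}

/* Append a new user; returns its offset, or -1 if it cannot be added. */
int
page_space_register(PageSpaceMap *map, const char *name, uint16_t length)
{
	PageSpaceEntry *e;

	if (map->nentries >= MAX_PAGE_SPACE_ENTRIES ||
		length == 0 ||
		strlen(name) >= PAGE_SPACE_NAME_LEN ||
		length > UINT16_MAX - map->total ||
		page_space_lookup(map, name) != NULL)
		return -1;

	e = &map->entries[map->nentries++];
	strcpy(e->name, name);
	e->offset = map->total;
	e->length = length;
	map->total += length;
	return e->offset;
}
```

Because entries are only ever appended, an offset handed out once stays
valid, and such a table would only need to be consulted at startup, not per
page.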

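For the points about PageGetFeatureOffset(), a rough sketch of the "distinct
offset variable for each feature" alternative follows. The offsets are
computed once at startup and the per-page check becomes a comparison against
a variable in an inline function. The variable names, feature sizes and
init_feature_offsets() are assumptions for illustration only, not code from
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 8192				/* assumed block size */
#define EXT_CHECKSUM_SIZE 8			/* hypothetical feature sizes */
#define ENCRYPTION_TAG_SIZE 16

/*
 * One variable per feature, set once at startup.  Each holds the distance
 * from the end of the page back to the start of the feature's data, so 0
 * can only mean "feature disabled".
 */
uint16_t	ext_checksum_offset = 0;
uint16_t	encryption_tag_offset = 0;
uint16_t	reserved_page_space = 0;

/* Compute all offsets once, e.g. after reading the control file. */
void
init_feature_offsets(bool ext_checksum_enabled, bool encryption_enabled)
{
	uint16_t	used = 0;

	ext_checksum_offset = 0;
	encryption_tag_offset = 0;

	if (ext_checksum_enabled)
	{
		used += EXT_CHECKSUM_SIZE;
		ext_checksum_offset = used;
	}
	if (encryption_enabled)
	{
		used += ENCRYPTION_TAG_SIZE;
		encryption_tag_offset = used;
	}

	assert(used < PAGE_SIZE);
	reserved_page_space = used;
}

/*
 * Per-page access: a branch on a variable that never changes after startup
 * plus pointer arithmetic, with no external function call.
 */
static inline char *
page_get_ext_checksum(char *page)
{
	if (ext_checksum_offset == 0)
		return NULL;			/* feature disabled */
	return page + PAGE_SIZE - ext_checksum_offset;
}
```

With the extended checksum disabled, every call to page_get_ext_checksum()
takes the same early-return branch, which is the predictable case described
above.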

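The first point, about keeping the existing macro names as compile-time upper
bounds and touching only the places that care about the runtime limit, could
be sketched as below. MaxSlotsPerPage, SLOT_SIZE, max_slots_per_page_runtime
and set_reserved_page_space() are made-up stand-ins, not the macros the patch
actually renames.

```c
#include <assert.h>

#define PAGE_SIZE 8192				/* assumed block size */
#define SLOT_SIZE 4					/* hypothetical per-item size */

/* Existing name stays a compile-time upper bound, usable for array sizes. */
#define MaxSlotsPerPage (PAGE_SIZE / SLOT_SIZE)

/* New runtime limit, consulted only where "does it fit?" is decided. */
int			max_slots_per_page_runtime = MaxSlotsPerPage;

/* Called once at startup with the number of reserved bytes per page. */
void
set_reserved_page_space(int reserved)
{
	assert(reserved >= 0 && reserved < PAGE_SIZE);
	max_slots_per_page_runtime = (PAGE_SIZE - reserved) / SLOT_SIZE;
}

/*
 * Extension-style code with an on-stack array.  It keeps compiling without
 * changes, because the array is sized by the compile-time bound, which is
 * never smaller than the runtime limit.
 */
int
sum_slots(const int *slots, int nslots)
{
	int			copy[MaxSlotsPerPage];
	int			sum = 0;

	assert(nslots >= 0 && nslots <= max_slots_per_page_runtime);
	for (int i = 0; i < nslots; i++)
	{
		copy[i] = slots[i];
		sum += copy[i];
	}
	return sum;
}
```

Only code like set_reserved_page_space() and the places that check whether
an item fits would need to know about the runtime value.
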
Greetings,

Andres Freund
