Re: [PATCHES] Post-special page storage TDE support - Mailing list pgsql-hackers

From David Christensen
Subject Re: [PATCHES] Post-special page storage TDE support
Date
Msg-id CAOxo6XL9oL6oU-kAQAbXeDKVk-CEy2kwOvfw5qwaPALksBup+A@mail.gmail.com
In response to Re: [PATCHES] Post-special page storage TDE support  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Fri, Dec 27, 2024 at 1:58 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Fri, Dec 27, 2024 at 12:25:11PM -0500, Greg Sabino Mullane wrote:
> > On Fri, Dec 27, 2024 at 10:12 AM Bruce Momjian <bruce@momjian.us> wrote:
> >
> >     The value of TDE is limited from a security value perspective, but high on
> >     the list of security policy requirements.  Our community is much more
> >     responsive to actual value vs policy compliance value.
> >
> >
> > True. The number of forks, though, makes me feel this is a "when", not "if"
> > feature. Has there been any other complex feature forked/implemented by so
> > many? Maybe columnar storage?
>
> That is a great question.  We have TDE implementations from EDB,
> Fujitsu, Percona, Cybertec, and Crunchy Data, and perhaps others, and
> that is a lot of duplicated effort.
>
> As far as parallels, I think compatibility with Oracle and MSSQL are
> areas that several companies have developed that the community is
> unlikely to ever develop, I think because they are pure compatibility,
> not functionality.  I think TDE having primarily policy compliance value
> also might make it something the community never develops.
>
> I think this blog post is the clearest I have seen about the technical
> value vs. policy compliance value of TDE:
>
>         https://www.percona.com/blog/why-postgresql-needs-transparent-database-encryption-tde/
>
> One possible way TDE could be added to community Postgres is if the code
> changes required were reduced due to an API redesign.

A couple of big pieces here could be modifying the API to add
PreparePageForWrite()/PreparePageFromRead() hooks to transform the
data page just after it is read from disk or just before it is written
to disk.  I think I have a patch version (not yet rebased atop the bulk
read/write API and various incremental backup pieces) which refactors
things somewhat to support that, basically making a single call point
where we can add things like checksums, page encryption, etc.
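
To make the "single call point" idea concrete, here is a minimal
standalone sketch.  It is not the patch mentioned above: only the names
PreparePageForWrite()/PreparePageFromRead() come from this message,
while the signatures, the registration function, the fixed-size
transform table, and the toy XOR transform are assumptions made up for
illustration (the XOR is a stand-in and is not encryption).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE      8192  /* stands in for BLCKSZ (assumption) */
#define SKETCH_MAX_TRANSFORMS 4     /* arbitrary limit for this sketch */

/* Hypothetical transform signature: modify the page buffer in place. */
typedef void (*page_transform_fn)(uint8_t *page, size_t len, uint32_t blkno);

typedef struct PageTransform
{
    page_transform_fn on_write;     /* applied before the page goes to disk */
    page_transform_fn on_read;      /* applied after the page comes from disk */
} PageTransform;

static PageTransform transforms[SKETCH_MAX_TRANSFORMS];
static int n_transforms = 0;

/* Hypothetical registration call; returns 0 on success, -1 if full. */
int
register_page_transform(page_transform_fn on_write, page_transform_fn on_read)
{
    if (n_transforms >= SKETCH_MAX_TRANSFORMS)
        return -1;
    transforms[n_transforms].on_write = on_write;
    transforms[n_transforms].on_read = on_read;
    n_transforms++;
    return 0;
}

/* Single call point before a write: run transforms in registration order. */
void
PreparePageForWrite(uint8_t *page, uint32_t blkno)
{
    assert(page != NULL);
    for (int i = 0; i < n_transforms; i++)
        if (transforms[i].on_write != NULL)
            transforms[i].on_write(page, SKETCH_PAGE_SIZE, blkno);
}

/* Single call point after a read: undo the transforms in reverse order. */
void
PreparePageFromRead(uint8_t *page, uint32_t blkno)
{
    assert(page != NULL);
    for (int i = n_transforms - 1; i >= 0; i--)
        if (transforms[i].on_read != NULL)
            transforms[i].on_read(page, SKETCH_PAGE_SIZE, blkno);
}

/* Toy stand-in for a real transform: XOR with a block-dependent byte. */
void
toy_xor_transform(uint8_t *page, size_t len, uint32_t blkno)
{
    uint8_t key = (uint8_t) (0x5A ^ (blkno & 0xFF));

    for (size_t i = 0; i < len; i++)
        page[i] ^= key;
}
```

Running the read-side transforms in reverse order is a design choice of
this sketch: it lets stacked transforms (say, checksum over encrypted
data) unwind cleanly.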

I think there was also a thread floating around about moving the various
arbitrary read/write file APIs (for temporary files) into a common
API; my recollection is that we use something along the lines of 4 or 5
different file read abstractions in various places in the code
base, so making a common one that could also be hooked would give us
a read/write transient file API that we could then
"do stuff" with.  (My recollection is we could support encryption and
compression, but I don't have the thread in front of me.)
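
A rough sketch of what a hookable transient file API could look like,
written against plain C stdio rather than any PostgreSQL internals.
Every name here (TransientFile, transient_open(), the filter callbacks,
the toy XOR filter) is hypothetical, and the sketch only handles
length-preserving filters; compression would need framing that is not
shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-place, length-preserving filter over a buffer. */
typedef void (*buf_filter_fn)(unsigned char *buf, size_t len);

typedef struct TransientFile
{
    FILE         *fp;       /* anonymous temporary file */
    buf_filter_fn encode;   /* applied to data before writing; may be NULL */
    buf_filter_fn decode;   /* applied to data after reading; may be NULL */
} TransientFile;

/* Open a temporary file with optional hooks; returns 0 on success. */
int
transient_open(TransientFile *tf, buf_filter_fn encode, buf_filter_fn decode)
{
    tf->fp = tmpfile();
    tf->encode = encode;
    tf->decode = decode;
    return tf->fp != NULL ? 0 : -1;
}

/* Write through the encode hook without modifying the caller's buffer. */
size_t
transient_write(TransientFile *tf, const unsigned char *data, size_t len)
{
    unsigned char chunk[4096];
    size_t        done = 0;

    while (done < len)
    {
        size_t n = len - done;

        if (n > sizeof(chunk))
            n = sizeof(chunk);
        memcpy(chunk, data + done, n);
        if (tf->encode != NULL)
            tf->encode(chunk, n);
        if (fwrite(chunk, 1, n, tf->fp) != n)
            break;
        done += n;
    }
    return done;
}

/* Read into the caller's buffer, then apply the decode hook in place. */
size_t
transient_read(TransientFile *tf, unsigned char *out, size_t len)
{
    size_t n = fread(out, 1, len, tf->fp);

    if (tf->decode != NULL)
        tf->decode(out, n);
    return n;
}

/* Reposition to the start; also needed when switching write -> read. */
int
transient_rewind(TransientFile *tf)
{
    return fseek(tf->fp, 0L, SEEK_SET);
}

void
transient_close(TransientFile *tf)
{
    if (tf->fp != NULL)
    {
        fclose(tf->fp);
        tf->fp = NULL;
    }
}

/* Toy stand-in for a real filter; XOR with a constant is NOT encryption. */
void
toy_xor_filter(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= 0xA5;
}
```

With every temporary-file user going through one such API, a hook only
has to be installed in one place instead of in each of the separate
abstractions.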

Some form of early init pluggability would be required, since we'd
need to support reading encrypted WAL before full startup is
accomplished.  This seems harder, at least given the bits I was
originally plugging into.

Obviously existing forks would want to support reading their existing
clusters, so I'm not sure there is an all-in-one solution here.

Just some musing here...

David


