Thread: Moving forward with TDE

Moving forward with TDE

From

David Christensen

Date:

24 October 2022, 16:29:19

Hi -hackers,

Working with Stephen, I am attempting to pick up some of the work that
was left off with TDE and the key management infrastructure.  I have
rebased Bruce's KMS/TDE patches as they existed on the
https://wiki.postgresql.org/wiki/Transparent_Data_Encryption wiki
page, which are enclosed in this email.

I would love to open a discussion about how to move forward and get
some of these features built out.  The historical threads here are
quite long and complicated; is there a "current state" other than the
wiki that reflects the general thinking on this feature?  Any major
developments in direction that would not be reflected in the code from
May 2021?

Thanks,

David

Attachment

Re: Moving forward with TDE

From

Aleksander Alekseev

Date:

03 November 2022, 14:09:00

Hi David,

> Working with Stephen, I am attempting to pick up some of the work that
> was left off with TDE and the key management infrastructure.  I have
> rebased Bruce's KMS/TDE patches as they existed on the
> https://wiki.postgresql.org/wiki/Transparent_Data_Encryption wiki
> page, which are enclosed in this email.

I'm happy to see that the TDE effort was picked up.

> I would love to open a discussion about how to move forward and get
> some of these features built out.  The historical threads here are
> quite long and complicated; is there a "current state" other than the
> wiki that reflects the general thinking on this feature?  Any major
> developments in direction that would not be reflected in the code from
> May 2021?

The patches seem to be well documented and decomposed in small pieces.
That's good.

Unless somebody in the community remembers open questions/issues with
TDE that were never addressed I suggest simply iterating with our
usual testing/reviewing process. For now I'm going to change the
status of the CF entry [1] to "Waiting for Author" since the patchset
doesn't pass the CI [2].

One limitation of the design described on the wiki I see is that it
seems to heavily rely on AES:

> We will use Advanced Encryption Standard (AES) [4]. We will offer three
> key length options (128, 192, and 256-bits) selected at initdb time with
> --file-encryption-method

(there doesn't seem to be any mention of the hash/MAC algorithms,
that's odd). In the future we should be able to add the support of
alternative algorithms. The reason is that the algorithms can become
weak every 20 years or so, and the preferred algorithms may also
depend on the region. This should NOT be implemented in this
particular patchset, but the design shouldn't prevent us from
implementing this in the future.

[1]: https://commitfest.postgresql.org/40/3985/
[2]: http://cfbot.cputube.org/

-- 
Best regards,
Aleksander Alekseev

Re: Moving forward with TDE

From

David Christensen

Date:

03 November 2022, 22:06:23

> Unless somebody in the community remembers open questions/issues with
> TDE that were never addressed I suggest simply iterating with our
> usual testing/reviewing process. For now I'm going to change the
> status of the CF entry [1] to "Waiting for Author" since the patchset
> doesn't pass the CI [2].

Thanks, enclosed is a new version that is rebased on HEAD and fixes a
bug that the new pg_control_init() test picked up.

One known issue (just discovered by me while testing the latest revision)
is that databases created from `template0` are not decrypting properly,
but `template1` works fine, so I am going to dig in on that soon.

> One limitation of the design described on the wiki I see is that it
> seems to heavily rely on AES:
>
> > We will use Advanced Encryption Standard (AES) [4]. We will offer three
> > key length options (128, 192, and 256-bits) selected at initdb time with
> > --file-encryption-method
>
> (there doesn't seem to be any mention of the hash/MAC algorithms,
> that's odd). In the future we should be able to add the support of
> alternative algorithms. The reason is that the algorithms can become
> weak every 20 years or so, and the preferred algorithms may also
> depend on the region. This should NOT be implemented in this
> particular patchset, but the design shouldn't prevent from
> implementing this in the future.

Yes, we definitely are considering multiple algorithms support as part
of this effort.

Best,

David

Attachment

Re: Moving forward with TDE

From

Dilip Kumar

Date:

04 November 2022, 08:42:19

On Fri, Nov 4, 2022 at 3:36 AM David Christensen
<david.christensen@crunchydata.com> wrote:
>
> > Unless somebody in the community remembers open questions/issues with
> > TDE that were never addressed I suggest simply iterating with our
> > usual testing/reviewing process. For now I'm going to change the
> > status of the CF entry [1] to "Waiting for Author" since the patchset
> > doesn't pass the CI [2].
>
> Thanks, enclosed is a new version that is rebased on HEAD and fixes a
> bug that the new pg_control_init() test picked up.

I was looking into the documentation patches 0001 and 0002, I think
the explanation is very clear.  I have a few questions/comments

+By not using the database id in the IV, CREATE DATABASE can copy the
+heap/index files from the old database to a new one without
+decryption/encryption.  Both page copies are valid.  Once a database
+changes its pages, it gets new LSNs, and hence new IV.

How about the WAL_LOG method for creating a database? With that method
we get new LSNs for the pages in the new database, so do we re-encrypt?
If yes, then this documentation needs to be updated; otherwise we might
need to add that code.
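
To make the question concrete, here is a minimal sketch in Python of an IV built only from the page LSN and block number. The 16-byte layout (8-byte LSN, 4-byte block number, zero padding) and the `page_iv` helper are assumptions for illustration and are not taken from the patches. It shows why a file-level copy keeps a valid IV while a copy that assigns new LSNs does not.

```python
import struct


def page_iv(lsn: int, block_number: int) -> bytes:
    """Build a 16-byte IV from a page LSN and block number (hypothetical layout)."""
    # Big-endian: 8-byte LSN, 4-byte block number, 4 zero bytes of padding.
    # Note that no database OID, tablespace OID, or relfilenode is involved.
    return struct.pack(">QI4x", lsn, block_number)


# A file-level copy keeps each page's LSN and block number, so the IV used
# to encrypt the page in the old database is still the right one in the new.
old_db_iv = page_iv(lsn=0x16B3748, block_number=7)
new_db_iv = page_iv(lsn=0x16B3748, block_number=7)
copy_needs_reencryption = old_db_iv != new_db_iv

# If the copy assigns a new LSN to the page (as asked about for WAL_LOG),
# the IV changes, so the page would have to be encrypted again.
wal_logged_iv = page_iv(lsn=0x16B9000, block_number=7)
wal_log_needs_reencryption = old_db_iv != wal_logged_iv

print(copy_needs_reencryption, wal_log_needs_reencryption)
```

Under these assumptions the first flag is false and the second is true, which is the case the documentation text above does not cover.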

+changes its pages, it gets new LSNs, and hence new IV.  Using only the
+LSN and page number also avoids requiring pg_upgrade to preserve
+database oids, tablespace oids, and relfilenodes.

I think this line needs to be changed, because now we are already
preserving dbid/tbsid/relfilenode.  So even though we are not using
those in IV there is no point in saying we are avoiding that
requirement.

I will review the remaining patches soon.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Moving forward with TDE

From

Jacob Champion

Date:

15 November 2022, 19:07:54

On Mon, Oct 24, 2022 at 9:29 AM David Christensen
<david.christensen@crunchydata.com> wrote:
> I would love to open a discussion about how to move forward and get
> some of these features built out.  The historical threads here are
> quite long and complicated; is there a "current state" other than the
> wiki that reflects the general thinking on this feature?  Any major
> developments in direction that would not be reflected in the code from
> May 2021?

I don't think the patchset here has incorporated the results of the
discussion [1] that happened at the end of 2021. For example, it looks
like AES-CTR is still in use for the pages, which I thought was
already determined to be insufficient.

The following next steps were proposed in that thread:

> 1. modify temporary file I/O to use a more centralized API
> 2. modify the existing cluster file encryption patch to use XTS with a
>    IV that uses more than the LSN
> 3. add XTS regression test code like CTR
> 4. create WAL encryption code using CTR

Does this patchset need review before those steps are taken (or was
there additional conversation/work that I missed)?
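
As background for step 2, one generally known weakness of counter-style modes is that encrypting two different page images under the same key and IV leaks the XOR of the two plaintexts. Whether this was the deciding argument in the referenced thread is not stated here. The sketch below uses a SHA-256 based keystream as a stand-in for AES-CTR (an assumption made so the example needs only the standard library); all names and values are hypothetical.

```python
import hashlib


def toy_keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Stand-in counter-mode keystream built from SHA-256 (NOT real AES-CTR)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def ctr_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # Counter-style encryption: ciphertext = plaintext XOR keystream.
    return xor_bytes(plaintext, toy_keystream(key, iv, len(plaintext)))


key = b"k" * 32
# IV derived only from an LSN and a block number (hypothetical values).
iv = (0x16B3748).to_bytes(8, "big") + (7).to_bytes(8, "big")

# Two versions of the same page written without the LSN changing.
page_v1 = b"balance=0000100;flags=0"
page_v2 = b"balance=0000100;flags=1"

c1 = ctr_encrypt(key, iv, page_v1)
c2 = ctr_encrypt(key, iv, page_v2)

# Anyone holding both ciphertexts learns the XOR of the plaintexts,
# including exactly which bytes changed, without knowing the key.
leak = xor_bytes(c1, c2)
assert leak == xor_bytes(page_v1, page_v2)
```

The keystream cancels out because it depends only on the key and IV, which is why an IV that "uses more than the LSN" matters whenever a page can change without its LSN changing.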

Thanks,
--Jacob

[1] https://www.postgresql.org/message-id/flat/20211013222648.GA373%40momjian.us

Re: Moving forward with TDE

From

David Christensen

Date:

15 November 2022, 19:39:27

> On Nov 15, 2022, at 1:08 PM, Jacob Champion <jchampion@timescale.com> wrote:
>
> On Mon, Oct 24, 2022 at 9:29 AM David Christensen
> <david.christensen@crunchydata.com> wrote:
>> I would love to open a discussion about how to move forward and get
>> some of these features built out.  The historical threads here are
>> quite long and complicated; is there a "current state" other than the
>> wiki that reflects the general thinking on this feature?  Any major
>> developments in direction that would not be reflected in the code from
>> May 2021?
>
> I don't think the patchset here has incorporated the results of the
> discussion [1] that happened at the end of 2021. For example, it looks
> like AES-CTR is still in use for the pages, which I thought was
> already determined to be insufficient.

Good to know about the next steps, thanks.

> The following next steps were proposed in that thread:
>
>> 1. modify temporary file I/O to use a more centralized API
>> 2. modify the existing cluster file encryption patch to use XTS with a
>>   IV that uses more than the LSN
>> 3. add XTS regression test code like CTR
>> 4. create WAL encryption code using CTR
>
> Does this patchset need review before those steps are taken (or was
> there additional conversation/work that I missed)?

This was just a refresh of the old patches on the wiki to work as written
on HEAD. If there are known TODOs here, then that work still needs to be
done.

I was going to take 2) and Stephen was going to work on 3); I am not sure
about the other two but will review the thread you pointed to. Thanks for
pointing that out.

David

Re: Moving forward with TDE

From

David Christensen

Date:

17 November 2022, 16:02:05

Hi Jacob,

Thanks, I've added this patch in my tree [1]. (For now, just adding
fixes and the like atop the original separate patches, but will
eventually get things winnowed down into probably the same 12 parts
the originals were reviewed in.)

Best,

David

[1] https://github.com/pgguru/postgres/tree/tde

Re: Moving forward with TDE

From

David Christensen

Date:

17 November 2022, 16:34:48

Hi Dilip,

Thanks for the feedback here. I will review the docs changes and add to my tree.

Best,

David

Re: Moving forward with TDE

From

vignesh C

Date:

06 January 2023, 06:27:19

On Fri, 4 Nov 2022 at 03:36, David Christensen
<david.christensen@crunchydata.com> wrote:
>
> > Unless somebody in the community remembers open questions/issues with
> > TDE that were never addressed I suggest simply iterating with our
> > usual testing/reviewing process. For now I'm going to change the
> > status of the CF entry [1] to "Waiting for Author" since the patchset
> > doesn't pass the CI [2].
>
> Thanks, enclosed is a new version that is rebased on HEAD and fixes a
> bug that the new pg_control_init() test picked up.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
b82557ecc2ebbf649142740a1c5ce8d19089f620 ===
=== applying patch
./v2-0004-cfe-04-common_over_cfe-03-scripts-squash-commit.patch
patching file src/common/Makefile
Hunk #2 FAILED at 84.
1 out of 2 hunks FAILED -- saving rejects to file src/common/Makefile.rej

[1] - http://cfbot.cputube.org/patch_41_3985.log

Regards,
Vignesh

Re: Moving forward with TDE

From

Chris Travers

Date:

07 March 2023, 03:07:16

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

I have decided to write a review here in terms of whether we want this feature, and perhaps the way we should look at
encryption as a project down the road, since I think this is only the beginning. I am hoping to run some full tests of
the feature sometime in coming weeks. Right now this review is limited to the documentation and documented feature.

From the documentation, the primary threat model of TDE is to prevent decryption of data from archived wal segments
(and data files), for example on a backup system. While there are other methods around this problem to date, I think
that this feature is worth pursuing for that reason. I want to address a couple of reasons for this and then go into
some reservations I have about how some of this is documented.

There are current workarounds to ensuring encryption at rest, but these have a number of problems. Encryption
passphrases end up lying around the system in various places. Key rotation is often difficult. And one mistake can
easily render all efforts ineffective. TDE solves these problems. The overall design from the internal docs looks
solid. This definitely is something I would recommend for many users.

I have a couple small caveats though. Encryption of data is a large topic and there isn't a one-size-fits-all solution
to industrial or state requirements. Having all this key management available in PostgreSQL is a very good thing. Long
run it is likely to end up being extensible, and therefore both more powerful and offering a wider range of choices for
solution architects. Implementing encryption is also something that is easy to mess up. For this reason I think it
would be great if we had a standardized format for discussing encryption options that we could use going forward. I
don't think that should be held against this patch but I think we need to start discussing it now because it will be a
bigger problem later.

A second caveat I have is that key management is a topic where you really need a good overview of internals in order to
implement effectively. If you don't know how an SSL handshake works or what is in a certificate, you can easily make
mistakes in setting up SSL. I can see the same thing happening here. For example, I don't think it would be safe to
leave the KEK on an encrypted filesystem that is decrypted at runtime (or at least I wouldn't consider that safe -- your
appetite for risk may vary).

My proposal would be to build a template for encryption options in the documentation. This could include topics
like SSL as well. In such a template we'd have sections like "Threat model," "How it works," "Implementation
Requirements" and so forth. Again I don't think this needs to be part of the current patch but I think it is something
we need to start thinking about now. Maybe after this goes in, I can present a proposed documentation patch.

I will also note that I don't consider myself to be very qualified on topics like encryption. I can reason about key
management to some extent but some implementation details may be beyond me. I would hope we could get some extra review
on this patch set soon.

Re: Moving forward with TDE

From

Stephen Frost

Date:

08 March 2023, 21:25:04

Greetings,

* Chris Travers (chris.travers@gmail.com) wrote:
> From the documentation, the primary threat model of TDE is to prevent decryption of data from archived wal segments
> (and data files), for example on a backup system. While there are other methods around this problem to date, I think
> that this feature is worth pursuing for that reason. I want to address a couple of reasons for this and then go into
> some reservations I have about how some of this is documented.

Agreed, though the latest efforts include an option for *authenticated*
encryption as well as unauthenticated. That makes it much more
difficult to make undetected changes to the data that's protected by
the authenticated encryption being used.
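
The difference can be sketched in a few lines of Python. This is an illustration only: it uses a SHA-256 based keystream plus HMAC-SHA256 (encrypt-then-MAC) as a stand-in for an AEAD mode such as AES-GCM, and the `seal` and `open_sealed` helpers, keys, and page contents are hypothetical rather than anything from the patch set.

```python
import hashlib
import hmac


def toy_keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Stand-in counter-mode keystream built from SHA-256 (NOT real AES)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def seal(enc_key: bytes, mac_key: bytes, iv: bytes, plaintext: bytes) -> tuple:
    """Encrypt, then compute a tag over the IV and ciphertext."""
    ciphertext = xor_bytes(plaintext, toy_keystream(enc_key, iv, len(plaintext)))
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def open_sealed(enc_key, mac_key, iv, ciphertext, tag) -> bytes:
    """Verify the tag before decrypting; refuse modified data."""
    expected = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("authentication tag mismatch: data was modified")
    return xor_bytes(ciphertext, toy_keystream(enc_key, iv, len(ciphertext)))


enc_key, mac_key, iv = b"e" * 32, b"m" * 32, b"\x00" * 16
ciphertext, tag = seal(enc_key, mac_key, iv, b"role=user ")

# An attacker without any key flips ciphertext bits at a known offset.
tampered = bytearray(ciphertext)
for i, d in enumerate(xor_bytes(b"user ", b"admin")):
    tampered[5 + i] ^= d
tampered = bytes(tampered)

# Unauthenticated decryption silently returns the altered plaintext.
unauthenticated = xor_bytes(tampered, toy_keystream(enc_key, iv, len(tampered)))

# Authenticated decryption rejects it.
try:
    open_sealed(enc_key, mac_key, iv, tampered, tag)
    detected = False
except ValueError:
    detected = True
```

In this sketch the unauthenticated path decrypts to `role=admin`, while the authenticated path raises an error, which is the "undetected changes" distinction drawn above.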

> There are current workarounds to ensuring encryption at rest, but these have a number of problems. Encryption
> passphrases end up lying around the system in various places. Key rotation is often difficult. And one mistake can
> easily render all efforts ineffective. TDE solves these problems. The overall design from the internal docs looks
> solid. This definitely is something I would recommend for many users.

There's clearly user demand for it as there's a number of organizations
who have forks which are providing it in one shape or another. This
kind of splintering of the community is actually an actively bad thing
for the project and is part of what killed Unix, by at least some pretty
reputable accounts, in my view.

> I have a couple small caveats though. Encryption of data is a large topic and there isn't a one-size-fits-all
> solution to industrial or state requirements. Having all this key management available in PostgreSQL is a very good
> thing. Long run it is likely to end up being extensible, and therefore both more powerful and offering a wider range of
> choices for solution architects. Implementing encryption is also something that is easy to mess up. For this reason I
> think it would be great if we had a standardized format for discussing encryption options that we could use going
> forward. I don't think that should be held against this patch but I think we need to start discussing it now because it
> will be a bigger problem later.

Do you have a suggestion as to the format to use?

> A second caveat I have is that key management is a topic where you really need a good overview of internals in order
> to implement effectively. If you don't know how an SSL handshake works or what is in a certificate, you can easily make
> mistakes in setting up SSL. I can see the same thing happening here. For example, I don't think it would be safe to
> leave the KEK on an encrypted filesystem that is decrypted at runtime (or at least I wouldn't consider that safe -- your
> appetite for risk may vary).

Agreed that we should document this and make clear that the KEK is
necessary for server start but absolutely should be kept as safe as
possible and certainly not stored on disk somewhere nearby the encrypted
cluster.

> My proposal would be to build a template for encryption options in the documentation. This could include topics
> like SSL as well. In such a template we'd have sections like "Threat model," "How it works," "Implementation
> Requirements" and so forth. Again I don't think this needs to be part of the current patch but I think it is something
> we need to start thinking about now. Maybe after this goes in, I can present a proposed documentation patch.

I'm not entirely sure that it makes sense to lump this and TLS in the
same place as they end up being rather independent at the end of the
day. If you have ideas for how to improve the documentation, I'd
certainly encourage you to go ahead and work on that and submit it as a
patch rather than waiting for this to actually land in core. Having
good and solid documentation is something that will help this get in,
after all, and to the extent that it's covering existing topics like
TLS, those could likely be included independently and that would be of
benefit to everyone.

> I will also note that I don't consider myself to be very qualified on topics like encryption. I can reason about key
> management to some extent but some implementation details may be beyond me. I would hope we could get some extra review
> on this patch set soon.

Certainly agree with you there though there's an overall trajectory of
patches involved in all of this that's a bit deep. The plan is to
discuss that at PGCon (On the Road to TDE) and at the PGCon
Unconference after. I certainly hope those interested will be there.
I'm also happy to have a call with anyone interested in this effort
independent of that, of course.

Thanks!

Stephen

Re: Moving forward with TDE

From

Bruce Momjian

Date:

27 March 2023, 16:38:29

On Wed, Mar  8, 2023 at 04:25:04PM -0500, Stephen Frost wrote:
> Agreed, though the latest efforts include an option for *authenticated*
> encryption as well as unauthenticated.  That makes it much more
> difficult to make undetected changes to the data that's protected by
> the authenticated encryption being used.

I thought some more about this.  GCM-style authentication of encrypted
data has value because it assumes the two end points are secure but that
a malicious actor could modify data during transfer.  In the Postgres
case, it seems the two end points and the transfer are all in the same
place.  Therefore, it is unclear to me the value of using GCM-style
authentication because if the GCM-level can be modified, so can the end
points, and the encryption key exposed.

> There's clearly user demand for it as there's a number of organizations
> who have forks which are providing it in one shape or another.  This
> kind of splintering of the community is actually an actively bad thing
> for the project and is part of what killed Unix, by at least some pretty
> reputable accounts, in my view.

Yes, the number of commercial implementations of this is a concern.  Of
course, it is also possible that those commercial implementations are
meeting checkbox requirements rather than technical ones, and the
community has been hostile to check box-only features.

> Certainly agree with you there though there's an overall trajectory of
> patches involved in all of this that's a bit deep.  The plan is to
> discuss that at PGCon (On the Road to TDE) and at the PGCon
> Unconference after.  I certainly hope those interested will be there.
> I'm also happy to have a call with anyone interested in this effort
> independent of that, of course.

I will not be attending Ottawa.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: Moving forward with TDE

From

Stephen Frost

Date:

27 March 2023, 22:01:56

Greetings,

On Mon, Mar 27, 2023 at 12:38 Bruce Momjian <bruce@momjian.us> wrote:

> On Wed, Mar 8, 2023 at 04:25:04PM -0500, Stephen Frost wrote:
> > Agreed, though the latest efforts include an option for *authenticated*
> > encryption as well as unauthenticated. That makes it much more
> > difficult to make undetected changes to the data that's protected by
> > the authenticated encryption being used.
>
> I thought some more about this. GCM-style authentication of encrypted
> data has value because it assumes the two end points are secure but that
> a malicious actor could modify data during transfer. In the Postgres
> case, it seems the two end points and the transfer are all in the same
> place. Therefore, it is unclear to me the value of using GCM-style
> authentication because if the GCM-level can be modified, so can the end
> points, and the encryption key exposed.

What are the two end points you are referring to and why don’t you feel there is an opportunity between them for a malicious actor to attack the system?

There are simpler cases to consider than an online attack on a single independent system where an attacker having access to modify the data in transit between PG and the storage would imply the attacker also having access to read keys out of PG’s memory.

As specific examples, consider:

An attack against the database system where the database server is shut down, or a backup, and the encryption key isn’t available on the system.

The backup system itself, not running as the PG user (an option supported by PG and at least pgbackrest) being compromised, thus allowing for injection of changes into a backup or into a restore.

The beginning of this discussion also very clearly had individuals voicing strong opinions that unauthenticated encryption methods were not acceptable as an end-state for PG due to the clear issue of there then being no protection against modification of data. The approach we are working towards provides both the unauthenticated option, which clearly has value to a large number of our collective user base considering the number of commercial implementations which have now arisen, and the authenticated solution which goes further and provides the level clearly expected of the PG community. This gets us a win-win situation.

> > There's clearly user demand for it as there's a number of organizations
> > who have forks which are providing it in one shape or another. This
> > kind of splintering of the community is actually an actively bad thing
> > for the project and is part of what killed Unix, by at least some pretty
> > reputable accounts, in my view.
>
> Yes, the number of commercial implementations of this is a concern. Of
> course, it is also possible that those commercial implementations are
> meeting checkbox requirements rather than technical ones, and the
> community has been hostile to check box-only features.

I’ve grown weary of this argument as the other major piece of work it was routinely applied to was RLS and yet that has certainly been seen broadly as a beneficial feature with users clearly leveraging it and in more than some “checkbox” way.

Indeed, it’s similar also in that commercial implementations were done of RLS while there were arguments made about it being a checkbox feature which were used to discourage it from being implemented in core. Were it truly checkbox, I don’t feel we would have the regular and ongoing discussion about it on the lists that we do, nor see other tools built on top of PG which specifically leverage it. Perhaps there are truly checkbox features out there which we will never implement, but I’m (perhaps due to what my dad would call selective listening on my part, perhaps not) having trouble coming up with any presently. Features that exist in other systems that we don’t want? Certainly. We don’t characterize those as simply “checkbox” though. Perhaps that’s in part because we provide alternatives- but that’s not the case here. We have no comparable way to have this capability as part of the core system.

We, as a community, are clearly losing value by lack of this capability, if by no other measure than simply the numerous users of the commercial implementations feeling that they simply can’t use PG without this feature, for whatever their reasoning.

Thanks,

Stephen

Re: Moving forward with TDE

From

Bruce Momjian

Date:

27 March 2023, 22:16:59

On Tue, Mar 28, 2023 at 12:01:56AM +0200, Stephen Frost wrote:
> Greetings,
> 
> On Mon, Mar 27, 2023 at 12:38 Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Wed, Mar  8, 2023 at 04:25:04PM -0500, Stephen Frost wrote:
>     > Agreed, though the latest efforts include an option for *authenticated*
>     > encryption as well as unauthenticated.  That makes it much more
>     > difficult to make undetected changes to the data that's protected by
>     > the authenticated encryption being used.
> 
>     I thought some more about this.  GCM-style authentication of encrypted
>     data has value because it assumes the two end points are secure but that
>     a malicious actor could modify data during transfer.  In the Postgres
>     case, it seems the two end points and the transfer are all in the same
>     place.  Therefore, it is unclear to me the value of using GCM-style
>     authentication because if the GCM-level can be modified, so can the end
>     points, and the encryption key exposed.
> 
> 
> What are the two end points you are referring to and why don’t you feel there
> is an opportunity between them for a malicious actor to attack the system?

Uh, TLS can use GCM and in this case you assume the sender and receiver
are secure, no?

> There are simpler cases to consider than an online attack on a single
> independent system where an attacker having access to modify the data in
> transit between PG and the storage would imply the attacker also having access
> to read keys out of PG’s memory. 

I consider the operating system and its processes as much more of a
single entity than TLS over a network.

> As specific examples, consider:
> 
> An attack against the database system where the database server is shut down,
> or a backup, and  the encryption key isn’t available on the system.
> 
> The backup system itself, not running as the PG user (an option supported by PG
> and at least pgbackrest) being compromised, thus allowing for injection of
> changes into a backup or into a restore.

I then question why we are not adding encryption to pg_basebackup or
pgbackrest rather than the database system.

> The beginning of this discussion also very clearly had individuals voicing
> strong opinions that unauthenticated encryption methods were not acceptable as
> an end-state for PG due to the clear issue of there then being no protection
> against modification of data.  The approach we are working towards provides

What were the _technical_ reasons for those objections?

> both the unauthenticated option, which clearly has value to a large number of
> our collective user base considering the number of commercial implementations
> which have now arisen, and the authenticated solution which goes further and
> provides the level clearly expected of the PG community. This gets us a win-win
> situation.
> 
>     > There's clearly user demand for it as there's a number of organizations
>     > who have forks which are providing it in one shape or another.  This
>     > kind of splintering of the community is actually an actively bad thing
>     > for the project and is part of what killed Unix, by at least some pretty
>     > reputable accounts, in my view.
> 
>     Yes, the number of commercial implementations of this is a concern.  Of
>     course, it is also possible that those commercial implementations are
>     meeting checkbox requirements rather than technical ones, and the
>     community has been hostile to check box-only features.
> 
> 
> I’ve grown weary of this argument as the other major piece of work it was
> routinely applied to was RLS and yet that has certainly been seen broadly as a
> beneficial feature with users clearly leveraging it and in more than some
> “checkbox” way.

RLS had to overcome that objection, and I think it did, and was better
for doing that.

> We, as a community, are clearly losing value by lack of this capability, if by
> no other measure than simply the numerous users of the commercial
> implementations feeling that they simply can’t use PG without this feature, for
> whatever their reasoning.

That is true, but I go back to my concern over useful feature vs. check
box.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: Moving forward with TDE

From

Stephen Frost

Date:

27 March 2023, 22:57:42

Greetings,

On Mon, Mar 27, 2023 at 18:17 Bruce Momjian <bruce@momjian.us> wrote:

> On Tue, Mar 28, 2023 at 12:01:56AM +0200, Stephen Frost wrote:
> > Greetings,
> >
> > On Mon, Mar 27, 2023 at 12:38 Bruce Momjian <bruce@momjian.us> wrote:
> >
> > > On Wed, Mar 8, 2023 at 04:25:04PM -0500, Stephen Frost wrote:
> > > > Agreed, though the latest efforts include an option for *authenticated*
> > > > encryption as well as unauthenticated. That makes it much more
> > > > difficult to make undetected changes to the data that's protected by
> > > > the authenticated encryption being used.
> > >
> > > I thought some more about this. GCM-style authentication of encrypted
> > > data has value because it assumes the two end points are secure but that
> > > a malicious actor could modify data during transfer. In the Postgres
> > > case, it seems the two end points and the transfer are all in the same
> > > place. Therefore, it is unclear to me the value of using GCM-style
> > > authentication because if the GCM-level can be modified, so can the end
> > > points, and the encryption key exposed.
> >
> > What are the two end points you are referring to and why don’t you feel there
> > is an opportunity between them for a malicious actor to attack the system?
>
> Uh, TLS can use GCM and in this case you assume the sender and receiver
> are secure, no?

TLS does use GCM.. pretty much exclusively as far as I can recall. So do a lot of other things though..

> > There are simpler cases to consider than an online attack on a single
> > independent system where an attacker having access to modify the data in
> > transit between PG and the storage would imply the attacker also having access
> > to read keys out of PG’s memory.
>
> I consider the operating system and its processes as much more of a
> single entity than TLS over a network.

This may be the case sometimes but there’s absolutely no shortage of other cases and it’s almost more the rule these days, that there is some kind of network between the OS processes and the storage- a SAN, an iSCSI network, NFS, are all quite common.

> As specific examples, consider:
>
> An attack against the database system where the database server is shut down,
> or a backup, and the encryption key isn’t available on the system.
>
> The backup system itself, not running as the PG user (an option supported by PG
> and at least pgbackrest) being compromised, thus allowing for injection of
> changes into a backup or into a restore.

I then question why we are not adding encryption to pg_basebackup or
pgbackrest rather than the database system.

Pgbackrest has encryption and authentication of it … but that doesn’t actually address the attack vector that I outlined. If the backup user is compromised then they can change the data before it gets to the storage. If the backup user is compromised then they have access to whatever key is used to encrypt and authenticate the backup and therefore can trivially manipulate the data.

Encryption of backups by the backup tool serves to protect the data after it leaves the backup system and is stored in cloud storage or in whatever format the repository takes. This is beneficial, particularly when the data itself offers no protection, but simply not the same.

> The beginning of this discussion also very clearly had individuals voicing
> strong opinions that unauthenticated encryption methods were not acceptable as
> an end-state for PG due to the clear issue of there then being no protection
> against modification of data. The approach we are working towards provides

What were the _technical_ reasons for those objections?

I believe largely the ones I’m bringing up here and which I outline above… I don’t mean to pretend that any of this is of my own independent construction. I don’t believe it is and my apologies if it came across that way.

> both the unauthenticated option, which clearly has value to a large number of
> our collective user base considering the number of commercial implementations
> which have now arisen, and the authenticated solution which goes further and
> provides the level clearly expected of the PG community. This gets us a win-win
> situation.
>
> > There's clearly user demand for it as there's a number of organizations
> > who have forks which are providing it in one shape or another. This
> > kind of splintering of the community is actually an actively bad thing
> > for the project and is part of what killed Unix, by at least some pretty
> > reputable accounts, in my view.
>
> Yes, the number of commercial implementations of this is a concern. Of
> course, it is also possible that those commercial implementations are
> meeting checkbox requirements rather than technical ones, and the
> community has been hostile to check box-only features.
>
>
> I’ve grown weary of this argument as the other major piece of work it was
> routinely applied to was RLS and yet that has certainly been seen broadly as a
> beneficial feature with users clearly leveraging it and in more than some
> “checkbox” way.

RLS had to overcome that objection, and I think it did, and was better
for doing that.

Beyond it being called a checkbox - what were the arguments against it? I don’t object to being challenged to point out the use cases, but I feel that at least some very clear and straightforward ones are outlined from what has been said above. I also don’t believe those are the only ones but I don’t think I could enumerate every use case for RLS either, even after seeing it used for quite a few years. I do seriously question the level of effort expected of features that are claimed to be “Checkbox” and tossed almost exclusively for that reason on this list given the success of the ones that have been accepted and are in active use by our users today.

> We, as a community, are clearly losing value by lack of this capability, if by
> no other measure than simply the numerous users of the commercial
> implementations feeling that they simply can’t use PG without this feature, for
> whatever their reasoning.

That is true, but I go back to my concern over useful feature vs. check
box.

While it’s easy to label something as checkbox, I don’t feel we have been fair to our users in doing so as it has historically prevented features which our users are demanding and end up getting from commercial providers until we implement them ultimately anyway. This particular argument simply doesn’t seem to actually hold the value that proponents of it claim, for us at least, and we have clear counter-examples which we can point to and I hope we learn from those.

Thanks!

Stephen

Re: Moving forward with TDE

From

Bruce Momjian

Date:

27 March 2023, 23:19:21

On Tue, Mar 28, 2023 at 12:57:42AM +0200, Stephen Frost wrote:
>     I consider the operating system and its processes as much more of a
>     single entity than TLS over a network.
> 
> This may be the case sometimes but there’s absolutely no shortage of other
> cases and it’s almost more the rule these days, that there is some kind of
> network between the OS processes and the storage- a SAN, an iSCSI network, NFS,
> are all quite common.

Yes, but consider that the database cluster is having to get its data
from that remote storage --- the remote storage is not an independent
entity that can be corrupted without the database server being
compromised. If everything in PGDATA was GCM-verified, it would be
secure, but because some parts are not, I don't think it would be.

>     > As specific examples, consider:
>     >
>     > An attack against the database system where the database server is shut
>     down,
>     > or a backup, and  the encryption key isn’t available on the system.
>     >
>     > The backup system itself, not running as the PG user (an option supported
>     by PG
>     > and at least pgbackrest) being compromised, thus allowing for injection
>     of
>     > changes into a backup or into a restore.
> 
>     I then question why we are not adding encryption to pg_basebackup or
>     pgbackrest rather than the database system.
> 
> Pgbackrest has encryption and authentication of it … but that doesn’t actually
> address the attack vector that I outlined. If the backup user is compromised
> then they can change the data before it gets to the storage.  If the backup
> user is compromised then they have access to whatever key is used to encrypt
> and authenticate the backup and therefore can trivially manipulate the data.

So the idea is that the backup user can be compromised without the data
being vulnerable --- makes sense, though that use-case seems narrow.

>     What were the _technical_ reasons for those objections?
> 
> I believe largely the ones I’m bringing up here and which I outline above… I
> don’t mean to pretend that any of this is of my own independent construction. I
> don’t believe it is and my apologies if it came across that way.

Yes, there is value beyond the check-box, but in most cases those
values are limited considering the complexity of the features, and the
check-box is what most people are asking for, I think.

>     > I’ve grown weary of this argument as the other major piece of work it was
>     > routinely applied to was RLS and yet that has certainly been seen broadly
>     as a
>     > beneficial feature with users clearly leveraging it and in more than some
>     > “checkbox” way.
> 
>     RLS had to overcome that objection, and I think it did, and was better
>     for doing that.
> 
> Beyond it being called a checkbox - what were the arguments against it?  I

The RLS arguments were that queries could expose some of the underlying
data, but in summary, that was considered acceptable.

>     > We, as a community, are clearly losing value by lack of this capability,
>     if by
>     > no other measure than simply the numerous users of the commercial
>     > implementations feeling that they simply can’t use PG without this
>     feature, for
>     > whatever their reasoning.
> 
>     That is true, but I go back to my concern over useful feature vs. check
>     box.
> 
> While it’s easy to label something as checkbox, I don’t feel we have been fair

No, actually, it isn't.  I am not sure why you are saying that.

> to our users in doing so as it has historically prevented features which our
> users are demanding and end up getting from commercial providers until we
> implement them ultimately anyway.  This particular argument simply doesn’t seem
> to actually hold the value that proponents of it claim, for us at least, and we
> have clear counter-examples which we can point to and I hope we learn from
> those.

I don't think you are addressing actual issues above.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: Moving forward with TDE

From

Stephen Frost

Date:

28 March 2023, 00:03:50

Greetings,

On Mon, Mar 27, 2023 at 19:19 Bruce Momjian <bruce@momjian.us> wrote:

On Tue, Mar 28, 2023 at 12:57:42AM +0200, Stephen Frost wrote:
> I consider the operating system and its processes as much more of a
> single entity than TLS over a network.
>
> This may be the case sometimes but there’s absolutely no shortage of other
> cases and it’s almost more the rule these days, that there is some kind of
> network between the OS processes and the storage- a SAN, an iSCSI network, NFS,
> are all quite common.

Yes, but consider that the database cluster is having to get its data
from that remote storage --- the remote storage is not an independent
entity that can be corrupted without the database server being
compromised. If everything in PGDATA was GCM-verified, it would be
secure, but because some parts are not, I don't think it would be.

The remote storage is certainly an independent system. Multi-mount LUNs are entirely possible in a SAN (and absolutely with NFS, or just the NFS server itself is compromised..), so while the attacker may not have any access to the database server itself, they may have access to these other systems, and that’s not even considering in-transit attacks which are also absolutely possible, especially with iSCSI or NFS.
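
As a rough illustration of the storage-side case (a sketch, not anything from the patch set): the snippet below uses HMAC-SHA256 from the Python standard library as a stand-in for the authentication tag that GCM would produce, and it does not encrypt anything. The function names, the 8-byte page-number prefix, and the idea of binding the tag to a page number are assumptions made for the example. It shows only that a party who can write to the files but does not hold the key cannot alter a page, or move it to a different slot, without the reader noticing.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # size of an HMAC-SHA256 tag in bytes


def seal_page(key: bytes, page_no: int, payload: bytes) -> bytes:
    """Append a tag computed over the page number and the payload."""
    msg = page_no.to_bytes(8, "big") + payload
    tag = hmac.new(key, msg, hashlib.sha256).digest()
    return payload + tag


def open_page(key: bytes, page_no: int, stored: bytes) -> bytes:
    """Verify the tag and return the payload; raise ValueError on mismatch."""
    payload, tag = stored[:-TAG_LEN], stored[-TAG_LEN:]
    msg = page_no.to_bytes(8, "big") + payload
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError(f"page {page_no}: authentication failed")
    return payload


def flip_byte(stored: bytes, offset: int) -> bytes:
    """Simulate a storage-side writer who has file access but no key."""
    tampered = bytearray(stored)
    tampered[offset] ^= 0x01
    return bytes(tampered)


if __name__ == "__main__":
    key = os.urandom(32)  # held by the database server only
    stored = seal_page(key, 7, b"opaque page contents")
    print(open_page(key, 7, stored))  # untouched page verifies
    try:
        open_page(key, 7, flip_byte(stored, 0))
    except ValueError as exc:
        print("detected:", exc)
```

Without a tag of this kind, the flipped byte would simply be decrypted into different plaintext and nothing would flag it.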

I don’t understand what is being claimed that the remote storage is “not an independent system” based on my understanding of, eg, NFS. With NFS, a directory on the NFS server is exported and the client mounts that directory as NFS locally, all over a network which may or may not be secured against manipulation. A user on the NFS server with root access is absolutely able to access and modify files on the NFS server trivially, even if they have no access to the PG server. Would you explain what you mean?

I do agree that the ideal case would be that everything we can encrypt in the PGDATA directory (not everything can be, for various reasons, but we don’t actually need that either) is encrypted and authenticated, just like it would be ideal if everything was checksum’d, which it isn’t today. We are progressing in that direction thanks to efforts such as reworking the other subsystems to use shared buffers and a consistent page format, but just like with checksums we do not need to have the perfect solution for us to provide a lot of value here- and our users know that, as the same is true of the unauthenticated encryption approaches being offered by the commercial solutions.

> > As specific examples, consider:
> >
> > An attack against the database system where the database server is shut
> down,
> > or a backup, and the encryption key isn’t available on the system.
> >
> > The backup system itself, not running as the PG user (an option supported
> by PG
> > and at least pgbackrest) being compromised, thus allowing for injection
> of
> > changes into a backup or into a restore.
>
> I then question why we are not adding encryption to pg_basebackup or
> pgbackrest rather than the database system.
>
> Pgbackrest has encryption and authentication of it … but that doesn’t actually
> address the attack vector that I outlined. If the backup user is compromised
> then they can change the data before it gets to the storage. If the backup
> user is compromised then they have access to whatever key is used to encrypt
> and authenticate the backup and therefore can trivially manipulate the data.

So the idea is that the backup user can be compromised without the data
being vulnerable --- makes sense, though that use-case seems narrow.

That’s perhaps a fair consideration- but it’s clearly of enough value that many of our users are asking for it and not using PG because we don’t have it today. Ultimately though, this clearly makes it more than a “checkbox” feature. I hope we are able to agree on that now.

> What were the _technical_ reasons for those objections?
>
> I believe largely the ones I’m bringing up here and which I outline above… I
> don’t mean to pretend that any of this is of my own independent construction. I
> don’t believe it is and my apologies if it came across that way.

Yes, there is value beyond the check-box, but in most cases those
values are limited considering the complexity of the features, and the
check-box is what most people are asking for, I think.

For the users who ask on the lists for this feature, regularly, how many don’t ask because they google or find prior responses on the list to the question of if we have this capability? How do we know that their cases are “checkbox”? Consider that there are standards groups which explicitly consider these attack vectors and consider them important enough to require mitigations to address those vectors. Do the end users of PG understand the attack vectors or why they matter? Perhaps not, but just because they can’t articulate the reasoning does NOT mean that the attack vector doesn’t exist or that their environment is somehow immune to it- indeed, as the standards bodies surely know, the opposite is true- they’re almost certainly at risk of those attack vectors and therefore the standards bodies are absolutely justified in requiring them to provide a solution. Treating these users as unimportant because they don’t have the depth of understanding that we do or that the standards body does is not helping them- it’s actively driving them away from PG.

> > I’ve grown weary of this argument as the other major piece of work it was
> > routinely applied to was RLS and yet that has certainly been seen broadly
> as a
> > beneficial feature with users clearly leveraging it and in more than some
> > “checkbox” way.
>
> RLS had to overcome that objection, and I think it did, and was better
> for doing that.
>
> Beyond it being called a checkbox - what were the arguments against it? I

The RLS arguments were that queries could expose some of the underlying
data, but in summary, that was considered acceptable.

This is an excellent point- and dovetails very nicely into my argument that protecting primary data (what is provided by users and ends up in indexes and heaps) is valuable even if we don’t (yet..) have protection for other parts of the system. Reducing the size of the attack vector is absolutely useful, especially when it’s such a large amount of the data in the system. Yes, we should, and will, continue to improve- as we do with many features, but we don’t need to wait for perfection to include this feature, just as with RLS and numerous other features we have.

> > We, as a community, are clearly losing value by lack of this capability,
> if by
> > no other measure than simply the numerous users of the commercial
> > implementations feeling that they simply can’t use PG without this
> feature, for
> > whatever their reasoning.
>
> That is true, but I go back to my concern over useful feature vs. check
> box.
>
> While it’s easy to label something as checkbox, I don’t feel we have been fair

No, actually, it isn't. I am not sure why you are saying that.

I’m confused as to what is required to label a feature as a “checkbox” feature then. What did you use to make that determination of this feature? I’m happy to be wrong here.

> to our users in doing so as it has historically prevented features which our
> users are demanding and end up getting from commercial providers until we
> implement them ultimately anyway. This particular argument simply doesn’t seem
> to actually hold the value that proponents of it claim, for us at least, and we
> have clear counter-examples which we can point to and I hope we learn from
> those.

I don't think you are addressing actual issues above.

Specifics would be really helpful. I don’t doubt that there are things I’m missing, but I’ve tried to address each point raised clearly and concisely.

Thanks!

Stephen

Re: Moving forward with TDE

From

Bruce Momjian

Date:

28 March 2023, 01:35:38

On Tue, Mar 28, 2023 at 02:03:50AM +0200, Stephen Frost wrote:
> The remote storage is certainly an independent system. Multi-mount LUNs are
> entirely possible in a SAN (and absolutely with NFS, or just the NFS server
> itself is compromised..), so while the attacker may not have any access to the
> database server itself, they may have access to these other systems, and that’s
> not even considering in-transit attacks which are also absolutely possible,
> especially with iSCSI or NFS. 
> 
> I don’t understand what is being claimed that the remote storage is “not an
> independent system” based on my understanding of, eg, NFS. With NFS, a
> directory on the NFS server is exported and the client mounts that directory as
> NFS locally, all over a network which may or may not be secured against
> manipulation.  A user on the NFS server with root access is absolutely able to
> access and modify files on the NFS server trivially, even if they have no
> access to the PG server.  Would you explain what you mean?

The point is that someone could change values in the storage, pg_xact,
encryption settings, binaries, that would allow the attacker to learn
the encryption key.  This is not possible for two secure endpoints and
someone changing data in transit.  Yeah, it took me a while to
understand these boundaries too.

>     So the idea is that the backup user can be compromised without the data
>     being vulnerable --- makes sense, though that use-case seems narrow.
> 
> That’s perhaps a fair consideration- but it’s clearly of enough value that many
> of our users are asking for it and not using PG because we don’t have it today.
> Ultimately though, this clearly makes it more than a “checkbox” feature. I hope
> we are able to agree on that now.

It is more than a check box feature, yes, but I am guessing few people
are wanting this for the actual features beyond check box.

>     Yes, there is value beyond the check-box, but in most cases those
>     values are limited considering the complexity of the features, and the
>     check-box is what most people are asking for, I think.
> 
> For the users who ask on the lists for this feature, regularly, how many don’t
> ask because they google or find prior responses on the list to the question of
> if we have this capability?  How do we know that their cases are “checkbox”? 

Because I have rarely heard people articulate the value beyond check
box.

> Consider that there are standards groups which explicitly consider these attack
> vectors and consider them important enough to require mitigations to address
> those vectors. Do the end users of PG understand the attack vectors or why they
> matter?  Perhaps not, but just because they can’t articulate the reasoning does
> NOT mean that the attack vector doesn’t exist or that their environment is
> somehow immune to it- indeed, as the standards bodies surely know, the opposite
> is true- they’re almost certainly at risk of those attack vectors and therefore
> the standards bodies are absolutely justified in requiring them to provide a
> solution. Treating these users as unimportant because they don’t have the depth
> of understanding that we do or that the standards body does is not helping
> them- it’s actively driving them away from PG. 

Well, then who is going to explain them here, because I have not heard
them yet.

>     The RLS arguments were that queries could expose some of the underlying
>     data, but in summary, that was considered acceptable.
> 
> This is an excellent point- and dovetails very nicely into my argument that
> protecting primary data (what is provided by users and ends up in indexes and
> heaps) is valuable even if we don’t (yet..) have protection for other parts of
> the system. Reducing the size of the attack vector is absolutely useful,
> especially when it’s such a large amount of the data in the system. Yes, we
> should, and will, continue to improve- as we do with many features, but we
> don’t need to wait for perfection to include this feature, just as with RLS and
> numerous other features we have. 

The issue is that you needed a certain type of user with a certain type
of access to break RLS, while for this, writing to PGDATA is the simple
case for all the breakage, and the thing we are protecting with
authentication.

>     >     > We, as a community, are clearly losing value by lack of this
>     capability,
>     >     if by
>     >     > no other measure than simply the numerous users of the commercial
>     >     > implementations feeling that they simply can’t use PG without this
>     >     feature, for
>     >     > whatever their reasoning.
>     >
>     >     That is true, but I go back to my concern over useful feature vs.
>     check
>     >     box.
>     >
>     > While it’s easy to label something as checkbox, I don’t feel we have been
>     fair
> 
>     No, actually, it isn't.  I am not sure why you are saying that.
> 
> I’m confused as to what is required to label a feature as a “checkbox” feature
> then. What did you use to make that determination of this feature?  I’m happy to
> be wrong here. 

I don't see the point in me continuing to reply here.  You just seem to
continue asking questions without actually thinking of what I am saying,
and hope I get tired or something.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: Moving forward with TDE

From

Stephen Frost

Date:

28 March 2023, 02:56:58

Greetings,

On Mon, Mar 27, 2023 at 21:35 Bruce Momjian <bruce@momjian.us> wrote:

On Tue, Mar 28, 2023 at 02:03:50AM +0200, Stephen Frost wrote:
> The remote storage is certainly an independent system. Multi-mount LUNs are
> entirely possible in a SAN (and absolutely with NFS, or just the NFS server
> itself is compromised..), so while the attacker may not have any access to the
> database server itself, they may have access to these other systems, and that’s
> not even considering in-transit attacks which are also absolutely possible,
> especially with iSCSI or NFS.
>
> I don’t understand what is being claimed that the remote storage is “not an
> independent system” based on my understanding of, eg, NFS. With NFS, a
> directory on the NFS server is exported and the client mounts that directory as
> NFS locally, all over a network which may or may not be secured against
> manipulation. A user on the NFS server with root access is absolutely able to
> access and modify files on the NFS server trivially, even if they have no
> access to the PG server. Would you explain what you mean?

The point is that someone could change values in the storage, pg_xact,
encryption settings, binaries, that would allow the attacker to learn
the encryption key. This is not possible for two secure endpoints and
someone changing data in transit. Yeah, it took me a while to
understand these boundaries too.

This depends on the specific configuration of the systems, clearly. Being able to change values in other parts of the system isn’t great and we should work to improve on that, but clearly that isn’t so much of an issue that people aren’t willing to accept a partial solution or existing commercial solutions wouldn’t be accepted or considered viable. Indeed, using GCM is objectively an improvement over what’s being offered commonly today.

I also generally object to the idea that being able to manipulate the PGDATA directory necessarily means being able to gain access to the KEK. In trivial solutions, sure, it’s possible, but the NFS server should never be asking some external KMS for the key to a given DB server and a reasonable implementation won’t allow this, and instead would flag and log such an attempt for someone to review, leading to a much faster realization of a compromised system.
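
A hypothetical sketch of that behaviour follows. `ToyKMS`, its fields, and the host names are invented for illustration, and a real KMS would authenticate the caller (for example with client certificates) rather than trust a plain string. The point is only that a request for the KEK from anything other than the registered database host is refused and logged for review.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyKMS:
    """Toy key service: releases a KEK only to the host registered for it."""

    keks: Dict[str, bytes]          # cluster id -> key encryption key
    allowed_host: Dict[str, str]    # cluster id -> host permitted to fetch it
    audit_log: List[str] = field(default_factory=list)

    def fetch_kek(self, cluster_id: str, requesting_host: str) -> bytes:
        """Return the KEK, or log the attempt and raise PermissionError."""
        if self.allowed_host.get(cluster_id) != requesting_host:
            # Refuse, and leave a record for someone to review.
            self.audit_log.append(f"DENIED {requesting_host} -> {cluster_id}")
            raise PermissionError(
                f"{requesting_host} may not fetch the key for {cluster_id}"
            )
        self.audit_log.append(f"GRANTED {requesting_host} -> {cluster_id}")
        return self.keks[cluster_id]


if __name__ == "__main__":
    kms = ToyKMS(
        keks={"pg-main": b"k" * 32},
        allowed_host={"pg-main": "db1.example"},
    )
    kms.fetch_kek("pg-main", "db1.example")      # the database server
    try:
        kms.fetch_kek("pg-main", "nfs1.example")  # the storage server
    except PermissionError as exc:
        print("refused:", exc)
    print(kms.audit_log)
```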

Certainly it’s much simpler to reason about an attacker with no knowledge of either system and only network access to see if they can penetrate the communications between the two end-points, but that is not the only case where authenticated encryption is useful.

> So the idea is that the backup user can be compromised without the data
> being vulnerable --- makes sense, though that use-case seems narrow.
>
> That’s perhaps a fair consideration- but it’s clearly of enough value that many
> of our users are asking for it and not using PG because we don’t have it today.
> Ultimately though, this clearly makes it more than a “checkbox” feature. I hope
> we are able to agree on that now.

It is more than a check box feature, yes, but I am guessing few people
are wanting this for the actual features beyond check box.

As I explained previously, perhaps the people asking are doing so for only the “checkbox”, but that doesn’t mean it isn’t a useful feature or that it isn’t valuable in its own right. Those checklists were compiled and enforced for a reason, which the end users might not understand but is still absolutely valuable. Sad to say, but frankly this is becoming more and more common but we shouldn’t be faulting the users asking for it- if it were truly useless then eventually it would be removed from the standard, but it hasn’t and it won’t be because, while not every end user has a depth of understanding to explain it, it is actually a useful and important capability to have and one that is important to implement.

> Yes, there is value beyond the check-box, but in most cases those
> values are limited considering the complexity of the features, and the
> check-box is what most people are asking for, I think.
>
> For the users who ask on the lists for this feature, regularly, how many don’t
> ask because they google or find prior responses on the list to the question of
> if we have this capability? How do we know that their cases are “checkbox”?

Because I have rarely heard people articulate the value beyond check
box.

Have I done so sufficiently then that we can agree that calling it “checkbox” is inappropriate and detrimental to our user base?

> Consider that there are standards groups which explicitly consider these attack
> vectors and consider them important enough to require mitigations to address
> those vectors. Do the end users of PG understand the attack vectors or why they
> matter? Perhaps not, but just because they can’t articulate the reasoning does
> NOT mean that the attack vector doesn’t exist or that their environment is
> somehow immune to it- indeed, as the standards bodies surely know, the opposite
> is true- they’re almost certainly at risk of those attack vectors and therefore
> the standards bodies are absolutely justified in requiring them to provide a
> solution. Treating these users as unimportant because they don’t have the depth
> of understanding that we do or that the standards body does is not helping
> them- it’s actively driving them away from PG.

Well, then who is going to explain them here, because I have not heard
them yet.

I thought I was doing so.

> The RLS arguments were that queries could expose some of the underlying
> data, but in summary, that was considered acceptable.
>
> This is an excellent point- and dovetails very nicely into my argument that
> protecting primary data (what is provided by users and ends up in indexes and
> heaps) is valuable even if we don’t (yet..) have protection for other parts of
> the system. Reducing the size of the attack vector is absolutely useful,
> especially when it’s such a large amount of the data in the system. Yes, we
> should, and will, continue to improve- as we do with many features, but we
> don’t need to wait for perfection to include this feature, just as with RLS and
> numerous other features we have.

The issue is that you needed a certain type of user with a certain type
of access to break RLS, while for this, writing to PGDATA is the simple
case for all the breakage, and the thing we are protecting with
authentication.

This goes back to the “if it isn’t perfect then it’s useless” argument … but that’s exactly the discussion which was had around RLS and ultimately we decided that RLS was still useful even with the leaks- and our users accepted that also and have benefitted from it ever since it was included in core. The same exists here- yes, more needs to be done than the absolute simplest “make install” to have the system be secure (not unlike today with our defaults from a source build with “make install”..) but at least with this capability included it’s possible, and we can write “securing PostgreSQL” documentation on how to, whereas without it there is simply no way to address the attack vectors I’ve articulated here.

> > > We, as a community, are clearly losing value by lack of this
> capability,
> > if by
> > > no other measure than simply the numerous users of the commercial
> > > implementations feeling that they simply can’t use PG without this
> > feature, for
> > > whatever their reasoning.
> >
> > That is true, but I go back to my concern over useful feature vs.
> check
> > box.
> >
> > While it’s easy to label something as checkbox, I don’t feel we have been
> fair
>
> No, actually, it isn't. I am not sure why you are saying that.
>
> I’m confused as to what is required to label a feature as a “checkbox” feature
> then. What did you use to make that determination of this feature? I’m happy to
> be wrong here.

I don't see the point in me continuing to reply here. You just seem to
continue asking questions without actually thinking of what I am saying,
and hope I get tired or something.

I hope we have others who have a moment to chime in here and provide their viewpoints as I don’t feel this is an accurate representation of the discussion thus far.

Thanks,

Stephen

Re: Moving forward with TDE

From

Chris Travers

Date:

28 March 2023, 07:28:40

On Tue, Mar 28, 2023 at 5:02 AM Stephen Frost <sfrost@snowman.net> wrote:

> There's clearly user demand for it as there's a number of organizations
> who have forks which are providing it in one shape or another. This
> kind of splintering of the community is actually an actively bad thing
> for the project and is part of what killed Unix, by at least some pretty
> reputable accounts, in my view.

Yes, the number of commercial implementations of this is a concern. Of
course, it is also possible that those commercial implementations are
meeting checkbox requirements rather than technical ones, and the
community has been hostile to check box-only features.

I’ve grown weary of this argument as the other major piece of work it was routinely applied to was RLS and yet that has certainly been seen broadly as a beneficial feature with users clearly leveraging it and in more than some “checkbox” way.

Indeed, it’s similar also in that commercial implementations were done of RLS while there were arguments made about it being a checkbox feature which were used to discourage it from being implemented in core. Were it truly checkbox, I don’t feel we would have the regular and ongoing discussion about it on the lists that we do, nor see other tools built on top of PG which specifically leverage it. Perhaps there are truly checkbox features out there which we will never implement, but I’m (perhaps due to what my dad would call selective listening on my part, perhaps not) having trouble coming up with any presently. Features that exist in other systems that we don’t want? Certainly. We don’t characterize those as simply “checkbox” though. Perhaps that’s in part because we provide alternatives- but that’s not the case here. We have no comparable way to have this capability as part of the core system.

We, as a community, are clearly losing value by lack of this capability, if by no other measure than simply the numerous users of the commercial implementations feeling that they simply can’t use PG without this feature, for whatever their reasoning.

I also think this is something of a problem because very few requirements are actually purely technical requirements, and I think the issue is that in many cases there are ways around the lack of the feature.

So I would phrase this differently. What is the value of doing this in core?

This dramatically simplifies the question of setting up a PostgreSQL environment that is properly protected with encryption at rest. That in itself is valuable. Today you can accomplish something similar with encrypted filesystems and encryption options in things like pgbackrest. However, these are many different pieces of a solution, and messing up the setup of any one of them can compromise the data. Having a single point of encryption and decryption means fewer opportunities to mess it up and that means less risk. This in turn makes it easier to settle on using PostgreSQL.

There are certainly going to be those who approach encryption at rest as a checkbox item and who don't really care if there are holes in it. But there are others who really should be concerned (and this is becoming a bigger issue where data privacy, PCI-DSS, and other requirements may come into play), and those need better tooling than we have. I also think that as data privacy becomes a larger issue, this will become a larger topic.

Anyway, my contribution to that question.

Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.

http://www.efficito.com/learn_more

Re: Moving forward with TDE

From

Chris Travers

Date:

28 March 2023, 07:48:37

On Tue, Mar 28, 2023 at 8:35 AM Bruce Momjian <bruce@momjian.us> wrote:

On Tue, Mar 28, 2023 at 02:03:50AM +0200, Stephen Frost wrote:
> The remote storage is certainly an independent system. Multi-mount LUNs are
> entirely possible in a SAN (and absolutely with NFS, or just the NFS server
> itself is compromised..), so while the attacker may not have any access to the
> database server itself, they may have access to these other systems, and that’s
> not even considering in-transit attacks which are also absolutely possible,
> especially with iSCSI or NFS.
>
> I don’t understand what is being claimed that the remote storage is “not an
> independent system” based on my understanding of, eg, NFS. With NFS, a
> directory on the NFS server is exported and the client mounts that directory as
> NFS locally, all over a network which may or may not be secured against
> manipulation. A user on the NFS server with root access is absolutely able to
> access and modify files on the NFS server trivially, even if they have no
> access to the PG server. Would you explain what you mean?

The point is that someone could change values in the storage, pg_xact,
encryption settings, binaries, that would allow the attacker to learn
the encryption key. This is not possible for two secure endpoints and
someone changing data in transit. Yeah, it took me a while to
understand these boundaries too.

> So the idea is that the backup user can be compromised without the data
> being vulnerable --- makes sense, though that use-case seems narrow.
>
> That’s perhaps a fair consideration- but it’s clearly of enough value that many
> of our users are asking for it and not using PG because we don’t have it today.
> Ultimately though, this clearly makes it more than a “checkbox” feature. I hope
> we are able to agree on that now.

It is more than a check box feature, yes, but I am guessing few people
are wanting this for the actual features beyond check box.

> Yes, there is value beyond the check-box, but in most cases those
> values are limited considering the complexity of the features, and the
> check-box is what most people are asking for, I think.
>
> For the users who ask on the lists for this feature, regularly, how many don’t
> ask because they google or find prior responses on the list to the question of
> if we have this capability? How do we know that their cases are “checkbox”?

Because I have rarely heard people articulate the value beyond check
box.

I think there is value. I am going to try to articulate a case for this here.

The first is that if people just want a "checkbox" then they can implement PostgreSQL in ways that have encryption at rest today. This includes using LUKS and the encryption options in pgbackrest. That's good enough for a checkbox. It isn't good enough for a real, secured instance however.

There are a few problems with trying to do this for a secured instance. The first is that you have multiple links in the encryption chain, and the failure of any one of them will lead to cleartext exposure of data files. This is not a problem for those who just want to tick a checkbox. Also the fact that backups and main systems are separately encrypted there (if the backups are encrypted at all) means that people have to choose between complicating a restore process and simply ditching encryption on the backup, which makes the checkbox somewhat pointless.

Where I have usually seen this come up is in the question of "how do you prevent the problem of someone pulling storage devices from your servers and taking them away to compromise your data?" Physical security comes into it but often times people want more than that as an answer. I saw questions like that from external auditors when I was at Adjust.

If you want to actually address that problem, then the current tooling is quite cumbersome. Yes you can do it, but it is very hard to make sure it has been fully secured and also very hard to monitor. TDE would make the setup and verification of this much easier. And in particular it solves a number of other issues that I can see arising from LUKS and similar approaches since it doesn't rely on the kernel to be able to translate plaintext to and from ciphertext.

I have actually worked with folks who have PII and need to protect it and who currently use LUKS and pgbackrest to do so. I would be extremely happy to see TDE replace those for their needs. I can imagine that those who hold high value data would use it as well instead of these other more error prone and less secure setups.

> Consider that there are standards groups which explicitly consider these attack
> vectors and consider them important enough to require mitigations to address
> those vectors. Do the end users of PG understand the attack vectors or why they
> matter? Perhaps not, but just because they can’t articulate the reasoning does
> NOT mean that the attack vector doesn’t exist or that their environment is
> somehow immune to it- indeed, as the standards bodies surely know, the opposite
> is true- they’re almost certainly at risk of those attack vectors and therefore
> the standards bodies are absolutely justified in requiring them to provide a
> solution. Treating these users as unimportant because they don’t have the depth
> of understanding that we do or that the standards body does is not helping
> them- it’s actively driving them away from PG.

Well, then who is going to explain them here, because I have not heard
them yet.

> The RLS arguments were that queries could expose some of the underlying
> data, but in summary, that was considered acceptable.
>
> This is an excellent point- and dovetails very nicely into my argument that
> protecting primary data (what is provided by users and ends up in indexes and
> heaps) is valuable even if we don’t (yet..) have protection for other parts of
> the system. Reducing the size of the attack vector is absolutely useful,
> especially when it’s such a large amount of the data in the system. Yes, we
> should, and will, continue to improve- as we do with many features, but we
> don’t need to wait for perfection to include this feature, just as with RLS and
> numerous other features we have.

The issue is that you needed a certain type of user with a certain type
of access to break RLS, while for this, writing to PGDATA is the simple
case for all the breakage, and the thing we are protecting with
authentication.

> > > We, as a community, are clearly losing value by lack of this capability,
> > > if by no other measure than simply the numerous users of the commercial
> > > implementations feeling that they simply can’t use PG without this feature,
> > > for whatever their reasoning.
> >
> > That is true, but I go back to my concern over useful feature vs. check
> > box.
> >
> > While it’s easy to label something as checkbox, I don’t feel we have been fair
>
> No, actually, it isn't. I am not sure why you are saying that.
>
> I’m confused as to what is required to label a feature as a “checkbox” feature
> then. What did you use to make that determination of this feature? I’m happy to
> be wrong here.

I don't see the point in me continuing to reply here. You just seem to
continue asking questions without actually thinking of what I am saying,
and hope I get tired or something.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

Best Wishes,

Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.