Re: Moving forward with TDE - Mailing list pgsql-hackers

From Chris Travers
Subject Re: Moving forward with TDE
Date
Msg-id CAKt_Zfuh3GWhXBb-mTkp6Cdg56GhZENtKAQXm-z2cnHJUbu1xg@mail.gmail.com
In response to Re: Moving forward with TDE  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers


On Tue, Mar 28, 2023 at 8:35 AM Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Mar 28, 2023 at 02:03:50AM +0200, Stephen Frost wrote:
> The remote storage is certainly an independent system. Multi-mount LUNs are
> entirely possible in a SAN (and absolutely with NFS, or just the NFS server
> itself is compromised..), so while the attacker may not have any access to the
> database server itself, they may have access to these other systems, and that’s
> not even considering in-transit attacks which are also absolutely possible,
> especially with iSCSI or NFS. 
>
> I don’t understand what is being claimed that the remote storage is “not an
> independent system” based on my understanding of, eg, NFS. With NFS, a
> directory on the NFS server is exported and the client mounts that directory as
> NFS locally, all over a network which may or may not be secured against
> manipulation.  A user on the NFS server with root access is absolutely able to
> access and modify files on the NFS server trivially, even if they have no
> access to the PG server.  Would you explain what you mean?

The point is that someone could change values in the storage, pg_xact,
encryption settings, binaries, that would allow the attacker to learn
the encryption key.  This is not possible for two secure endpoints and
someone changing data in transit.  Yeah, it took me a while to
understand these boundaries too.

>     So the idea is that the backup user can be compromised without the data
>     being vulnerable --- makes sense, though that use-case seems narrow.
>
> That’s perhaps a fair consideration- but it’s clearly of enough value that many
> of our users are asking for it and not using PG because we don’t have it today.
> Ultimately though, this clearly makes it more than a “checkbox” feature. I hope
> we are able to agree on that now.

It is more than a check box feature, yes, but I am guessing few people
are wanting this for the actual features beyond check box.

>     Yes, there is value beyond the check-box, but in most cases those
>     values are limited considering the complexity of the features, and the
>     check-box is what most people are asking for, I think.
>
> For the users who ask on the lists for this feature, regularly, how many don’t
> ask because they google or find prior responses on the list to the question of
> if we have this capability?  How do we know that their cases are “checkbox”? 

Because I have rarely heard people articulate the value beyond check
box.

I think there is value.  I am going to try to articulate a case for this here.

The first point is that if people just want a "checkbox", they can already deploy PostgreSQL with encryption at rest today, for example by using LUKS together with the encryption options in pgbackrest.  That's good enough for a checkbox.  It isn't good enough for a real, secured instance, however.
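
As a rough illustration of the pgbackrest half of such a setup, repository encryption is a couple of lines in pgbackrest.conf. This is a minimal sketch, not a vetted configuration: the stanza name "main", both paths, and the passphrase are placeholders, and it covers only the backup repository. The data directory itself would still depend on LUKS or similar.

```ini
# pgbackrest.conf (sketch; all paths and names below are placeholders)
[global]
repo1-path=/var/lib/pgbackrest
# Encrypt the backup repository with pgbackrest's built-in cipher support.
repo1-cipher-type=aes-256-cbc
# Placeholder passphrase; generate your own and store it securely.
repo1-cipher-pass=CHANGE-ME

# "main" is a hypothetical stanza name; pg1-path is a hypothetical data directory.
[main]
pg1-path=/var/lib/postgresql/15/main
```

Note that the passphrase here is managed separately from whatever key unlocks the LUKS volume, so there are already two independent secrets to protect.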

There are a few problems with trying to do this for a secured instance.  The first is that you have multiple links in the encryption chain, and the failure of any one of them will lead to cleartext exposure of data files.  This is not a problem for those who just want to tick a checkbox.  Also, because backups and main systems are encrypted separately (if the backups are encrypted at all), people have to choose between complicating the restore process and simply ditching encryption on the backup, which makes the checkbox somewhat pointless.

Where I have usually seen this come up is in the question of "how do you prevent the problem of someone pulling storage devices from your servers and taking them away to compromise your data?"  Physical security comes into it, but people often want more than that as an answer.  I saw questions like that from external auditors when I was at Adjust.

If you want to actually address that problem, then the current tooling is quite cumbersome.  Yes, you can do it, but it is very hard to make sure it has been fully secured and also very hard to monitor.  TDE would make the setup and verification of this much easier.  In particular, it avoids a number of other issues that I can see arising from LUKS and similar approaches, since it doesn't rely on the kernel to translate between plaintext and ciphertext.

I have actually worked with folks who have PII and need to protect it and who currently use LUKS and pgbackrest to do so.  I would be extremely happy to see TDE replace those for their needs.  I can imagine that those who hold high-value data would use it as well, instead of these other more error-prone and less secure setups.
 

> Consider that there are standards groups which explicitly consider these attack
> vectors and consider them important enough to require mitigations to address
> those vectors. Do the end users of PG understand the attack vectors or why they
> matter?  Perhaps not, but just because they can’t articulate the reasoning does
> NOT mean that the attack vector doesn’t exist or that their environment is
> somehow immune to it- indeed, as the standards bodies surely know, the opposite
> is true- they’re almost certainly at risk of those attack vectors and therefore
> the standards bodies are absolutely justified in requiring them to provide a
> solution. Treating these users as unimportant because they don’t have the depth
> of understanding that we do or that the standards body does is not helping
> them- it’s actively driving them away from PG. 

Well, then who is going to explain them here, because I have not heard
them yet.

>     The RLS arguments were that queries could expose some of the underlying
>     data, but in summary, that was considered acceptable.
>
> This is an excellent point- and dovetails very nicely into my argument that
> protecting primary data (what is provided by users and ends up in indexes and
> heaps) is valuable even if we don’t (yet..) have protection for other parts of
> the system. Reducing the size of the attack vector is absolutely useful,
> especially when it’s such a large amount of the data in the system. Yes, we
> should, and will, continue to improve- as we do with many features, but we
> don’t need to wait for perfection to include this feature, just as with RLS and
> numerous other features we have. 

The issue is that you needed a certain type of user with a certain type
of access to break RLS, while for this, writing to PGDATA is the simple
case for all the breakage, and the thing we are protecting with
authentication.

>     >     > We, as a community, are clearly losing value by lack of this
>     >     > capability, if by no other measure than simply the numerous users
>     >     > of the commercial implementations feeling that they simply can’t
>     >     > use PG without this feature, for whatever their reasoning.
>     >
>     >     That is true, but I go back to my concern over useful feature vs.
>     >     check box.
>     >
>     > While it’s easy to label something as checkbox, I don’t feel we have been
>     > fair
>
>     No, actually, it isn't.  I am not sure why you are saying that.
>
> I’m confused as to what is required to label a feature as a “checkbox” feature
> then. What did you use to make that determination of this feature?  I’m happy to
> be wrong here. 

I don't see the point in me continuing to reply here.  You just seem to
continue asking questions without actually thinking of what I am saying,
and hope I get tired or something.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.


--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.
