Re: Transparent Data Encryption (TDE) and encrypted files - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Transparent Data Encryption (TDE) and encrypted files
Msg-id 20191003172655.GD6962@tamriel.snowman.net
In response to Re: Transparent Data Encryption (TDE) and encrypted files  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Transparent Data Encryption (TDE) and encrypted files
List pgsql-hackers
Greetings,

* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
> On Thu, Oct 03, 2019 at 11:51:41AM -0400, Stephen Frost wrote:
> >* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
> >>On Thu, Oct 03, 2019 at 10:40:40AM -0400, Stephen Frost wrote:
> >>>People who are looking for 'encrypt all the things' should and will be
> >>>looking at filesystem-level encryption options.  That's not what this
> >>>feature is about.
> >>
> >>That's almost certainly not true, at least not universally.
> >>
> >>It may be true for some people, but a lot of the people asking for
> >>in-database encryption essentially want to do filesystem encryption but
> >>can't use it for various reasons. E.g. because they're running in
> >>environments that make filesystem encryption impossible to use (OS not
> >>supporting it directly, no access to the block device, lack of admin
> >>privileges, ...). Or maybe they worry about people with fs access.
> >
> >Anyone coming from other database systems isn't asking for that though
> >and it wouldn't be a comparable offering to other systems.
>
> I don't think that's quite accurate. In the previous message you claimed
> (1) this isn't what other database systems do and (2) people who want to
> encrypt everything should just use fs encryption, because that's not
> what TDE is about.
>
> Regarding (1), I'm pretty sure Oracle TDE does pretty much exactly this,
> at least in the mode with tablespace-level encryption. It's true there
> is also column-level mode, but from my experience it's far less used
> because it has a number of annoying limitations.

We're probably being too general and that's ending up with us talking
past each other.  Yes, Oracle provides tablespace and column level
encryption, but neither case results in *everything* being encrypted.

> So I'm somewhat puzzled by your claim that people coming from other
> systems are asking for the column-level mode. At least I'm assuming
> that's what they're asking for, because I don't see other options.

I've seen asks for tablespace, table, and column-level, but it's always
been about the actual data.  Something like clog is an entirely internal
structure that doesn't include the actual data.  Yes, it's possible it
could somehow be used for a side-channel attack, as could other things,
such as WAL, and as such I'm not sure that forcing a policy of "encrypt
everything" is actually a sensible approach and it definitely adds
complexity and makes it a lot more difficult to come up with a sensible
solution.

> >>If you look at the two threads discussing the FDE design, both of
> >>them pretty much started as "let's do FDE in the database".
> >
> >And that's how some folks continue to see it- let's just encrypt all the
> >things, until they actually look at it and start thinking about what
> >that means and how to implement it.
>
> This argument also works the other way, though. On Oracle, people often
> start with the column-level encryption because it seems naturally
> superior (hey, I can encrypt just the columns I want, ...) and then they
> start running into the various limitations and eventually just switch to
> the tablespace-level encryption.
>
> Now, maybe we'll be able to solve those limitations - but I think it's
> pretty unlikely, because those limitations seem quite inherent to how
> encryption affects indexes etc.

It would probably be useful to discuss the specific limitations that
you've seen cause people to move away from column-level encryption.

I definitely agree that figuring out how to make things work with
indexes is a non-trivial challenge, though I'm hopeful that we can come
up with something sensible.

> >Yeah, it'd be great to just encrypt everything, with a bunch of
> >different keys, all of which are stored somewhere else, and can be
> >updated and changed by the user when they need to do a rekeying, but
> >then you have to start asking about what keys need to be available when
> >for doing crash recovery, how do you handle a crash in the middle of a
> >rekeying, how do you handle updating keys from the user, etc..
> >
> >Sure, we could offer a dead simple "here, use this one key at database
> >start to just encrypt everything" and that would be enough for some set
> >of users (a very small set, imv, but that's subjective, obviously), but
> >I don't think we could dare promote that as having TDE because it
> >wouldn't be at all comparable to what other databases have, and it
> >wouldn't materially move us in the direction of having real TDE.
>
> I think that very much depends on the definition of "real TDE".  I
> don't know what exactly that means at this point. And as I said before,
> I think such a simple mode *is* comparable to (at least some) solutions
> available in other databases (as explained above).

When I was researching this, I couldn't find any example of a database
that wouldn't start without the one magic key that encrypts everything.
I'm happy to be told that I was wrong in my understanding of that, with
some examples.

> As for the users, I don't have any objective data about this, but I
> think the number of people wanting such a simple solution is non-trivial.
> That does not mean we can't extend it to support more advanced features.

The concern that I raised before and that I continue to worry about is
that providing such a simple capability will have a lot of limitations
too (such as having a single key and only being able to rekey during a
complete downtime, because we have to re-encrypt clog, etc, etc), and
I don't see it helping us get to more granular TDE because, for that,
where we really need to start is by building a vault of some kind to
store the keys in and then figuring out how we do things like crash
recovery in a sensible way and, ideally, without needing to have access
to all of (any of?) the keys.
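
As a rough illustration of the vault idea (a sketch only, not a design
proposed in this thread): if each object has its own data key, and the
vault stores those keys wrapped by a master key, then rekeying the master
key only means re-wrapping the small data keys rather than re-encrypting
the data. Every name below (KeyVault, toy_cipher, "table_a") is
hypothetical, and toy_cipher is an insecure stand-in for a real cipher
such as AES. The sketch says nothing about the crash recovery question.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream.

    Toy stand-in for a real cipher; NOT secure, illustration only.
    Applying it twice with the same key and nonce returns the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class KeyVault:
    """Hypothetical vault: per-object data keys, wrapped by one master key."""

    def __init__(self, master_key: bytes):
        self._master = master_key
        self._wrapped = {}  # object name -> (nonce, wrapped data key)

    def create_key(self, name: str) -> None:
        nonce = secrets.token_bytes(16)
        data_key = secrets.token_bytes(32)
        self._wrapped[name] = (nonce, toy_cipher(self._master, nonce, data_key))

    def data_key(self, name: str) -> bytes:
        nonce, wrapped = self._wrapped[name]
        return toy_cipher(self._master, nonce, wrapped)

    def rotate_master(self, new_master: bytes) -> None:
        # Unwrap each data key with the old master key and re-wrap it with
        # the new one; data encrypted under the data keys is never touched.
        for name in list(self._wrapped):
            key = self.data_key(name)
            nonce = secrets.token_bytes(16)
            self._wrapped[name] = (nonce, toy_cipher(new_master, nonce, key))
        self._master = new_master


vault = KeyVault(secrets.token_bytes(32))
vault.create_key("table_a")
row_nonce = secrets.token_bytes(16)
ciphertext = toy_cipher(vault.data_key("table_a"), row_nonce, b"secret row")

vault.rotate_master(secrets.token_bytes(32))
plaintext = toy_cipher(vault.data_key("table_a"), row_nonce, ciphertext)
```

After the rotation the same ciphertext still decrypts, because only the
wrapping of the data key changed.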

> >>>>I'm not sold on the comments that have been made about encrypting the
> >>>>server log. I agree that could leak data, but that seems like somebody
> >>>>else's problem: the log files aren't really under PostgreSQL's
> >>>>management in the same way as pg_clog is. If you want to secure your
> >>>>logs, send them to syslog and configure it to do whatever you need.
> >>>
> >>>I agree with this.
> >>
> >>I don't. I know it's not an easy problem to solve, but it may contain
> >>user data (which is what we manage). We may allow disabling that, at
> >>which point it becomes someone else's problem.
> >
> >We also send user data to clients, but I don't imagine we're suggesting
> >that we need to control what some downstream application does with that
> >data or how it gets stored.  There's definitely a lot of room for
> >improvement in our logging (in an ideal world, we'd have a way to
> >actually store the logs in the database, at which point it could be
> >encrypted or not that way...), but I'm not seeing the need for us to
> >have a way to encrypt the log files.  If we did encrypt them, we'd have
> >to make sure to do it in a way that users could still access them
> >without the database being up and running, which might be tricky if the
> >key is in the vault...
>
> That's a bit of a straw-man argument, really. The client is obviously
> meant to receive and handle sensitive data, that's its main purpose.
> For logging systems the situation is a bit different, it's a general
> purpose tool, with no idea what the data is.

The argument you're making is that the log isn't intended to have
sensitive data, but while that might be a nice place to get to, we
certainly aren't there today, which means that people should really be
sending the logs to a location that's trusted.

> I do understand it's pretty pointless to send encrypted messages to such
> external tools, but IMO it'd be good to implement that at least for our
> internal logging collector.

It's also less than user friendly to log to encrypted files that you
can't read without the database system being up, so we'd have to
figure out at least a solution to that problem, and then if you have
downstream systems where the logs are going to, you have to decrypt
them, or have a way to have them not be encrypted perhaps.

In general, wrt the logs, I feel like it's at least a reasonably small
and independent piece of this, though I wonder if it'll cause similar
problems when it comes to dealing with crash recovery (how do we log if
we don't have the key from the vault because we haven't done crash
recovery yet, for example...).

Thanks,

Stephen
