Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | 20190613150725.2xmdaywxjf3empwf@momjian.us |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
List | pgsql-hackers |
On Thu, Jun 13, 2019 at 04:26:47PM +0900, Masahiko Sawada wrote:
> On Thu, Jun 13, 2019 at 3:48 AM Bruce Momjian <bruce@momjian.us> wrote:
> > The big question is how many people will be mixing encrypted and
> > unencrypted data in the same cluster, and care about performance?  Just
> > because someone might care is not enough of a justification.  They can
> > certainly create separate encrypted and non-encrypted clusters.  Can we
> > implement level 6 and then implement levels 3-5 later if desired?
>
> I guess most users are interested in performance. Users don't want to
> sacrifice performance for security and vice versa. Fine grained
> control would allow us to seek a compromise point.

Well, what does that add to the argument?  Yes, everyone cares about
performance, but it is the magnitude of the performance impact vs. the
complexity that is the issue here.  Also, by definition, users will
trade performance for security because encrypting data will slow down
the database.  The open question is how much, and if that overhead is
reasonable based on the complexity.

What I don't want to do is to design a system that is more complex than
required, and it might become so complex we might never get it done.

> > How would you configure the WAL to know which key to use if we did #5?
> > Wouldn't system tables and statistics, and perhaps referential integry
> > allow for information leakage?
>
> We use a something like a map between tablespace oid and encryption
> key as a separate file (maybe stored in $PGDATA/global), called
> keyring. Using the keyring we can obtain encryption key by tablespace
> oid. For WAL, we add a flag to XLogRecord which indicates whether the
> WAL record is encrypted, and we already have relfilenode in the header
> data of WAL. So we can obtain the tablespace oid from the part and
> obtain the corresponding encryption key.

OK.

> > > 2. Encryption Objects.
> > > Indexes, WAL and TOAST table pertaining to encrypted tables, and
> > > temporary files must also be encrypted but we need to discuss whether
> > > we encrypt non-user data as well such as SLRU data, vm and fsm, and
> > > perhaps even other files such as 2PC state files, backend_label etc.
> > > Encryption everything is required by some use case but it's also true
> > > that there are users who wish to encrypt database while minimizing
> > > performance overheads.
> >
> > I don't think we need to encrypt the "status" files like SLRU data, vm
> > and fsm.
>
> I agree.

Good.

> > Good point about pg_waldump.  I am a little worried we might open a
> > security hole making a new API so they work, so maybe we should avoid
> > it.
>
> Yeah, in principle since data key of 2 tier key architecture should
> not go outside database I think we should not tell data keys to
> utility commands. So the rearranging WAL format seems to be a better
> solution but is there any reason why the main data is placed at end of
> WAL record? I wonder if we can assemble WAL records as following order
> and encrypt only 3 and 4.
>
> 1. Header data (XLogRecord and other headers)
> 2. Main data (xl_heap_insert, xl_heap_update etc + related data)
> 3. Block data (Tuple data, FPI)
> 4. Sub data (e.g tuple data for logical decoding)

Yes, that does sound like a reasonable idea.  It is similar to us not
encrypting the clog --- there is little value.  However, if we only
encrypt the cluster, we don't need to expose the relfilenode and we can
just encrypt the entire WAL --- I like that simplicity.  We might find
that the complexity of encrypting only certain tablespaces makes the
system slower than just encrypting the entire cluster.

> > > Also, for system catalog encryption, it could be a hard part. System
> > > catalogs are initially created at initdb time and created by copying
> > > from template1 when CREATE DATABASE. Therefore we would need to either
> > > modify initdb so that it's aware of encryption keys and KMS or modify
> > > database creation so that it copies database file while encrypting
> > > them.
> >
> > I assume initdb will use the same API that you would use to start the
> > server itself, e.g., type in a password, or contact a key server.
>
> I realized that in XTS encryption mode since we craft the tweak using
> relfilenode we will need to have the different tweaks for system
> catalogs in new database would change. So we might need to re-encrypt
> system catalogs when CREATE DATABASE after all. I suspect that even
> the cluster-wide encryption has the same problem.

Yes, this is why I want to just do cluster-wide encryption at this
stage.

In addition, while the 8k blocks would use a block cipher, the WAL would
likely use a stream cipher, and it will be very hard to use multiple
stream ciphers in a single WAL file.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
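
For readers following the thread, the per-tablespace keyring lookup proposed
in the quoted text above might look roughly like the sketch below.  It is only
an illustration of the idea (a flag on the WAL record header plus a map from
tablespace oid to key), not code from any patch.  The names XLR_ENCRYPTED,
WalRecordHeader, Keyring and key_for_record are all hypothetical, and the
header is reduced to the two fields the lookup needs.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical flag bit marking a WAL record as encrypted.
# This is an assumption for illustration, not a real XLogRecord flag.
XLR_ENCRYPTED = 0x01


@dataclass(frozen=True)
class WalRecordHeader:
    """Simplified stand-in for the unencrypted part of a WAL record."""
    flags: int
    tablespace_oid: int  # tablespace part of the relfilenode in the header


class Keyring:
    """In-memory map from tablespace oid to data encryption key."""

    def __init__(self) -> None:
        self._keys: Dict[int, bytes] = {}

    def add(self, tablespace_oid: int, key: bytes) -> None:
        self._keys[tablespace_oid] = key

    def lookup(self, tablespace_oid: int) -> bytes:
        # Raises KeyError if the tablespace has no key registered.
        return self._keys[tablespace_oid]


def key_for_record(keyring: Keyring, header: WalRecordHeader) -> Optional[bytes]:
    """Return the key for an encrypted record, or None if it is plaintext."""
    if not header.flags & XLR_ENCRYPTED:
        return None
    return keyring.lookup(header.tablespace_oid)
```

Note that the lookup only works because the header, including the relfilenode,
stays readable without a key.  That is the exposure the reply above says
cluster-wide encryption would avoid, since the entire WAL could then be
encrypted with a single key.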