Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | CAD21AoBjrbxvaMpTApX1cEsO=8N=nc2xVZPB0d9e-VjJ=YaRnw@mail.gmail.com |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS) |
List | pgsql-hackers |
On Mon, Jun 17, 2019 at 11:02 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On Mon, Jun 17, 2019 at 08:39:27AM -0400, Joe Conway wrote:
> >On 6/17/19 8:29 AM, Masahiko Sawada wrote:
> >> From perspective of cryptographic, I think the fine grained TDE would
> >> be better solution. Therefore if we eventually want the fine grained
> >> TDE I wonder if it might be better to develop the table/tablespace TDE
> >> first while keeping it simple as much as possible in v1, and then we
> >> can provide the functionality to encrypt other data in database
> >> cluster to satisfy the encrypting-everything requirement. I guess that
> >> it's easier to incrementally add encryption target objects rather than
> >> making it fine grained while not changing encryption target objects.
> >>
> >> FWIW I'm writing a draft patch of per tablespace TDE and will submit
> >> it in this month. We can more discuss the complexity of the proposed
> >> TDE using it.
> >
> >+1
> >
> >Looking forward to it.
> >
>
> Yep. In particular, I'm interested in those aspects:
>

Attached is the draft version of the patch set for per-tablespace transparent data-at-rest encryption. The patch set does not yet provide full functionality. It includes:

* Per-tablespace encryption
* Encryption and decryption of buffer data during disk I/O
* 2-tier key hierarchy and key rotation
* Temporary file encryption (based on the patch Antonin proposed)
* System catalog encryption
* Generic key management API and test module
* Simple TAP test

but doesn't include for now (I'm writing these):

* WAL encryption
* Replication support
* pg_upgrade support
* Documentation
* README

and doesn't support:

* SLRU data encryption
* Encryption of other system files (pg_twophase, pg_subtrans, backup_label etc.)
* Server log encryption

Before explaining the details of the patches, let me share my thoughts on the following points.

> (1) What's the proposed minimum viable product, and how do we expect to
> extend it with the more elaborate features.
> I don't expect perfect
> specification, but we should have some idea so that we don't paint
> ourselves in the corner.

I think the minimum viable product should support the following features.

* Fine-grained control over what is encrypted (not a single key for the whole database cluster).
* Encryption and decryption of tables (including system catalogs), indexes, TOAST tables, WAL and temporary files during disk I/O.
* Passing a password, passphrase or encryption key to the postgres server without the risk of it being written to files.
* Front-end programs shipped in the PostgreSQL source tree keep working as far as possible.
* Key rotation

I think the following features would be added later.

* SLRU and other data encryption. I think we can use another encryption key for this data.
* Support for other encryption algorithms. I don't have a concrete idea so far, but it should not be hard to support other symmetric-key algorithms.
* Faster key rotation. It can be done by having a 2-tier key hierarchy.
* Integration with external key management services. The patch implements this, but I'm sure there are other ways to integrate with external key management services.

>
> (2) How does it affect recovery, backups and replication (both physical
> and logical)? That is, which other parts need to know the encryption keys
> to function properly?

If we encrypt the whole 8kB WAL block (in the cluster-wide encryption case) it would not be hard, because we just encrypt the block with a single key before writing it to disk. On the other hand, if we encrypt only some WAL records it could be hard; it requires changes around the WAL assembly code so that it can obtain encryption keys and encrypt WAL data before inserting it into the WAL buffer. Since WAL is encrypted, recovery needs to obtain all encryption keys and decrypt the encrypted WAL.

For streaming replication, since WAL senders basically don't need to know the actual contents of WAL (although xlogreader needs to read the WAL header for validation), they send WAL data in encrypted state.
The WAL receiver then decrypts it. Therefore the encryption keys must also be replicated. On the other hand, logical replication (and logical decoding) needs to decrypt WAL data when decoding. Since logical decoding is performed on the PostgreSQL server side, it's not hard to obtain all encryption keys. It can send change sets either in unencrypted state or, if we encrypt them again, in encrypted state. We would change the xlogreader code so that it can decrypt WAL. So I think that logical replication will be able to get WAL data in unencrypted state without any special operation.

For backups, a physical backup must be encrypted even if we take it with pg_basebackup, otherwise we cannot protect data from a malicious backup operator. The encryption keys must also be backed up together with it. Because this is data-at-rest encryption, logical backups can be taken in unencrypted state. I think we would need nothing special for backups.

>
> (3) What does it mean for external tools (pg_waldump, pg_upgrade,
> pg_rewind etc.)?

I think that this definitely affects at least pg_waldump, pg_upgrade, pg_checksums and pg_rewind. By changing the WAL format or giving encryption keys to these programs we can support pg_waldump and pg_rewind even for an encrypted database. I prefer the former because passing encryption keys to front-end programs carries a risk of key leakage.

It would also affect external tools that read or write database files and WAL directly. For instance pg_rman, which is a recovery management tool, reads database files and takes a backup without the hole in each page. Such programs would need encryption keys.

Here are the details of the patches.

Usage
======

To enable the TDE feature, please specify the --with-openssl configure option. Also, please set the kmgr_plugin_library GUC parameter in postgresql.conf, which specifies the library for the key management program. The patch includes contrib/kmgr_file, which is a test program for key management and stores the master key on the local disk.
So for test purposes you can set kmgr_plugin_library = 'kmgr_file'.

After starting up the postgres server, you can create an encrypted tablespace by specifying the 'encryption' option, like:

    CREATE TABLESPACE enctblspc LOCATION '/path/to/tblsp' WITH (encryption = on);

The tables, indexes and TOAST tables created on that tablespace will then be encrypted at rest. System catalogs on pg_default and global are not encrypted. If you want to encrypt system catalogs, you need to create a database on an encrypted tablespace. While copying the database files from the source database, we either encrypt or re-encrypt each system catalog.

You can enable and disable encryption of a table by moving it between an encrypted tablespace and a non-encrypted tablespace.

Changes
=======

* 0001-Add-encryption-module-supporting-AES-256-by-using-op.patch

This patch is mostly based on the patch Antonin proposed[1], but I modified some of its contents. This patch adds encryption and decryption functions using OpenSSL. It currently supports AES-256-XTS for buffer data encryption and AES-256-CTR for WAL encryption.

* 0002-Add-kmgr-plugin-APIs.patch

This patch adds new generic key management APIs: startup, get, generate, isexist and remove. Kmgr plugin programs can define these primitive functions to manage the master key, which could be located on an external server. The plugin program is specified by the 'kmgr_plugin_library' GUC parameter and loaded when the postmaster starts up.

* 0003-Add-key-management-module-for-transparent-data-encry.patch

This patch adds the key management module, which is responsible for tablespace key management. All tablespace keys are persisted to a file on disk, called the keyring file, and loaded into a hash table in shared memory when the postmaster starts up. The tablespace keys in shared memory are held in unencrypted state. Whenever an encrypted tablespace is created or dropped, the keyring file is modified.

The master key identifier is used as the lookup key for the master key.
It consists of the system identifier and a sequence number starting from 0, like 'pg_master_key-6707524-0000'. The sequence number is incremented at every key rotation. On key rotation, we generate a new master key id in PostgreSQL core and ask the kmgr plugin to generate a new master key identified by the new master key id. We then update all tablespace keys in the keyring file by re-encrypting them with the new master key.

* 0004-Add-facility-to-give-process-local-encryption-key.patch

This patch adds functionality to get a process-local temporary key, which is intended for temporary file encryption.

* 0005-Encrypt-and-decrypt-data-on-encrypted-tablespace-whe.patch

This patch supports buffer encryption; it encrypts and decrypts database data during disk I/O. It adds new smgr callbacks smgrencrypt and smgrdecrypt, and mdencrypt and mddecrypt. Please note that the patch currently supports only heap and nbtree; I'm working on supporting other access methods. Basically, when bufmgr reads or writes a buffer through the shared buffers, the access methods don't need to care about buffer encryption. However, when an access method itself writes a buffer directly to disk, it needs to call smgrencrypt.

* 0006-Encrypt-buffile.patch

This is the patch proposed by Antonin. I have not looked at it in detail yet, so I will review it.

* 0007-Make-Reorderbuffer-encrypt-spilled-out-file.patch

Same as above.

* 0008-Support-tablespace-encryption.patch

This patch adds the 'encryption' option to tablespaces.

* 0009-Add-kmgr-plugin-test-module-kmgr_file.patch

This patch adds a test module for the kmgr plugin. It generates a random master key string and stores it on the local disk. Since it stores the master key without encryption, it is for test purposes only. It also has a TAP test for TDE.

Feedback and comments are very welcome.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
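For readers who want to see the master key identifier format and the rotation step from 0003 in concrete form, here is a rough Python sketch. It is not taken from the patches: the function names, the dicts standing in for the keyring file and for the kmgr plugin's key store, and the SHA-256-based XOR "wrap" are all assumptions made for illustration, and the toy wrap is not a secure cipher.

```python
import hashlib
import os


def master_key_id(system_identifier: int, seqno: int) -> str:
    """Build an id shaped like the example 'pg_master_key-6707524-0000'."""
    return f"pg_master_key-{system_identifier}-{seqno:04d}"


def next_master_key_id(key_id: str) -> str:
    """Increment the trailing sequence number, as happens on key rotation."""
    prefix, seqno = key_id.rsplit("-", 1)
    return f"{prefix}-{int(seqno) + 1:04d}"


def toy_wrap(master_key: bytes, data: bytes) -> bytes:
    """Hypothetical stand-in for real key encryption (NOT secure).

    XORs the data with a SHA-256-derived keystream, so applying it twice
    with the same master key returns the original bytes.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(master_key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def rotate(keyring: dict, master_keys: dict, current_id: str) -> str:
    """Re-encrypt every tablespace key under a newly generated master key.

    keyring:     tablespace name -> wrapped tablespace key
                 (hypothetical stand-in for the keyring file)
    master_keys: master key id -> master key bytes
                 (hypothetical stand-in for what a kmgr plugin stores)
    """
    new_id = next_master_key_id(current_id)
    master_keys[new_id] = os.urandom(32)  # plugin "generates" the new key
    old_key, new_key = master_keys[current_id], master_keys[new_id]
    for name, wrapped in keyring.items():
        plain = toy_wrap(old_key, wrapped)        # unwrap with old master key
        keyring[name] = toy_wrap(new_key, plain)  # wrap with new master key
    return new_id
```

In this sketch only the wrapped tablespace keys change during rotation; the tablespace keys themselves, and therefore the encrypted data, stay as they are, which is the reason a 2-tier key hierarchy makes key rotation faster.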
Attachment
- 0001-Add-encryption-module-supporting-AES-256-by-using-op.patch
- 0002-Add-kmgr-plugin-APIs.patch
- 0005-Encrypt-and-decrypt-data-on-encrypted-tablespace-whe.patch
- 0004-Add-facility-to-give-process-local-encryption-key.patch
- 0003-Add-key-management-module-for-transparent-data-encry.patch
- 0006-Encrypt-buffile.patch
- 0007-Make-Reorderbuffer-encrypt-spilled-out-file.patch
- 0008-Support-tablespace-encryption.patch
- 0009-Add-kmgr-plugin-test-module-kmgr_file.patch