Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Alvaro Hernandez |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | 6ae9005d-a967-f7e1-b14a-af9d7c555658@ongres.com Whole thread Raw |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) (Masahiko Sawada <sawada.mshk@gmail.com>) |
List | pgsql-hackers |
On 11/06/18 12:22, Masahiko Sawada wrote: > On Fri, May 25, 2018 at 8:41 PM, Moon, Insung > <Moon_Insung_i3@lab.ntt.co.jp> wrote: >> Hello Hackers, >> >> This propose a way to develop "Table-level" Transparent Data Encryption (TDE) and Key Management Service (KMS) supportin >> PostgreSQL. >> >> >> Issues on data encryption of PostgreSQL >> ========== >> Currently, in PostgreSQL, data encryption can be using pgcrypto Tool. >> However, it is inconvenient to use pgcrypto to encrypts data in some cases. >> >> There are two significant inconveniences. >> >> First, if we use pgcrypto to encrypt/decrypt data, we must call pgcrypto functions everywhere we encrypt/decrypt. >> Second, we must modify application program code much if we want to do database migration to PostgreSQL from other databasesthat is >> using TDE. >> >> To resolved these inconveniences, many users want to support TDE. >> There have also been a few proposals, comments, and questions to support TDE in the PostgreSQL community. >> >> However, currently PostgreSQL does not support TDE, so in development community, there are discussions whether it's necessaryto >> support TDE or not. >> >> In these discussions, there were requirements necessary to support TDE in PostgreSQL. >> >> 1) The performance overhead of encryption and decryption database data must be minimized >> 2) Need to support WAL encryption. >> 3) Need to support Key Management Service. >> >> Therefore, I'd like to propose the new design of TDE that deals with both above requirements. >> Since this feature will become very large, I'd like to hear opinions from community before starting making the patch. >> >> First, my proposal is table-level TDE which is that user can specify tables begin encrypted. >> Indexes, TOAST table and WAL associated with the table that enables TDE are also encrypted. >> >> Moreover, I want to support encryption for large object as well. >> But I haven't found a good way for it so far. So I'd like to remain it as future TODO. >> >> My proposal has five characteristics features of "table-level TDE". >> >> 1) Buffer-level data encryption and decryption >> 2) Per-table encryption >> 3) 2-tier encryption key management >> 4) Working with external key management services(KMS) >> 5) WAL encryption >> >> Here are more details for each items. >> >> >> 1. Buffer-level data encryption and decryption >> ================== >> Transparent data encryption and decryption accompany by storage operation >> With ordinally way like using pgcrypto, the biggest problem with encrypted data is the performance overhead of decryptingthe data >> each time the run to queries. >> >> My proposal is to encrypt and decrypt data when performing DISK I/O operation to minimize performance overhead. >> Therefore, the data in the shared memory layer is unencrypted so that performance overhead can minimize. >> >> With this design, data encryption/decryption implementations can be developed by modifying the codes of the storage andbuffer >> manager modules, >> which are responsible for performing DISK I/O operation. >> >> >> 2. Per-table encryption >> ================== >> User can enable TDE per table as they want. >> I introduce new storage parameter "encryption_enabled" which enables TDE at table-level. >> >> // Generate the encryption table >> CREATE TABLE foo WITH ( ENCRYPTION_ENABLED = ON ); >> >> // Change to the non-encryption table >> ALTER TABLE foo SET ( ENCRYPTION_ENABLED = OFF ); >> >> This approach minimizes the overhead for tables that do not require encryption options. >> For tables that enable TDE, the corresponding table key will be generated with random values, and it's stored into thenew system >> catalog after being encrypted by the master key. >> >> BTW, I want to support CBC mode encryption[3]. However, I'm not sure how to use the IV in CBC mode for this proposal. >> I'd like to hear opinions by security engineer. >> >> >> 3. 2-tier encryption key management >> ================== >> when it comes time to change cryptographic keys, there is a performance overhead to decryption and re-encryption to alldata. >> >> To solve this problem we employee 2-tier encryption. >> 2-tier encryption is All table keys can be stored in the database cluster after being encrypted by the master key, Andmaster keys >> must be stored at external of PostgreSQL. >> >> Therefore, without master key, it is impossible to decrypt the table key. Thus, It is impossible to decrypt the databasedata. >> >> When changing the key, it's not necessary to re-encrypt for all data. >> We use the new master key only to decrypt and re-encrypt the table key, these operations for minimizing the performanceoverhead. >> >> For table keys, all TDE-enabled tables have different table keys. >> And for master key, all database have different master keys. Table keys are encrypted by the master key of its own database. >> For WAL encryption, we have another cryptographic key. WAL-key is also encrypted by a master key, but it is shared acrossthe >> database cluster. >> >> >> 4. Working with external key management services(KMS) >> ================== >> A key management service is an integrated approach for generating, fetching and managing encryption keys for key control. >> They may cover all aspects of security from the secure generation of keys, secure storing keys, and secure fetching keysup to >> encryption key handling. >> Also, various types of KMSs are provided by many companies, and users can choose them. >> >> Therefore I would like to manage the master key using KMS. >> Also, my proposal is to create callback APIs(generate_key, fetch_key, store_key) in the form of a plug-in so that userscan use many >> types of KMS as they want. >> >> In KMIP protocol and most KMS manage keys by string IDs. We can get keys by key ID from KMS. >> So in my proposal, all master keys are distinguished by its ID, called "master key ID". >> The master key ID is made, for example, using the database oid and a sequence number, like <OID>_<SeqNo>. And they aremanaged in >> PostgreSQL. >> >> When database startup, all master key ID is loaded to shared memory, and they are protected by LWLock. >> >> When it comes time to rotate the master keys, run this query. >> >> ALTER SYSTEM ROTATION MASTER KEY; >> >> In this query, the master key is rotated with the following step. >> 1. Generate new master key, >> 2. Change master key IDs and emit corresponding WAL >> 3. Re-encrypt all table keys on its database >> >> Also during checkpoint, master key IDs on shared memory become a permanent condition. >> >> >> 5. WAL encryption >> ================== >> If we encrypt all WAL records, performance overhead can be significant. >> Therefore, this proposes a method to encrypt only WAL record excluding WAL header when writing WAL on the WAL buffer,instead of >> encrypting a whole WAL record. >> WAL encryption key is generated separately when the TDE-enabled table is created the first time. We use 2-tier encryptionfor WAL >> encryption as well. >> So, when it comes time to rotate the WAL encryption key, run this query. >> >> ALTER SYSTEM ROTATION WAL KEY; >> >> Next, I will explain how to encrypt WAL. >> >> To do this operation, I add a flag to WAL header which indicates whether the subsequent WAL data is encrypted or not. >> >> Then, when we write WAL for encryption table we write "encrypted" WAL on WAL buffer layer. >> >> In recovery, we read WAL header and check the flag of encryption, and judges whether WAL must be decrypted. >> In the case of PITR, we use WAL key ID in the backup file. >> >> With this approach, the performance overhead of writing and reading the WAL for unencrypted tables would be almost thesame as >> before. >> >> I may have missed part of the conversation and/or this may be a naïve question, but what about pg_stats? I guess data should be encrypted there too, and I wonder how this would affect the query planner and how it could decrypt this information. Also would a separate key be used for the stats? Thanks, Álvaro -- Alvaro Hernandez ----------- OnGres
pgsql-hackers by date: