Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers

From Alvaro Hernandez
Subject Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Date
Msg-id 6ae9005d-a967-f7e1-b14a-af9d7c555658@ongres.com
Whole thread Raw
In response to Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers

On 11/06/18 12:22, Masahiko Sawada wrote:
> On Fri, May 25, 2018 at 8:41 PM, Moon, Insung
> <Moon_Insung_i3@lab.ntt.co.jp> wrote:
>> Hello Hackers,
>>
>> This propose a way to develop "Table-level" Transparent Data Encryption (TDE) and Key Management Service (KMS)
supportin
 
>> PostgreSQL.
>>
>>
>> Issues on data encryption of PostgreSQL
>> ==========
>> Currently, in PostgreSQL, data encryption can be using pgcrypto Tool.
>> However, it is inconvenient to use pgcrypto to encrypts data in some cases.
>>
>> There are two significant inconveniences.
>>
>> First, if we use pgcrypto to encrypt/decrypt data, we must call pgcrypto functions everywhere we encrypt/decrypt.
>> Second, we must modify application program code much if we want to do database migration to PostgreSQL from other
databasesthat is
 
>> using TDE.
>>
>> To resolved these inconveniences, many users want to support TDE.
>> There have also been a few proposals, comments, and questions to support TDE in the PostgreSQL community.
>>
>> However, currently PostgreSQL does not support TDE, so in development community, there are discussions whether it's
necessaryto
 
>> support TDE or not.
>>
>> In these discussions, there were requirements necessary to support TDE in PostgreSQL.
>>
>> 1) The performance overhead of encryption and decryption database data must be minimized
>> 2) Need to support WAL encryption.
>> 3) Need to support Key Management Service.
>>
>> Therefore, I'd like to propose the new design of TDE that deals with both above requirements.
>> Since this feature will become very large, I'd like to hear opinions from community before starting making the
patch.
>>
>> First, my proposal is table-level TDE which is that user can specify tables begin encrypted.
>> Indexes, TOAST table and WAL associated with the table that enables TDE are also encrypted.
>>
>> Moreover, I want to support encryption for large object as well.
>> But I haven't found a good way for it so far. So I'd like to remain it as future TODO.
>>
>> My proposal has five characteristics features of "table-level TDE".
>>
>> 1) Buffer-level data encryption and decryption
>> 2) Per-table encryption
>> 3) 2-tier encryption key management
>> 4) Working with external key management services(KMS)
>> 5) WAL encryption
>>
>> Here are more details for each items.
>>
>>
>> 1. Buffer-level data encryption and decryption
>> ==================
>> Transparent data encryption and decryption accompany by storage operation
>> With ordinally way like using pgcrypto, the biggest problem with encrypted data is the performance overhead of
decryptingthe data
 
>> each time the run to queries.
>>
>> My proposal is to encrypt and decrypt data when performing DISK I/O operation to minimize performance overhead.
>> Therefore, the data in the shared memory layer is unencrypted so that performance overhead can minimize.
>>
>> With this design, data encryption/decryption implementations can be developed by modifying the codes of the storage
andbuffer
 
>> manager modules,
>> which are responsible for performing DISK I/O operation.
>>
>>
>> 2. Per-table encryption
>> ==================
>> User can enable TDE per table as they want.
>> I introduce new storage parameter "encryption_enabled" which enables TDE at table-level.
>>
>>      // Generate  the encryption table
>>         CREATE TABLE foo WITH ( ENCRYPTION_ENABLED = ON );
>>
>>      // Change to the non-encryption table
>>         ALTER TABLE foo SET ( ENCRYPTION_ENABLED = OFF );
>>
>> This approach minimizes the overhead for tables that do not require encryption options.
>> For tables that enable TDE, the corresponding table key will be generated with random values, and it's stored into
thenew system
 
>> catalog after being encrypted by the master key.
>>
>> BTW, I want to support CBC mode encryption[3]. However, I'm not sure how to use the IV in CBC mode for this
proposal.
>> I'd like to hear opinions by security engineer.
>>
>>
>> 3. 2-tier encryption key management
>> ==================
>> when it comes time to change cryptographic keys, there is a performance overhead to decryption and re-encryption to
alldata.
 
>>
>> To solve this problem we employee 2-tier encryption.
>> 2-tier encryption is All table keys can be stored in the database cluster after being encrypted by the master key,
Andmaster keys
 
>> must be stored at external of PostgreSQL.
>>
>> Therefore, without master key, it is impossible to decrypt the table key. Thus, It is impossible to decrypt the
databasedata.
 
>>
>> When changing the key, it's not necessary to re-encrypt for all data.
>> We use the new master key only to decrypt and re-encrypt the table key, these operations for minimizing the
performanceoverhead.
 
>>
>> For table keys, all TDE-enabled tables have different table keys.
>> And for master key, all database have different master keys. Table keys are encrypted by the master key of its own
database.
>> For WAL encryption, we have another cryptographic key. WAL-key is also encrypted by a master key, but it is shared
acrossthe
 
>> database cluster.
>>
>>
>> 4. Working with external key management services(KMS)
>> ==================
>> A key management service is an integrated approach for generating, fetching and managing encryption keys for key
control.
>> They may cover all aspects of security from the secure generation of keys, secure storing keys, and secure fetching
keysup to
 
>> encryption key handling.
>> Also, various types of KMSs are provided by many companies, and users can choose them.
>>
>> Therefore I would like to manage the master key using KMS.
>> Also, my proposal is to create callback APIs(generate_key, fetch_key, store_key) in the form of a plug-in so that
userscan use many
 
>> types of KMS as they want.
>>
>> In KMIP protocol and most KMS manage keys by string IDs. We can get keys by key ID from KMS.
>> So in my proposal, all master keys are distinguished by its ID, called "master key ID".
>> The master key ID is made, for example, using the database oid and a sequence number, like <OID>_<SeqNo>. And they
aremanaged in
 
>> PostgreSQL.
>>
>> When database startup, all master key ID is loaded to shared memory, and they are protected by LWLock.
>>
>> When it comes time to rotate the master keys, run this query.
>>
>>          ALTER SYSTEM ROTATION MASTER KEY;
>>
>> In this query, the master key is rotated with the following step.
>> 1. Generate new master key,
>> 2. Change master key IDs and emit corresponding WAL
>> 3. Re-encrypt all table keys on its database
>>
>> Also during checkpoint, master key IDs on shared memory become a permanent condition.
>>
>>
>> 5. WAL encryption
>> ==================
>> If we encrypt all WAL records, performance overhead can be significant.
>> Therefore, this proposes a method to encrypt only WAL record excluding WAL header when writing WAL on the WAL
buffer,instead of
 
>> encrypting a whole WAL record.
>> WAL encryption key is generated separately when the TDE-enabled table is created the first time. We use 2-tier
encryptionfor WAL
 
>> encryption as well.
>> So, when it comes time to rotate the WAL encryption key, run this query.
>>
>>          ALTER SYSTEM ROTATION WAL KEY;
>>
>> Next, I will explain how to encrypt WAL.
>>
>> To do this operation, I add a flag to WAL header which indicates whether the subsequent WAL data is encrypted or
not.
>>
>> Then, when we write WAL for encryption table we write "encrypted" WAL on WAL buffer layer.
>>
>> In recovery, we read WAL header and check the flag of encryption, and judges whether WAL must be decrypted.
>> In the case of PITR, we use WAL key ID in the backup file.
>>
>> With this approach, the performance overhead of writing and reading the WAL for unencrypted tables would be almost
thesame as
 
>> before.
>>
>>

     I may have missed part of the conversation and/or this may be a 
naïve question, but what about pg_stats? I guess data should be 
encrypted there too, and I wonder how this would affect the query 
planner and how it could decrypt this information. Also would a separate
key be used for the stats?


     Thanks,

     Álvaro


-- 

Alvaro Hernandez


-----------
OnGres



pgsql-hackers by date:

Previous
From: Alvaro Hernandez
Date:
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Next
From: Stephen Frost
Date:
Subject: Re: Tips on committing