Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | CAD21AoAqtytk0iH6diCJW24oyJdS4roN-VhrFD53HcNP0s8pzA@mail.gmail.com |
In response to | RE: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS) ("Moon, Insung" <Moon_Insung_i3@lab.ntt.co.jp>) |
Responses | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
List | pgsql-hackers |
On Thu, Feb 7, 2019 at 9:27 AM Moon, Insung <Moon_Insung_i3@lab.ntt.co.jp> wrote:
>
> Dear Ibrar Ahmed.
>
> From: Ibrar Ahmed [mailto:ibrar.ahmad@gmail.com]
> Sent: Thursday, February 07, 2019 4:09 AM
> To: Moon, Insung
> Cc: Tom Lane; PostgreSQL-development
> Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
>
> On Tue, Jul 3, 2018 at 5:37 PM Moon, Insung <Moon_Insung_i3@lab.ntt.co.jp> wrote:
> Dear Tom Lane.
>
> > -----Original Message-----
> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> > Sent: Monday, June 18, 2018 11:52 PM
> > To: Robert Haas
> > Cc: Joe Conway; Masahiko Sawada; Moon, Insung; PostgreSQL-development
> > Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
> >
> > Robert Haas <robertmhaas@gmail.com> writes:
> > > On Mon, Jun 18, 2018 at 10:12 AM, Joe Conway <mail@joeconway.com> wrote:
> > >> Not necessarily. Our pages probably have enough predictable bytes to
> > >> aid cryptanalysis, compared to user data in a column which might not
> > >> be very predicable.
> >
> > > Really? I would guess that the amount of entropy in a page is WAY
> > > higher than in an individual column value.
> >
> > Depending on the specifics of the encryption scheme, having some amount of known (or guessable) plaintext may allow breaking
> > the cipher, even if much of the plaintext is not known. This is cryptology 101, really.
> >
> > At the same time, having to have a bunch of independently-decipherable short field values is not real secure either, especially
> > if they're known to all be encrypted with the same key. But what you know or can guess about the plaintext in such cases
> > would be target-specific, rather than an attack that could be built once and used against any PG database.
>
> Yes. If there is known to guessable data of encrypted data, maybe there is a possibility of decrypting the encrypted data.
>
> But would it be safe to use an additional encryption mode such as GCM or XFS to solve this problem?
> (Do not use the same IV)
>
> Thank you and Best regards.
> Moon.
>
> >
> > regards, tom lane
>
> Hi Moon,
>
> Have you done progress on that patch? I am thinking to work on the project and found that you are already working on it. The last message is almost six months old. I want to check with you that are you still working on that, if yes I can help on that by reviewing the patch etc. If you are not working on that anymore, can you share your done work (if possible)?
> --
> Ibrar Ahmed
>
> We are currently developing for TDE and integration KMS.
> So, We will Also be prepared to start a new discussion with the PoC patch as soon as possible.
>
> At currently, we have changed the development direction of a per-Tablespace unit by per-table
> Also, currently researching how to associate with KMIP protocol related to the encryption key for integration with KMS.
> We talked about this in the Unconference session of PGConf.ASIA,
> And a week ago, we talked about the development direction of TDE and integration with KMS at FOSDEM PGDAY[1].
>
> We will soon provide PoC with new discussions.
>
> Regards.
>
> [1] TRANSPARENT DATA ENCRYPTION IN POSTGRESQL AND INTEGRATION WITH KEY MANAGEMENT SERVICES
> https://www.postgresql.eu/events/fosdem2019/schedule/session/2307-transparent-data-encryption-in-postgresql-and-integration-with-key-management-services/
>

Let me share the details of our progress and the current state.

As our presentation slides describe, I've written PoC code for transparent encryption that uses a 2-tier key architecture and supports key rotation. We've discussed the design of transparent database encryption on -hackers so far, and we found a good design and implementation. I will share them along with our research results.
But I think the design of the integration of PostgreSQL with key management services (KMS) is more open to debate. For integration with KMS, I'm going to propose adding generic key management APIs to PostgreSQL core so that it can communicate with KMSs supporting different interfaces and protocols and can get the master key (of the 2-tier key architecture) from them. Users can choose a key management plugin according to their environment.

The integration of PostgreSQL with KMS should be a separate patch from the TDE patch, and we think that TDE can be done first. But at least it's essential to provide a way to get the master key from an external location. Therefore, as the first step, we can propose the basic components of TDE with a simple interface to get the master key from KMS rather than supporting full key management APIs.

The basic components of TDE that we're going to propose are:

* Transparent encryption at a layer between shared buffer and OS page cache
* Per-tablespace encryption
* 2-tier key architecture
* Key rotation
* System catalogs and temporary files encryption

WAL encryption will follow as an additional feature.

The simple interface to get the master key is a GUC parameter that stores a shell command, say get_encryption_key_command. As its name suggests, the command is used only to get the master key, never for key registration or removal.

The slides explain the TDE feature in detail but say little about KMS. So let me share a rough idea of using TDE in combination with KMS.

2-Tier Key Architecture and Key Generation
==========================================

In our design, we use a 2-tier key architecture with two types of keys: one master key and multiple data encryption keys. As the slides explain in detail, the benefit of this architecture is fast key rotation. During key rotation, the only data that has to be re-encrypted is the data encryption keys.

The key generation number is an integer, starting from 1, that identifies the master key.
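As a rough illustration of why rotation is cheap in the 2-tier key architecture described above, here is a minimal Python sketch. It is not the PoC code: the `Cluster` class, its method names, and the `toy_cipher` function are hypothetical, and the toy cipher is only a stand-in for a real cipher such as AES and must not be used for actual encryption.

```python
import hashlib
import os


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher (SHA-256 in counter mode).

    NOT secure; it only stands in for a real cipher so the sketch needs
    no third-party library. Applying it twice with the same key returns
    the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class Cluster:
    """Hypothetical model: one master key wraps one data key per tablespace."""

    def __init__(self, master_key: bytes):
        self.generation = 1      # key generation number starts from 1
        self.master_key = master_key
        self.wrapped_deks = {}   # tablespace name -> data key encrypted with master key

    def add_tablespace(self, name: str) -> None:
        data_key = os.urandom(32)
        self.wrapped_deks[name] = toy_cipher(self.master_key, data_key)

    def _data_key(self, name: str) -> bytes:
        return toy_cipher(self.master_key, self.wrapped_deks[name])

    def encrypt_page(self, name: str, page: bytes) -> bytes:
        return toy_cipher(self._data_key(name), page)

    def decrypt_page(self, name: str, page: bytes) -> bytes:
        return toy_cipher(self._data_key(name), page)

    def rotate(self, new_master_key: bytes) -> None:
        # Only the small data encryption keys are re-encrypted;
        # the pages encrypted with them are left untouched.
        self.wrapped_deks = {
            name: toy_cipher(new_master_key, self._data_key(name))
            for name in self.wrapped_deks
        }
        self.master_key = new_master_key
        self.generation += 1


cluster = Cluster(b"master-key-generation-1")
cluster.add_tablespace("ts_secure")
page = cluster.encrypt_page("ts_secure", b"user data")
cluster.rotate(b"master-key-generation-2")
# The page was never rewritten, yet it still decrypts after rotation.
plain = cluster.decrypt_page("ts_secure", page)
```

The point of the sketch is that `rotate()` touches one short value per tablespace, no matter how much user data those tablespaces hold.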
The key generation number is initialized at initdb time and incremented whenever the master key is changed (i.e., on key rotation). For each key generation number we have multiple data encryption keys associated with tablespaces. The current key generation number is written to checkpoint records.

When starting up, the startup process executes the shell command set in the get_encryption_key_command GUC parameter with a key generation number. For example, we can set something like

    get_encryption_key_command = '/bin/sh get_key_from_kms.sh %g'

where '%g' is replaced with the current key generation number and 'get_key_from_kms.sh' is an arbitrary shell script that gets the master key from a KMS. I assume that the master keys on the KMS can be identified by ID. So the DBA generates a master key, identified by a key ID in an arbitrary form, on the KMS beforehand, and get_encryption_key_command has to craft the key ID in the same manner and pass it to the KMS. The master key we got is written to stdout.

Therefore, the contract between PostgreSQL and the user is:

* The user must prepare the master key, identified by a unique key ID, in advance
* The shell command crafts the key ID in the same form as the key ID on the KMS
* The user must remove old keys from the KMS if necessary (because there is no interface other than getting the master key)

Initial Setup and Recovery
==========================

Since the user data could be encrypted, we need the data encryption keys and the master key even during recovery. The startup process executes get_encryption_key_command with the key generation number written in the checkpoint record and stores the master key in shared memory.

For example, if we craft the master key ID in the form 'ABC_<key generation number>', the operation steps from initdb to recovery will be as follows.

1. User creates the first-generation master key with ID 'ABC_1' on the KMS
2. User executes initdb and sets get_encryption_key_command = '/bin/sh get_key_from_kms.sh %g' in postgresql.conf
3. Start PostgreSQL
 3-1. If transparent encryption is disabled or there is no encrypted data in the database, go to step #4
 3-2. The startup process executes '/bin/sh get_key_from_kms.sh 1' because the current (initial) key generation is 1
 3-3. get_key_from_kms.sh crafts the key ID 'ABC_1' in the same form and gets the master key from the KMS
 3-4. If that fails, raise a FATAL error
 3-5. Store the master key in shared memory
 3-6. If there are data encryption keys, decrypt them using the master key
4. Recovery starts

To make sure that we got the correct master key, we can save the hash value of the master key in the database cluster and compare the two.

Key Rotation
============

When a user requests key rotation (via an SQL command or function), the backend executes get_encryption_key_command with the new key generation number. It re-encrypts all existing data encryption keys with the new master key and increments the current key generation number. As at initialization time, we need to prepare the new master key on the KMS before executing the key rotation. So, for example, the operation steps will be:

1. Create the second-generation master key with key ID 'ABC_2' on the KMS
2. Execute key rotation on PostgreSQL (calling the pg_rotate_encryption_key() function)
 3-1. The backend executes '/bin/sh get_key_from_kms.sh 2', where 2 is the next key generation number
 3-2. It crafts the key ID 'ABC_2' in the same manner and gets the new master key from the KMS
 3-3. If that fails, raise an error
 3-4. Re-encrypt the data encryption keys using the new master key
 3-5. Increment the current key generation to 2

Of course, some locking is required here.

Integration with KMS
====================

The design above has some restrictions on administration but might be enough for some use cases. But I think these inconveniences would go away if we had KMS integration.
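For reference, the shell-command contract used in the startup and rotation steps above can be sketched in Python. This is only an illustration under stated assumptions, not the PoC implementation: the function names are hypothetical, the key is assumed to be text-encoded on stdout, and a plain `echo` stands in for a real get_key_from_kms.sh talking to a KMS.

```python
import hashlib
import shlex
import subprocess


def expand_command(template: str, generation: int) -> list:
    """Replace '%g' with the key generation number and split into argv."""
    return shlex.split(template.replace("%g", str(generation)))


def fetch_master_key(template: str, generation: int) -> bytes:
    """Run the configured command and return the key it writes to stdout.

    Assumes a text-encoded key, so surrounding whitespace is stripped.
    """
    try:
        result = subprocess.run(
            expand_command(template, generation),
            capture_output=True,
            check=False,
        )
    except OSError as exc:
        raise RuntimeError(f"could not run key command: {exc}") from exc
    key = result.stdout.strip()
    if result.returncode != 0 or not key:
        # Corresponds to "if that fails, raise an error" in the steps above.
        raise RuntimeError(f"could not get master key for generation {generation}")
    return key


def verify_master_key(key: bytes, stored_hash: str) -> bool:
    """Compare the fetched key against a hash saved in the cluster."""
    return hashlib.sha256(key).hexdigest() == stored_hash


# Stand-in for get_encryption_key_command = '/bin/sh get_key_from_kms.sh %g'
command = "echo key-material-for-ABC_%g"
master_key = fetch_master_key(command, 1)
stored_hash = hashlib.sha256(master_key).hexdigest()  # saved at first startup
```

At rotation time the same `fetch_master_key(command, 2)` call would be made with the next generation number before any data encryption key is re-encrypted, so a missing key on the KMS is detected up front.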
Since KMIP supports key management operations such as key registration and key removal, the key management plugin will be responsible for registering the master key and getting it using a key ID generated in a unified form. So all the user needs to do is set up the KMS and the key management plugin. The user no longer needs to create and remove the master key manually.

Other use case of integrating with KMS
======================================

BTW, we can have not only internal key management interfaces for the TDE feature but also an SQL interface for existing use cases such as pgcrypto. Currently we need to pass the password to the encryption and decryption functions:

    SELECT decrypt(data, 'secret-key', 'aes') FROM ...;

The password will be logged to the server log when log_statement = 'all'. But with KMSs it would become

    SELECT decrypt(data, get_encryption_key('keyid'), 'aes') FROM ...;

where the get_encryption_key() function gets the encryption key from the KMS via the loaded plugin. The key string is never output to the server logs.

We're still researching the details of KMIP and key management APIs. We will share updates.

Feedback is very welcome and we're open to new ideas. Thank you for reading.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center