Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | CAD21AoBM=CymUMrJuaio_=qsxm7wR9XkBATcnWyM9a1kgL+pmg@mail.gmail.com Whole thread Raw |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
|
List | pgsql-hackers |
On Fri, Jul 12, 2019 at 7:37 AM Bruce Momjian <bruce@momjian.us> wrote: > > On Wed, Jul 10, 2019 at 12:26:24PM -0400, Bruce Momjian wrote: > > On Wed, Jul 10, 2019 at 08:31:17AM -0400, Joe Conway wrote: > > > Please see my other reply (and > > > https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38a.pdf > > > appendix C as pointed out by Ryan downthread). > > > > > > At least in my mind, I trust a published specification from the > > > nation-state level over random blogs or wikipedia. If we can find some > > > equivalent published standards that contradict NIST we should discuss > > > it, but for my money I would prefer to stick with the NIST recommended > > > method to produce the IVs. > > > > So, we have had a flurry of activity on this thread in the past day, so > > let me summarize: > > Seems we have an updated approach: > > First, we need to store the symmetric encryption key in the data > directory, like we do for SSL certificates and private keys. (Crash > recovery needs access to this key, so we can't easily store it in a > database table.) We will pattern it after the GUC > ssl_passphrase_command. We will need to decide on a format for the > symmetric encryption key in the file so we can check that the supplied > passphrase properly unlocks the key. > > Our first implementation will encrypt the entire cluster. We can later > consider encryption per table or tablespace. It is unclear if > encrypting different parts of the system with different keys is useful > or feasible. (This is separate from key rotation.) > > We will use CBC AES128 mode for tables/indexes, and CTR AES128 for WAL. > 8k pages will use the LSN as a nonce, which will be encrypted to > generate the initialization vector (IV). We will not encrypt the first > 16 bytes of each pages so the LSN can be used in this way. The WAL will > use the WAL file segment number as the nonce and the IV will be created > in the same way. > > wal_log_hints will be enabled automatically in encryption mode, like we > do for checksum mode, so we never encrypt different 8k pages with the > same IV. > > There will need to be a pg_control field to indicate that encryption is > in use. > > Right now we don't support the online changing of a cluster's checksum > mode, so I suggest we create a utility like pg_checksums --enable to > allow offline key rotation. Once we get online checksum mode changing > ability, we can look into use that for encryption key rotation. > I've re-considered the design of TDE feature based on the discussion so far. The one of the main open question is the granular of encryption objects: cluster encryption or more-granular-than-cluster encryption. The followings describe about the new TDE design when we choose table-level encryption or something-new-group-level encryption. General ======== We will use AES and support both AES-128 and AES-256. User can specify the new initdb option something like --aes-128 or --aes-256 to enable encryption and must specify --encryption-key-passphrase-command along with. (I guess we also require openssl library.) If these options are specified, we write the key length to the control file and derive the KEK and generate MDEK during initdb. wal_log_hints will be enabled automatically in encryption mode, like we do for checksum mode, Key Management ============== We will use 3-tier key architecture as Joe proposed. 1. A master key encryption key (KEK): this is the ley supplied by the database admin using something akin to ssl_passphrase_command 2. A master data encryption key (MDEK): this is a generated key using a cryptographically secure pseudo-random number generator. It is encrypted using the KEK, probably with Key Wrap (KW): or maybe better Key Wrap with Padding (KWP): 3a. Per table data encryption keys (TDEK): use MDEK and HKDF to generate table specific keys. 3b. WAL data encryption keys (WDEK): Similarly use MDEK and a HKDF to generate new keys when needed for WAL. We store MDEK to the plain file (say global/pgkey) after encrypted with the KEK. I might want to store the hash of passphrase of the KEK in order to verify the correctness of the given passphrase. However we don't need to store TDEK and WDEK as we can derive them as needed. The key file can be read by both backend processes and front-end tools. When postmaster startup, it reads the key file and decrypts MDEK and derive WDEK using key id for WDEK. WDEK is loaded to the key hash map (keyid -> key) on the shared memory. Also we derive TDEK as needed when reading tables or indexes and add it to the key hash map as well if not exists. Buffer Encryption ============== We will use AES-CBC for buffer encryption. We will add key id (4byte) to after the pd_lsn(8byte) in PageHeaderData and we will not encrypt first 16 byte of each pages so the LSN and key id can be used. We can store an invalid key id to tell us that the table is not encrypted. There two benefits of storing key id to the page header: offline tools can get key id (and know the table is encrypted or not) and it's helpful for online rekey in the future. I've considered to store IV and key id to a new fork but I felt that it is complex because we will always need to have the fork on the shared buffer when any pages of its main fork is written to the disk. If almost buffers of the shared buffers are dirtied and theirs new forks are not loaded to the shared buffer, we might need to load the new fork and write the page to the disk and then evict some pages, over and over. We will use (page lsn, page number) to create a nonce. IVs are created by encrypting the nonce with its TDEK. WAL Encryption ============= We will use AES-CTR for WAL encryption and encrypt each WAL pages with WDEK. We will use WAL segment number to create a nonce. Similar to buffer encryption, IVs are created using by the nonce and WDEK. If we want to support enabling or disabling encryption after initdb we might want to have key id in the WAL page header. Front-end Tool Support ================== We will add --encryption-key-passphrase-command option to the front-end tools that read database files or WAL segment files directly. They can get KEK via --encryption-key-passphrase-command and get MDEK by reading the key file. Also they can know the key length by checking the control file. Since they can derive TDEK using by key id stored in the page header they can decrypt database files. Similarly, they also can decrypt WAL as they can know the key id of WDEK. Master Key Rotation ================ We will support new command-line tool that rotates the master key offline. It accepts --old-encryption-key-passphrase-command option and --new-encryption-key-passphrase-command to get old KEK and new KEK respectively. It decrypt MDEK with the old key and encrypt it with the new key. There is concern about the performance overhead by both looking up keys and checking if the object is encrypted or not but with this design we can know them by reading page and by checking hash map on the shared memory. It works fine unless we have to have a huge number of keys in the hash map. So I guess the overhead doesn't become obvious. In addition, this key management design is similar to the PoC patch I created before and evaluated its the performance overhead a few month ago. In the evaluation, I didn't see such overhead. See [1]. [1] https://www.slideshare.net/masahikosawada98/transparent-data-encryptoin-in-postgresql-and-integratino-with-key-management-service/31 Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
pgsql-hackers by date: