Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers
From | Sehrope Sarkuni |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
Date | |
Msg-id | CAH7T-are4DWvunDWknRcUGGLgv8H2FgPmQ8TTajCoztEadK+iA@mail.gmail.com |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) |
List | pgsql-hackers |
Hi,

Some more thoughts on CBC vs CTR modes.

There are a number of advantages to using CTR mode for page encryption. CTR encryption modes can be fully parallelized, whereas CBC can only be parallelized for decryption. While both can use AES-specific hardware such as AES-NI, CTR modes can go a step further and use vectorized instructions.

On an i7-8559U (with AES-NI) I get a 4x speed improvement for CTR-based modes vs CBC when run on 8K of data:

    # openssl speed -evp ${cipher}
    type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
    aes-128-cbc  1024361.51k  1521249.60k  1562033.41k  1571663.87k  1574537.90k  1575512.75k
    aes-128-ctr   696866.85k  2214441.86k  4364903.85k  5896221.35k  6559735.81k  6619594.75k
    aes-128-gcm   642758.92k  1638619.09k  3212068.27k  5085193.22k  6366035.97k  6474006.53k
    aes-256-cbc   940906.25k  1114628.44k  1131255.13k  1138385.92k  1140258.13k  1143592.28k
    aes-256-ctr   582161.82k  1896409.32k  3216926.12k  4249708.20k  4680299.86k  4706375.00k
    aes-256-gcm   553513.89k  1532556.16k  2705510.57k  3931744.94k  4615812.44k  4673093.63k

For relation data where the encryption is going to be per page, there's flexibility in how the CTR nonce (IV + counter) is generated. With an 8K page, the counter need only go up to 512 for each page (8192-bytes per page / 16-bytes per AES-block). That would require 9-bits for the counter. Rounding that up to 16-bits allows for wider pages and it still uses only two bytes of the counter while ensuring that it'd be unique per AES-block. The remaining 14-bytes would be populated with some other data that is guaranteed unique per page-write to allow encryption via the same per-relation-file derived key. From what I gather, the LSN is a candidate though it'd have to be stored in plaintext for decryption.

What's important is that writing two pages (either different locations or the same page back again) never reuses the same nonce with the same key. Using the same nonce with a different key is fine.
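As a rough illustration of the nonce layout described above, here is a minimal Python sketch. The function names and the big-endian placement of the counter in the last two bytes are assumptions made for the example, not something the proposal specifies; the 14-byte prefix stands in for whatever per-page-write unique data ends up being chosen.

```python
import struct

AES_BLOCK_SIZE = 16   # bytes per AES block
PAGE_SIZE = 8192      # bytes per 8K page


def blocks_per_page(page_size: int = PAGE_SIZE) -> int:
    """Number of AES blocks (and so counter values) needed for one page."""
    return page_size // AES_BLOCK_SIZE


def counter_block(page_unique: bytes, block_index: int) -> bytes:
    """Build one 16-byte CTR input block.

    page_unique: 14 bytes that must be unique per page write under a given
                 key (hypothetical placeholder for e.g. LSN-derived data).
    block_index: position of the AES block within the page, kept in 16 bits.
    """
    if len(page_unique) != 14:
        raise ValueError("page_unique must be exactly 14 bytes")
    if not 0 <= block_index < 2 ** 16:
        raise ValueError("block_index must fit in 16 bits")
    # Assumed layout: 14-byte unique prefix, then a big-endian 16-bit counter.
    return page_unique + struct.pack(">H", block_index)


if __name__ == "__main__":
    n = blocks_per_page()
    print(n, (n - 1).bit_length())  # blocks per page, bits for the largest index
    print(counter_block(bytes(14), n - 1).hex())
```

With an 8K page the largest block index is 511, which fits in 9 bits, so the 16-bit field leaves room for pages up to 1 MiB before the counter would wrap.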
With any of these schemes the same inputs will generate the same outputs. With CTR mode for WAL this would be an issue if the same key and deterministic nonce (e.g. LSN + offset) are reused in multiple places. That does not have to be the same cluster either. For example if two replicas are promoted from the same backup with the same master key, they would generate the same WAL CTR stream, reusing the key/nonce pair. Ditto for starting off with a master key and deriving per-relation keys in a cloned installation off some deterministic attribute such as oid.

This can be avoided by deriving new keys per file (not just per relation) from a random salt. It'd be stored out of band and combined with the master key to derive the specific key used for that CTR stream. If there's a desire for supporting multiple ciphers or key sizes, that could be stored alongside the salt. Perhaps use the same location or lack of it to indicate "not encrypted" as well.

Per-file salts and derived keys would facilitate re-keying a table piecemeal, file by file, by generating a new salt/derived-key, encrypting a copy of the decrypted file, and doing an atomic rename. The file's contents would change but its length and any references to pages or byte offsets would stay valid. (I think this would work for CBC modes too as there's nothing CTR specific about it.)

What I'm not sure of is how to handle randomizing the relation file IV in a cloned database. Until the key for a relation file or segment is rotated it'd have the same deterministic IV generated as its source as the LSN would continue from the same point.

One idea is with 128-bits for the IV, one could have 64-bits for LSN, 16-bits for AES-block counter, and the remaining 48-bits be randomized; though you'd need to store those 48-bits somewhere per-page (basically it's a salt per page). That'd give some protection from the clone's new data being encrypted with the same stream as the parent's.
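To make the two ideas above concrete (a per-file key derived from the master key plus a random salt, and a 128-bit IV split into LSN, random salt and block counter), here is a small Python sketch. Everything in it is an assumption for illustration: the email does not name a key derivation function, so HMAC-SHA256 is used as a stand-in; the label string, the function names and the field order within the IV (LSN, then salt, then counter) are hypothetical.

```python
import hashlib
import hmac
import os
import struct


def derive_file_key(master_key: bytes, file_salt: bytes, key_len: int = 32) -> bytes:
    """Derive a per-file key from the master key and a random per-file salt.

    HMAC-SHA256 is only a stand-in KDF here; a real design would pick and
    review its own derivation function.
    """
    if not 1 <= key_len <= 32:
        raise ValueError("key_len must be between 1 and 32 bytes")
    digest = hmac.new(master_key, b"relfile-key" + file_salt, hashlib.sha256).digest()
    return digest[:key_len]


def new_page_salt() -> bytes:
    """48 random bits that would be stored in plaintext alongside the page."""
    return os.urandom(6)


def page_iv(lsn: int, page_salt: bytes, block_index: int) -> bytes:
    """Assemble a 128-bit IV: 64-bit LSN | 48-bit salt | 16-bit block counter."""
    if len(page_salt) != 6:
        raise ValueError("page_salt must be exactly 6 bytes (48 bits)")
    # struct.pack raises struct.error if lsn or block_index is out of range.
    return struct.pack(">Q", lsn) + page_salt + struct.pack(">H", block_index)


if __name__ == "__main__":
    master = os.urandom(32)
    key_a = derive_file_key(master, os.urandom(16))
    key_b = derive_file_key(master, os.urandom(16))
    print(key_a != key_b)  # different salts give different per-file keys
    print(page_iv(0x16B3748, new_page_salt(), 0).hex())
```

In this sketch two clones that share a master key and continue from the same LSN would still produce different IVs for newly written pages, provided their randomly drawn page salts differ.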
Another option would be to track ranges of LSNs and have a centralized list of 48-bit randomized salts. That would remove the need for an additional salt per page, though you'd have to do a lookup on that shared list to figure out which to use.

CTR mode is definitely more complicated than a pure random-IV + CBC but with any deterministic generation of IVs for CBC mode you're going to have some of these same problems as well.

Regarding CRCs, CTR mode has the advantage of not destroying the rest of the stream to replace the CRC bytes. With CBC mode any change would cascade and corrupt the rest of the data downstream from that block. With CTR mode you can overwrite the CRC's location with the CRC or a truncated MAC of the encrypted data as each byte is encrypted separately. At decryption time you simply ignore the decrypted output of those bytes and zero them out again. A CRC of encrypted data (but not a partial MAC) could be checked offline without access to the key.

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/