Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS) - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS) |
Date | |
Msg-id | 20190715235637.4ilfnarbmbawryg6@development |
In response to | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS) (Bruce Momjian <bruce@momjian.us>) |
Responses | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS) |
List | pgsql-hackers |
On Mon, Jul 15, 2019 at 06:11:41PM -0400, Bruce Momjian wrote:
>On Mon, Jul 15, 2019 at 11:05:30PM +0200, Tomas Vondra wrote:
>> On Mon, Jul 15, 2019 at 03:42:39PM -0400, Bruce Momjian wrote:
>> > On Sat, Jul 13, 2019 at 11:58:02PM +0200, Tomas Vondra wrote:
>> > > One extra thing we should consider is authenticated encryption. We can't
>> > > just encrypt the pages (no matter which AES mode is used - XTS/CBC/...),
>> > > as that does not provide integrity protection (i.e. can't detect when
>> > > the ciphertext was corrupted due to disk failure or intentionally). And
>> > > we can't quite rely on checksums, because that checksums the plaintext
>> > > and is stored encrypted.
>> >
>> > Uh, if someone modifies a few bytes of the page, we will decrypt it, but
>> > the checksum (per-page or WAL) will not match our decrypted output. How
>> > would they make it match the checksum without already knowing the key?
>> > I read [1] but could not see that explained.
>> >
>>
>> Our checksum is only 16 bits, so perhaps one way would be to just
>> generate 64k of randomly modified pages and hope one of them happens to
>> hit the right checksum value. Not sure how practical such an attack is, but
>> it does require just filesystem access.
>
>Yes, that would work, and opens the question of whether our checksum is
>big enough for this, and if it is not, we need to find space for it,
>probably with a custom encrypted page format. :-( And that makes
>adding encryption offline almost impossible because you potentially have
>to move tuples around. Yuck!
>

Right. We've been working on allowing checksums to be disabled online,
and it would be useful to allow something like that for encryption too,
I guess. And without some sort of page-level flag that won't be
possible, which would be rather annoying.

Not sure it needs to be in the page itself, though - that's pretty much
why I proposed to store metadata (IV, key ID, ...) for encryption in a
new fork. That would be a bit more flexible than storing it in the page
itself (e.g. different encryption schemes might easily store different
amounts of metadata).

Maybe a new fork is way too complex a solution, not sure.

>> FWIW our CRC algorithm is not quite HMAC, because it's neither keyed nor
>> a cryptographic hash algorithm. Now, maybe we don't want authenticated
>> encryption (e.g. XTS is not authenticated, unlike GCM/CCM).
>
>I thought just encrypting the CRC value would be enough to detect
>changes, but you are right that you could just do 64k pages until
>one hit.
>

Right. Not sure that's really a practical attack we need to worry about,
considering all of this is vulnerable to replay attacks.

>> > This post discussed it:
>> >
>> > https://crypto.stackexchange.com/questions/202/should-we-mac-then-encrypt-or-encrypt-then-mac
>> >
>> > I realize in a new system we might prefer encrypt-then-mac, TLS and SSL
>> > do it differently, and I don't think the security problems of
>> > MAC-then-Encrypt apply to our use-case, e.g. API programming errors.
>> >
>> > If we want to go crazy, we could encrypt, assume zeros for the CRC,
>> > compute the MAC and put it in the place where the CRC is, but then tools
>> > that read CRC would see that as an error, so we don't want to go there.
>> > Yes, crazy.
>> >
>> > > Which seems pretty annoying, because then the checksums won't verify
>> > > data as sent to the storage system, and verifying checksums would require
>> > > access to all keys (how do you do that in offline mode?).
>> >
>> > Uh, the keys are stored in a PGDATA file --- seems simple enough, but we
>> > would either have to do whole-cluster encryption or have some per-page
>> > encryption flag.
>> >
>>
>> And how do you know which files are encrypted and which are not, and
>> which keys are used for which file? Presumably that's in some system
>> catalog, which is not available in offline mode.
>
>You would need either all-cluster encryption (no need to check) or a
>per-page bit that says the page is encrypted, and the bit has to be in
>the part of the page that is not encrypted, e.g., near LSN.
>
>> > > But the main issue with checksum-then-encrypt is it's essentially
>> > > "MAC-then-Encrypt" and that does not provide Authenticated Encryption
>> > > security - see [1]. We should be looking at "Encrypt-then-MAC" instead,
>> > > in which case we'll need to store the MAC somewhere (probably in the
>> > > same place as the nonce/IV/key/... for each page).
>> >
>> > I don't think we are planning to store the nonce/IV on each page but
>> > rather use the LSN (already on the page), and perhaps in addition, the
>> > page number.
>>
>> But the LSN is in the page header, and AFAICS the page header is
>> encrypted. So how do you decrypt the page without knowing the LSN (which
>> I think you need to know in order to derive the IV)?
>
>My proposal was that the first 16 bytes of the page are not encrypted.
>

Ah, I see.

>> Also, we probably don't want to expose the checksum, because it may
>> reveal information about page contents (since it's not a HMAC).
>
>Uh, I have not heard of that as an issue.
>

To clarify, I think it's more a general issue - the checksum does leak a
bit of information about the plaintext, I think that's fairly obvious.
I don't know if 16 bits is enough for practical attacks, though. But it
clearly is not the same thing as HMAC, so we should not treat it as such.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
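
To put a rough number on the "64k randomly modified pages" estimate from the thread, here is a back-of-the-envelope sketch. It assumes each tampered page passes a 16-bit checksum independently with probability 2^-16, which is an idealization and not a claim about the actual PostgreSQL checksum algorithm. The function name hit_probability is made up for this illustration.

```python
def hit_probability(attempts: int, checksum_bits: int = 16) -> float:
    """Chance that at least one of `attempts` modified pages passes the checksum.

    Assumption: every attempt matches independently with probability
    2**-checksum_bits (an idealized model, not the real page checksum).
    """
    miss = 1.0 - 2.0 ** -checksum_bits   # probability a single attempt fails
    return 1.0 - miss ** attempts        # complement of "all attempts fail"


if __name__ == "__main__":
    for n in (1, 65536, 4 * 65536):
        print(f"{n:>7} attempts: {hit_probability(n):.4f}")
```

Under that assumption 64k attempts succeed with probability of roughly 63% (about 1 - 1/e), so the attack needs a small multiple of 64k tries rather than exactly 64k.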
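
For readers unfamiliar with the "Encrypt-then-MAC" construction mentioned in the thread, the following is a minimal sketch, not a proposal for the actual page format. Assumptions made for the illustration: the keystream function is a toy SHA-256 counter-mode stand-in for a real AES mode, the IV is derived from the LSN and page number as suggested in the discussion, and all names (keystream, make_iv, seal_page, open_page) are hypothetical.

```python
import hashlib
import hmac


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); a stand-in for a real AES mode."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def make_iv(lsn: int, page_no: int) -> bytes:
    """Hypothetical IV derivation from the page LSN and the page number."""
    return lsn.to_bytes(8, "big") + page_no.to_bytes(8, "big")


def seal_page(enc_key: bytes, mac_key: bytes, lsn: int, page_no: int,
              plaintext: bytes) -> tuple[bytes, bytes]:
    """Encrypt first, then compute the MAC over the IV and the ciphertext."""
    iv = make_iv(lsn, page_no)
    stream = keystream(enc_key, iv, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def open_page(enc_key: bytes, mac_key: bytes, lsn: int, page_no: int,
              ciphertext: bytes, tag: bytes) -> bytes:
    """Verify the MAC before decrypting; reject the page on any mismatch."""
    iv = make_iv(lsn, page_no)
    expected = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("MAC mismatch: page corrupted or tampered with")
    stream = keystream(enc_key, iv, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))
```

Two points from the thread show up directly in this sketch. The 32-byte tag has to be stored somewhere outside the encrypted payload (in the page or in a separate fork), and the LSN must be readable before decryption. The sketch also does nothing against replay: an old ciphertext with its old tag and LSN still verifies.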