Re: XTS cipher mode for cluster file encryption - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: XTS cipher mode for cluster file encryption |
Date | |
Msg-id | 80dddd73-ae23-0481-53ad-d37f8b51c86a@enterprisedb.com |
In response to | Re: XTS cipher mode for cluster file encryption (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: XTS cipher mode for cluster file encryption
Re: XTS cipher mode for cluster file encryption |
List | pgsql-hackers |
On 10/15/21 21:22, Stephen Frost wrote:
> Greetings,
>
> * Bruce Momjian (bruce@momjian.us) wrote:
>> As you might have seen from my email in another thread, thanks to
>> Stephen and Cybertec staff, I am back working on cluster file
>> encryption/TDE.
>>
>> Stephen was going to research if XTS cipher mode would be a good fit for
>> this since it was determined that CTR cipher mode was too vulnerable to
>> IV reuse, and the LSN provides insufficient uniqueness. Stephen
>> reported having trouble finding a definitive answer, so I figured I
>> would research it myself.
>>
>> Of course, I found the same lack of information that Stephen did. ;-)
>> None of my classic cryptographic books cover XTS or the XEX encryption
>> mode it is based on, since XTS was only standardized in 2007 and
>> recommended in 2010. (Yeah, don't get me started on poor cryptographic
>> documentation.)
>>
>> Therefore, I decided to go backward and look at CTR and CBC to see how
>> the nonce is used there, and if XTS fixes problems with nonce reuse.
>>
>> First, I originally chose CTR mode since it was a streaming cipher, and
>> we therefore could skip certain page fields like the LSN. However, CTR
>> is very sensitive to LSN reuse since the input bits generate encrypted
>> bits in exactly the same locations on the page. (It uses a simple XOR
>> against a cipher). Since sometimes pages with different page contents
>> are encrypted with the same LSN, especially on replicas, this method
>> failed.
>>
>> Second is CBC mode, which is a block cipher. I thought that meant that
>> you could only encrypt 16-byte chunks, meaning you couldn't skip
>> encryption of certain page fields unless they are 16-byte chunks.
>> However, there is something called ciphertext stealing
>> (https://en.wikipedia.org/wiki/Ciphertext_stealing#CBC_ciphertext_stealing)
>> which allows that.
>> I am not sure if OpenSSL supports this, but looking
>> at my OpenSSL 1.1.1d manual entry for EVP_aes, cipher stealing is only
>> mentioned for XTS.
>>
>> Anyway, CBC mode still needs a nonce for the first 16-byte block, and
>> then feeds the encrypted output of the first block as an IV to the second
>> block, etc. This gives us the same problem with finding a nonce per
>> page. However, since it is a block cipher, the bits don't output in the
>> same locations they have on input, so that is less of a problem. There
>> is also the problem that the encrypted output from one 16-byte block
>> could repeat, causing leakage.
>>
>> So, let's look how XTS is designed. First, it uses two keys. If you
>> are using AES128, you need _two_ 128-bit keys. If using AES256, you
>> need two 256-bit keys. The first of the two keys is used like normal,
>> to encrypt the data. The second key, which is also secret, is used to
>> encrypt the values used for the IV for the first 16-byte block (in our
>> case dboid, relfilenode, blocknum, maybe LSN). This is most clearly
>> explained here:
>>
>> https://www.kingston.com/unitedstates/en/solutions/data-security/xts-encryption
>>
>> That IV is XOR'ed against both the input value and the encryption output
>> value, as explained here as key tweaking:
>>
>> https://crossbowerbt.github.io/xts_mode_tweaking.html
>>
>> The purpose of using it before and after encryption is explained here:
>>
>> https://crypto.stackexchange.com/questions/24431/what-is-the-benefit-of-applying-the-tweak-a-second-time-using-xts
>>
>> The second 16-byte block gets an IV that is the multiplication of the
>> first IV and an alpha value raised to the second power but mapped to a
>> finite field (Galois field, modulus a prime). This effectively means an
>> attacker has _no_ idea what the IV is since it involves a secret key,
>> and each 16-byte block uses a different, unpredictable IV value.
>> XTS also supports ciphertext stealing by default so we can use the LSN if we
>> want, but we aren't sure we need to.
>
> Yeah, this all seems to be about where I got to too.
>
>> Finally, there is an interesting web page about when not to use XTS:
>>
>> https://sockpuppet.org/blog/2014/04/30/you-dont-want-xts/
>
> This particular article always struck me as more of a reason for us, at
> least, to use XTS than to not- in particular the very first comment it
> makes, which seems to be pretty well supported, is: "XTS is the de-facto
> standard disk encryption mode." Much of the rest of it is the well
> trodden discussion we've had about how FDE (or TDE in our case) doesn't
> protect against all the attack vectors that sometimes people think it
> does. Another point is that XTS isn't authenticated- something else we
> know quite well around here and isn't news.
>
>> Basically, what XTS does is to make the IV unknown to attackers and
>> non-repeating except for multiple writes to a specific 16-byte block
>> (with no LSN change). What isn't clear is if repeated encryption of
>> different data in the same 16-byte block can leak data.
>
> Any time a subset of the data is changed but the rest of it isn't,
> there's a leak of information. This is a really good example of exactly
> what that looks like:
>
> https://github.com/robertdavidgraham/ecb-penguin
>
> In our case, if/when this happens (no LSN change, repeated encryption
> of the same block), someone might be able to deduce that hint bits were
> being updated/changed, and where some of those are in the block.
>
> That said, I don't think that's really a huge issue or something that's
> a show stopper or a reason to hold off on using XTS. Note that what
> those bits actually *are* isn't leaked, just that they changed in some
> fashion inside of that 16-byte cipher text block. That they're directly
> leaked with CTR is why there was concern raised about using that method,
> as discussed above and previously.
>

Yeah.
With CTR you pretty much learn where the hint bits are exactly, while
with XTS the whole ciphertext changes.

This also means CTR is much more malleable, i.e. you can tweak the
ciphertext bits to flip the plaintext, while with XTS that's not really
possible - it's pretty much guaranteed to break the block structure. Not
sure if that's an issue for our use case, but if it is then neither of
the two modes is a solution.

>> This probably needs more research and maybe we need to write something
>> up like the above and let security researchers review it since there
>> doesn't seem to be enough documentation for us to decide ourselves.
>
> The one issue identified here is hopefully answered above and given that
> what you've found matches what I found, I'd argue that moving forward
> with XTS makes sense.
>

+1

> The other bit of research that I wanted to do, and thanks for sending
> this and prodding me to go do so, was to look at other implementations
> and see what they do for the IV when it comes to using XTS, and this is
> what I found:
>
> https://wiki.gentoo.org/wiki/Dm-crypt_full_disk_encryption
>
> Specifically: The default cipher for LUKS is nowadays aes-xts-plain64
>
> and then this:
>
> https://gitlab.com/cryptsetup/cryptsetup/-/wikis/DMCrypt
>
> where plain64 is defined as:
>
> plain64: the initial vector is the 64-bit little-endian version of the
> sector number, padded with zeros if necessary
>
> That is, the default for LUKS is AES, XTS, with a simple IV. That
> strikes me as a pretty ringing endorsement.
>

Seems reasonable, on the assumption the threat models are the same.
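
For illustration only, here is a minimal Python sketch of the plain64 IV as quoted above (the 64-bit little-endian sector number, zero-padded to one 16-byte block), together with the step XTS uses to move from one 16-byte block's tweak to the next. As far as I understand IEEE 1619, that step is a multiplication by alpha in GF(2^128), reduced by x^128 + x^7 + x^2 + x + 1. The function names are mine and hypothetical. The encryption of the IV with the second key is deliberately left out, because the Python standard library has no AES, so this shows only the IV layout and the tweak arithmetic, not a working XTS.

```python
import struct

BLOCK_SIZE = 16  # AES block size in bytes


def plain64_iv(sector: int) -> bytes:
    """Return the sector number as 64-bit little-endian, zero-padded to 16 bytes."""
    return struct.pack("<Q", sector) + b"\x00" * (BLOCK_SIZE - 8)


def xts_next_tweak(tweak: bytes) -> bytes:
    """Multiply a 16-byte tweak by alpha in GF(2^128), using XTS byte order.

    In real XTS the input is the IV already encrypted with the second key;
    that encryption step is not shown here (assumption: no AES available).
    """
    value = int.from_bytes(tweak, "little") << 1
    if value >> 128:
        # Carry out of bit 127: reduce modulo x^128 + x^7 + x^2 + x + 1.
        value = (value & ((1 << 128) - 1)) ^ 0x87
    return value.to_bytes(BLOCK_SIZE, "little")


# Hypothetical sector number, chosen only to show the byte layout.
iv = plain64_iv(0x0102)
```

In a page-level scheme the sector number would presumably be replaced by whatever identifies the page (the thread mentions dboid, relfilenode, blocknum and maybe the LSN), but how those would be packed is not decided here.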
> Now, to address the concern around re-encrypting a block with the same
> key+IV but different data and leaking what parts of the page changed, I
> do think we should use the LSN and have it change regularly (including
> unlogged tables) but that's just because it's relatively easy for us to
> do and means an attacker wouldn't be able to tell what part of the page
> changed when the LSN was also changed. That was also recommended by
> NIST and that's a pretty strong endorsement also.
>

Not sure - it seems a bit weird to force LSN change even in cases that
don't generate any WAL. I was not following the encryption thread and
maybe it was discussed/rejected there, but I've always imagined we'd
have a global nonce generator (similar to a sequence) and we'd store it
at the end of each block, or something like that.

> I'm all for getting security folks and whomever else to come and review
> this thread and chime in with their thoughts, but I don't think it's a
> reason to hold off on moving forward with the approach that we've been
> converging towards.
>

+1

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
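
PS: to make the CTR concern from this message concrete, here is a toy Python sketch of what happens when two versions of a page are encrypted with the same key and nonce (i.e. no LSN change). It is not AES-CTR: the keystream is a SHA-256 based stand-in, used only because the Python standard library has no AES, and all names, the key, the nonce and the "hint bit" position are hypothetical. The point it illustrates holds for any XOR-with-keystream mode: the XOR of the two ciphertexts equals the XOR of the two plaintexts, so the changed bit positions are visible, and flipping a ciphertext bit flips the same plaintext bit.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in for an AES-CTR keystream: SHA-256 over key, nonce and counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing the data with the keystream."""
    stream = toy_keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def changed_bits(old: bytes, new: bytes) -> list:
    """Return the bit positions at which two equal-length byte strings differ."""
    return [i * 8 + bit
            for i, (a, b) in enumerate(zip(old, new))
            for bit in range(8) if ((a ^ b) >> bit) & 1]


key, nonce = b"k" * 32, b"same-lsn"   # hypothetical key and reused nonce
page_v1 = bytes(32)                   # stand-in for part of a page
page_v2 = bytearray(page_v1)
page_v2[5] |= 0x04                    # hypothetical hint bit set, nonce unchanged
page_v2 = bytes(page_v2)

c1 = ctr_xor(key, nonce, page_v1)
c2 = ctr_xor(key, nonce, page_v2)

# The attacker sees only c1 and c2, yet learns exactly which bit changed.
leaked = changed_bits(c1, c2)

# Malleability: flipping a ciphertext bit flips the same plaintext bit.
tampered = bytearray(c2)
tampered[5] ^= 0x04
recovered = ctr_xor(key, nonce, bytes(tampered))
```

With XTS the same one-bit change would instead alter the whole 16-byte ciphertext block it falls in, which is the difference discussed above.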