XTS cipher mode for cluster file encryption - Mailing list pgsql-hackers

From Bruce Momjian
Subject XTS cipher mode for cluster file encryption
Date
Msg-id 20211013222648.GA373@momjian.us
List pgsql-hackers
As you might have seen from my email in another thread, thanks to
Stephen and Cybertec staff, I am back working on cluster file
encryption/TDE.

Stephen was going to research if XTS cipher mode would be a good fit for
this since it was determined that CTR cipher mode was too vulnerable to
IV reuse, and the LSN provides insufficient uniqueness.  Stephen
reported having trouble finding a definitive answer, so I figured I
would research it myself.

Of course, I found the same lack of information that Stephen did.  ;-)
None of my classic cryptographic books cover XTS or the XEX encryption
mode it is based on, since XTS was only standardized in 2007 and
recommended in 2010.  (Yeah, don't get me started on poor cryptographic
documentation.)

Therefore, I decided to go backward and look at CTR and CBC to see how
the nonce is used there, and if XTS fixes problems with nonce reuse.

First, I originally chose CTR mode since it turns the block cipher into
a stream cipher, and we therefore could skip certain page fields like
the LSN.  However, CTR is very sensitive to LSN reuse since the input
bits generate encrypted bits in exactly the same locations on the page.
(It uses a simple XOR of the plaintext against a keystream.)  Since
sometimes pages with different page contents are encrypted with the same
LSN, especially on replicas, this method failed.
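
To make the CTR problem concrete, here is a minimal Python sketch.  It
does not use AES; the keystream comes from SHA-256 in a counter
construction purely as a stand-in, and the key, LSN value, and page
contents are made up.  It only illustrates that two pages encrypted with
the same key and the same LSN-derived nonce leak the XOR of their
plaintexts.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream (SHA-256 over key, nonce, counter); NOT AES-CTR."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def ctr_like_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    # CTR-style encryption: plaintext XOR keystream, bit for bit in place.
    return xor_bytes(plaintext, keystream(key, nonce, len(plaintext)))

key = b"k" * 16                              # hypothetical key
lsn_nonce = (0x16B3748).to_bytes(8, "big")   # hypothetical LSN used as nonce

page_a = b"balance=00001000"                 # two different page contents
page_b = b"balance=00999000"                 # written with the same LSN

ct_a = ctr_like_encrypt(key, lsn_nonce, page_a)
ct_b = ctr_like_encrypt(key, lsn_nonce, page_b)

# The keystream cancels out, so the attacker learns page_a XOR page_b
# without knowing the key.  Zero bytes mark positions that did not change.
leak = xor_bytes(ct_a, ct_b)
```

Here the leading bytes of leak are zero because both pages start with
the same text, so an observer of the two ciphertexts can see exactly
which bytes changed and the XOR of the old and new values.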

Second is CBC mode, which is a block cipher mode.  I thought that meant
that you could only encrypt 16-byte chunks, meaning you couldn't skip
encryption of certain page fields unless they are 16-byte chunks.
However, there is something called ciphertext stealing
(https://en.wikipedia.org/wiki/Ciphertext_stealing#CBC_ciphertext_stealing)
which allows that.  I am not sure if OpenSSL supports this for CBC;
looking at my OpenSSL 1.1.1d manual entry for EVP_aes, ciphertext
stealing is only mentioned for XTS.

Anyway, CBC mode still needs a nonce for the first 16-byte block, and
then feeds the encrypted output of the first block as an IV to the second
block, etc. This gives us the same problem with finding a nonce per
page.  However, since it is a block cipher, the bits don't output in the
same locations they have on input, so that is less of a problem.  There
is also the problem that the encrypted output from one 16-byte block
could repeat, causing leakage.

So, let's look at how XTS is designed.  First, it uses two keys.  If you
are using AES128, you need _two_ 128-bit keys.  If using AES256, you
need two 256-bit keys.  The first of the two keys is used like normal,
to encrypt the data.  The second key, which is also secret, is used to
encrypt the values used for the IV for the first 16-byte block (in our
case dboid, relfilenode, blocknum, maybe LSN).  This is most clearly
explained here:

    https://www.kingston.com/unitedstates/en/solutions/data-security/xts-encryption

That IV is XOR'ed against both the input value and the encryption output
value, as explained here as key tweaking:

    https://crossbowerbt.github.io/xts_mode_tweaking.html

The purpose of using it before and after encryption is explained here:

    https://crypto.stackexchange.com/questions/24431/what-is-the-benefit-of-applying-the-tweak-a-second-time-using-xts
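
As a rough illustration of the two-key, XOR-before-and-after structure,
here is a Python sketch for a single 16-byte block.  AES is not in the
Python standard library, so toy_encrypt_block() is a small Feistel
permutation built from SHA-256 that stands in for AES; it is not secure
and is only there to make the example runnable.  The 16-byte packing of
dboid, relfilenode, and blocknum is a hypothetical layout, not something
we have decided on.

```python
import hashlib
import struct

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    # Feistel round function for the toy cipher.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 16-byte block permutation standing in for AES (NOT secure)."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor_bytes(left, _round(key, rnd, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor_bytes(right, _round(key, rnd, left)), left
    return left + right

def make_tweak(tweak_key: bytes, dboid: int, relfilenode: int, blocknum: int) -> bytes:
    # Hypothetical layout: 4 + 4 + 8 bytes = one 16-byte block,
    # encrypted with the second (secret) key to produce the IV/tweak.
    page_id = struct.pack("<IIQ", dboid, relfilenode, blocknum)
    return toy_encrypt_block(tweak_key, page_id)

def xex_encrypt_block(data_key: bytes, tweak: bytes, plaintext: bytes) -> bytes:
    # XOR the tweak in before encryption and again after it.
    return xor_bytes(toy_encrypt_block(data_key, xor_bytes(plaintext, tweak)), tweak)

def xex_decrypt_block(data_key: bytes, tweak: bytes, ciphertext: bytes) -> bytes:
    return xor_bytes(toy_decrypt_block(data_key, xor_bytes(ciphertext, tweak)), tweak)

data_key = b"d" * 16     # first key: encrypts the data
tweak_key = b"t" * 16    # second key: encrypts the page identifier

plain = b"sixteen byte blk"
tweak_0 = make_tweak(tweak_key, 5, 16384, 0)
tweak_1 = make_tweak(tweak_key, 5, 16384, 1)

ct_0 = xex_encrypt_block(data_key, tweak_0, plain)
ct_1 = xex_encrypt_block(data_key, tweak_1, plain)
```

The same plaintext encrypts differently under tweak_0 and tweak_1, but
encrypting it twice under the same tweak gives the same ciphertext,
since nothing in the construction is randomized.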

The second 16-byte block gets an IV that is the first IV multiplied by
an alpha value in a finite field (the Galois field GF(2^128), with
reduction modulo an irreducible polynomial); each later block multiplies
by alpha once more.  This effectively means an attacker has _no_ idea
what the IV is since it involves a secret key, and each 16-byte block
uses a different, unpredictable IV value.  XTS also supports ciphertext
stealing by default so we can use the LSN if we want, but we aren't sure
we need to.
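
The per-block IV derivation can be sketched in a few lines of Python.
This follows my understanding of the XTS convention (the 16-byte value
is treated as a little-endian 128-bit number, multiplying by alpha is a
one-bit left shift, and a carry out of the top bit is folded back in by
XORing 0x87); the function names and the example starting value are
made up, and in real XTS the starting value would be the page identifier
encrypted with the second key.

```python
MASK_128 = (1 << 128) - 1

def mul_alpha(tweak: bytes) -> bytes:
    """Multiply a 16-byte tweak by alpha in GF(2^128), XTS-style."""
    value = int.from_bytes(tweak, "little") << 1
    if value >> 128:
        # Reduce modulo x^128 + x^7 + x^2 + x + 1 (low bits 0x87).
        value = (value & MASK_128) ^ 0x87
    return value.to_bytes(16, "little")

def block_tweaks(first_tweak: bytes, nblocks: int) -> list:
    """Return the tweak for each 16-byte block: T, T*alpha, T*alpha^2, ..."""
    tweaks = []
    current = first_tweak
    for _ in range(nblocks):
        tweaks.append(current)
        current = mul_alpha(current)
    return tweaks

# Hypothetical first tweak; an 8kB page holds 512 sixteen-byte blocks.
first_tweak = bytes(range(1, 17))
page_tweaks = block_tweaks(first_tweak, 8192 // 16)
```

Every one of the 512 blocks on the page ends up with a distinct tweak,
and none of them can be computed by someone who does not know the
encrypted starting value.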

Finally, there is an interesting web page about when not to use XTS:

    https://sockpuppet.org/blog/2014/04/30/you-dont-want-xts/

Basically, what XTS does is to make the IV unknown to attackers and
non-repeating except for multiple writes to a specific 16-byte block
(with no LSN change).  What isn't clear is if repeated encryption of
different data in the same 16-byte block can leak data.

This probably needs more research and maybe we need to write something
up like the above and let security researchers review it since there
doesn't seem to be enough documentation for us to decide ourselves.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.



