Re: XTS cipher mode for cluster file encryption - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: XTS cipher mode for cluster file encryption
Msg-id 871a349d-fcf6-d58f-1c77-e8510afaf043@enterprisedb.com
In response to Re: XTS cipher mode for cluster file encryption  (Stephen Frost <sfrost@snowman.net>)
Responses Re: XTS cipher mode for cluster file encryption  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers

On 10/18/21 17:56, Stephen Frost wrote:
> Greetings,
> 
> * Tomas Vondra (tomas.vondra@enterprisedb.com) wrote:
>> On 10/15/21 21:22, Stephen Frost wrote:
>>> Now, to address the concern around re-encrypting a block with the same
>>> key+IV but different data and leaking what parts of the page changed, I
>>> do think we should use the LSN and have it change regularly (including
>>> unlogged tables) but that's just because it's relatively easy for us to
>>> do and means an attacker wouldn't be able to tell what part of the page
>>> changed when the LSN was also changed.  That was also recommended by
>>> NIST and that's a pretty strong endorsement also.
>>
>> Not sure - it seems a bit weird to force LSN change even in cases that don't
>> generate any WAL. I was not following the encryption thread and maybe it was
>> discussed/rejected there, but I've always imagined we'd have a global nonce
>> generator (similar to a sequence) and we'd store it at the end of each
>> block, or something like that.
> 
> The 'LSN' being referred to here isn't the regular LSN that is
> associated with the WAL but rather the separate FakeLSN counter which we
> already have.  I wasn't suggesting having the regular LSN change in
> cases that don't generate WAL.
> 

I'm not very familiar with FakeLSN, but isn't that just about unlogged 
tables? How does that help cases like setting hint bits, which may not 
generate WAL?

> * Robert Haas (robertmhaas@gmail.com) wrote:
>> On Fri, Oct 15, 2021 at 3:22 PM Stephen Frost <sfrost@snowman.net> wrote:
>>> Specifically: The default cipher for LUKS is nowadays aes-xts-plain64
>>>
>>> and then this:
>>>
>>> https://gitlab.com/cryptsetup/cryptsetup/-/wikis/DMCrypt
>>>
>>> where plain64 is defined as:
>>>
>>> plain64: the initial vector is the 64-bit little-endian version of the
>>> sector number, padded with zeros if necessary
>>>
>>> That is, the default for LUKS is AES, XTS, with a simple IV.  That
>>> strikes me as a pretty ringing endorsement.
>>
>> Yes, that sounds promising. It might not hurt to check for other
>> precedents as well, but that seems like a pretty good one.
>>
>> I'm not very convinced that using the LSN for any of this is a good
>> idea. Something that changes most of the time but not all the time
>> seems more like it could hurt by masking fuzzy thinking more than it
>> helps anything.
> 
> This argument doesn't come across as very strong at all to me,
> particularly when we have explicit recommendations from NIST that having
> the IV vary more is beneficial.  While this would be using the LSN, the
> fact that the LSN changes most of the time but not all of the time isn't
> new and is something we already have to deal with.  I'd think we'd
> address the concern about mis-thinking around how this works by
> providing a README and/or an appropriate set of comments around what's
> being done and why.
> 

I don't think anyone objects to varying IV more, as recommended by NIST. 
AFAICS the issue at hand is exactly the opposite - maybe not varying it 
enough, in some cases. It might be enough for MVCC purposes yet it might 
result in fatal failure of the encryption scheme. That's my concern, at 
least, and I assume it's what Robert meant by "fuzzy thinking" too.

FWIW I think we seem to be mixing up nonces, IVs and tweak values, although 
the various encryption schemes place different requirements on those anyway.
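
To make the "not varying it enough" concern concrete, here is a minimal Python sketch. plain64_iv follows the dm-crypt definition quoted above; lsn_tweak is purely hypothetical (the 64-bit block number + 64-bit LSN layout and the sample LSN values are assumptions for illustration, not what any patch does).

```python
import struct

def plain64_iv(sector: int) -> bytes:
    """dm-crypt 'plain64' IV: 64-bit little-endian sector number,
    zero-padded to the 16-byte AES block size."""
    return struct.pack("<Q", sector) + b"\x00" * 8

def lsn_tweak(block_no: int, lsn: int) -> bytes:
    """Hypothetical 16-byte tweak: 64-bit block number followed by a
    64-bit LSN (this layout is an assumption, for illustration only)."""
    return struct.pack("<QQ", block_no, lsn)

# plain64 depends only on the location, so every rewrite of the same
# sector reuses the same IV.
assert plain64_iv(5) == plain64_iv(5)

# The LSN-based tweak changes only when the LSN does. A page change that
# does not advance the LSN (e.g. a hint-bit update without WAL) would
# reuse the tweak.
before = lsn_tweak(block_no=5, lsn=0x16B3D50)
after_hint_bit = lsn_tweak(block_no=5, lsn=0x16B3D50)   # LSN unchanged
after_wal_write = lsn_tweak(block_no=5, lsn=0x16B3D98)  # LSN advanced
```

Whether reusing the value is merely a leak or a fatal failure depends on the mode, which is exactly why the distinction between nonce, IV and tweak matters.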


> * Andres Freund (andres@anarazel.de) wrote:
>> On 2021-10-15 15:22:48 -0400, Stephen Frost wrote:
>>> * Bruce Momjian (bruce@momjian.us) wrote:
>>>> Finally, there is an interesting web page about when not to use XTS:
>>>>
>>>>     https://sockpuppet.org/blog/2014/04/30/you-dont-want-xts/
>>>
>>> This particular article always struck me as more of a reason for us, at
>>> least, to use XTS than to not- in particular the very first comment it
>>> makes, which seems to be pretty well supported, is: "XTS is the de-facto
>>> standard disk encryption mode."
>>
>> I don't find that line of argument *that* convincing. The reason XTS is the
>> de-facto standard is that for generic block layer encryption you can't
>> add additional data for each block without very significant overhead
>> (basically needing journaling to ensure that the data doesn't get out of
>> sync). But we don't really face the same situation - we *can* add additional
>> data.
> 
> No, we can't always add additional data, and that's part of the
> consideration for an XTS option- there are things we can do if we use
> XTS that we can't with GCM or another solution.  Specifically, being
> able to perform physical replication from an unencrypted cluster to an
> encrypted one is a worthwhile use-case that we shouldn't be just tossing
> out.
> 

Yeah, XTS seems like a reasonable first step, both because it doesn't 
require storing extra data and its widespread use in FDE software (of 
course, there's a link between those). And I suspect replication between 
encrypted and unencrypted clusters is going to be a huge can of worms, 
even with XTS.

It's probably a silly / ugly idea, but couldn't we simply store a special 
"page format" flag in controldata? When set to 'true' during initdb, 
each page would have a bit of space (at the end) reserved for additional 
encryption data. Say, ~64B should be enough. On the encrypted cluster 
this would store the nonce/IV/... and on the unencrypted cluster it'd 
simply be unused. 64B seems like a negligible amount of data. And when set 
to 'false' the cluster would not allow encryption.
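
A back-of-the-envelope sketch of that layout in Python, assuming the default 8kB page and exactly 64 reserved bytes. The names ENC_RESERVED and split_page are made up for illustration and do not exist anywhere in PostgreSQL.

```python
PAGE_SIZE = 8192     # default PostgreSQL block size (the "8k page" in this thread)
ENC_RESERVED = 64    # the ~64B suggested above; the exact size is an assumption

def split_page(page: bytes):
    """Split a page into (regular page contents, reserved encryption area).
    Hypothetical helper, only meant to show where the reserved bytes live."""
    if len(page) != PAGE_SIZE:
        raise ValueError("unexpected page size")
    boundary = PAGE_SIZE - ENC_RESERVED
    return page[:boundary], page[boundary:]

usable = PAGE_SIZE - ENC_RESERVED
overhead_pct = 100.0 * ENC_RESERVED / PAGE_SIZE
print(f"usable bytes per page: {usable}, overhead: {overhead_pct:.2f}%")
```

So the cost on an unencrypted cluster would be well under 1% of each page.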


>> With something like AES-GCM-SIV we can use the additional data to get IV reuse
>> resistance *and* authentication. And while perhaps we are ok with the IV reuse
>> guarantees XTS has, it seems pretty clear that we'll want guaranteed
>> authenticity at some point. And then we'll need extra data anyway.
> 
> I agree that it'd be useful to have an authenticated encryption option.
> Implementing XTS doesn't preclude us from adding that capability down
> the road and it's simpler with fewer dependencies.  These all strike me
> as good reasons to add XTS first.
> 

True. If XTS addresses the threat model we aimed to solve ...

>> Thus, to me, it doesn't seem worth going down the XTS route, just to
>> temporarily save a bit of implementation effort. We'll have to endure that
>> pain anyway.
> 
> This isn't a valid argument as it isn't just about implementation but
> about the capabilities we will have once it's done.
> 
> * Tomas Vondra (tomas.vondra@enterprisedb.com) wrote:
>> On 10/15/21 23:02, Robert Haas wrote:
>>> On Fri, Oct 15, 2021 at 3:22 PM Stephen Frost <sfrost@snowman.net> wrote:
>>>> That is, the default for LUKS is AES, XTS, with a simple IV.  That
>>>> strikes me as a pretty ringing endorsement.
>>>
>>> Yes, that sounds promising. It might not hurt to check for other
>>> precedents as well, but that seems like a pretty good one.
>>
>> TrueCrypt/VeraCrypt uses XTS too, I think. There's an overview of other FDE
>> products at [1], and some of them use XTS, but I would take that with a
>> grain of salt - some of the products are somewhat obscure, very old, or
>> both.
>>
>> What is probably more interesting is that there's an IEEE standard [2]
>> dealing with encrypted shared storage, and that uses XTS too. I'd bet
>> there's a bunch of smart cryptographers involved.
> 
> Thanks for finding those and linking to them, that's helpful.
> 
>>> I'm not very convinced that using the LSN for any of this is a good
>>> idea. Something that changes most of the time but not all the time
>>> seems more like it could hurt by masking fuzzy thinking more than it
>>> helps anything.
>>
>> I haven't been following the discussion about using LSN, but I agree that
>> while using it seems convenient, the consequences of some changes not
>> incrementing LSN seem potentially disastrous, depending on the encryption
>> mode.
> 
> Yes, this depends on the encryption mode, and is why we are specifically
> talking about XTS here as it's an encryption mode that doesn't suffer
> from this risk and therefore it's perfectly fine to use the LSN/FakeLSN
> with XTS (and would also be alright for AES-GCM-SIV as it's specifically
> designed to be resistant to IV reuse).
> 

I'm not quite sure about the "perfectly fine" bit, as it's making XTS 
vulnerable to traffic analysis attacks (comparing multiple copies of an 
encrypted block). It may be a reasonable trade-off, of course.
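
A minimal sketch of that comparison in Python. Note that toy_encrypt is NOT XTS and not real encryption: it is a keyed stand-in (HMAC-SHA256 truncated to 16 bytes) that only models the one property that matters here, namely that each 16-byte ciphertext block is a deterministic function of the key, the tweak, the block position and the plaintext block. The tweak strings and the byte offset are arbitrary assumptions.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes

def toy_encrypt(key: bytes, tweak: bytes, page: bytes) -> bytes:
    """Stand-in for XTS, NOT real encryption: maps every 16-byte block
    deterministically from (key, tweak, block offset, plaintext block)."""
    out = bytearray()
    for i in range(0, len(page), BLOCK):
        msg = tweak + i.to_bytes(4, "little") + page[i:i + BLOCK]
        out += hmac.new(key, msg, hashlib.sha256).digest()[:BLOCK]
    return bytes(out)

def changed_blocks(old: bytes, new: bytes) -> list:
    """Indexes of the 16-byte ciphertext blocks that differ between two
    copies of the same page - all an observer needs to compute."""
    return [i // BLOCK for i in range(0, len(old), BLOCK)
            if old[i:i + BLOCK] != new[i:i + BLOCK]]

key = b"k" * 32
page_v1 = bytearray(8192)
page_v2 = bytearray(page_v1)
page_v2[100] ^= 0x01  # flip a single bit (think: hint bit) at byte offset 100

same_tweak = changed_blocks(toy_encrypt(key, b"lsn-1", bytes(page_v1)),
                            toy_encrypt(key, b"lsn-1", bytes(page_v2)))
new_tweak = changed_blocks(toy_encrypt(key, b"lsn-1", bytes(page_v1)),
                           toy_encrypt(key, b"lsn-2", bytes(page_v2)))
print(same_tweak)      # only the block containing offset 100
print(len(new_tweak))  # every block of the page differs
```

With the tweak reused, the observer learns which 16-byte block changed (but not which bits). With a fresh tweak the whole page changes and the comparison reveals nothing about the location.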

> * Bruce Momjian (bruce@momjian.us) wrote:
>> On Fri, Oct 15, 2021 at 10:57:03PM +0200, Tomas Vondra wrote:
>>>> That said, I don't think that's really a huge issue or something that's
>>>> a show stopper or a reason to hold off on using XTS.  Note that what
>>>> those bits actually *are* isn't leaked, just that they changed in some
>>>> fashion inside of that 16-byte cipher text block.  That they're directly
>>>> leaked with CTR is why there was concern raised about using that method,
>>>> as discussed above and previously.
>>>
>>> Yeah. With CTR you pretty much learn where the hint bits are exactly, while with
>>> XTS the whole ciphertext changes.
>>>
>>> This also means CTR is much more malleable, i.e. you can tweak the
>>> ciphertext bits to flip the plaintext, while with XTS that's not really
>>> possible - it's pretty much guaranteed to break the block structure. Not
>>> sure if that's an issue for our use case, but if it is then neither of the
>>> two modes is a solution.
>>
>> Yes, this is a very good point.  Let's look at the impact of _not_ using
>> the LSN.  For CTR (already rejected) bit changes would be visible by
>> comparing old/new page contents.  For CBC (also not under consideration)
>> the first 16-byte block would show a change, and all later 16-byte
>> blocks would show a change.  For CBC, you see the 16-byte blocks change,
>> but you have no idea how many bits were changed, and in what locations
>> in the 16-byte block (AES uses substitution and diffusion).  For XTS,
>> because earlier blocks don't change the IV used by later blocks like
>> CBC, you would be able to see each 16-byte block that changed in the 8k
>> page.  Again, you would not know the number of bits changed or their
>> locations.
>>
>> Do we think knowing which 16-byte blocks on an 8k page change would leak
>> useful information?  If so, we should use the LSN and just accept that
>> some cases might leak as described above.  If we don't care, then we can
>> skip the use of the LSN and simplify the patch.
> 
> While there may not be an active attack against PG that leverages such a
> leak, I have a hard time seeing why we would intentionally design this
> in when we have a great option that's directly available to us and
> doesn't cause such a leak with nearly such regularity as not using the
> LSN would, and also follows recommendations of using XTS from NIST.
> Further, not using the LSN wouldn't really be an option if we did
> eventually implement AES-GCM-SIV, so why not have the two cases be
> consistent?
> 

I'm a bit confused, because the question was what happens if we encrypt 
the page twice with the same LSN or any tweak value in general. It 
certainly does not matter when it comes to malleability or replay 
attacks, because in that case the attacker is the one who modifies the 
block (and obviously won't change the LSN).
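
To illustrate why the tweak/LSN is irrelevant for malleability, here is a small Python sketch of the CTR case mentioned in the quoted text. toy_ctr is a stand-in, NOT AES-CTR: the keystream comes from SHAKE-256 over key + nonce, which is an assumption made only to keep the example self-contained. The plaintext is made up as well.

```python
import hashlib

def toy_ctr(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in for CTR mode, NOT AES: XOR the data with a keystream
    derived from (key, nonce). Encrypting and decrypting are the same."""
    keystream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))

key, nonce = b"k" * 32, b"lsn-1"
plaintext = b"balance=0000100;"
ciphertext = bytearray(toy_ctr(key, nonce, plaintext))

# The attacker flips one ciphertext bit and leaves the nonce (LSN) alone.
ciphertext[8] ^= 0x01
tampered = toy_ctr(key, nonce, bytes(ciphertext))
print(tampered)  # identical to the plaintext except for one bit in byte 8
```

The nonce never changes here, yet the modification goes through undetected. Only authentication would catch it, whichever value is used as the nonce or tweak.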


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


