Re: XTS cipher mode for cluster file encryption - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: XTS cipher mode for cluster file encryption
Date
Msg-id 20211022235105.GA26156@momjian.us
In response to Re: XTS cipher mode for cluster file encryption  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Mon, Oct 18, 2021 at 11:56:03AM -0400, Stephen Frost wrote:
> Greetings,
> 
> * Tomas Vondra (tomas.vondra@enterprisedb.com) wrote:
> > On 10/15/21 21:22, Stephen Frost wrote:
> > >Now, to address the concern around re-encrypting a block with the same
> > >key+IV but different data and leaking what parts of the page changed, I
> > >do think we should use the LSN and have it change regularly (including
> > >unlogged tables) but that's just because it's relatively easy for us to
> > >do and means an attacker wouldn't be able to tell what part of the page
> > >changed when the LSN was also changed.  That was also recommended by
> > >NIST and that's a pretty strong endorsement also.
> > 
> > Not sure - it seems a bit weird to force LSN change even in cases that don't
> > generate any WAL. I was not following the encryption thread and maybe it was
> > discussed/rejected there, but I've always imagined we'd have a global nonce
> > generator (similar to a sequence) and we'd store it at the end of each
> > block, or something like that.
> 
> The 'LSN' being referred to here isn't the regular LSN that is
> associated with the WAL but rather the separate FakeLSN counter which we
> already have.  I wasn't suggesting having the regular LSN change in
> cases that don't generate WAL.

Yes, my original patch created dummy WAL records for dummy LSNs, but that
is no longer needed with XTS.

> > I'm not very convinced that using the LSN for any of this is a good
> > idea. Something that changes most of the time but not all the time
> > seems more like it could hurt by masking fuzzy thinking more than it
> > helps anything.
> 
> This argument doesn't come across as very strong at all to me,
> particularly when we have explicit recommendations from NIST that having
> the IV vary more is beneficial.  While this would be using the LSN, the
> fact that the LSN changes most of the time but not all of the time isn't
> new and is something we already have to deal with.  I'd think we'd
> address the concern about mis-thinking around how this works by
> providing a README and/or an appropriate set of comments around what's
> being done and why.

Agreed.  I think we would need to document the cases where we re-encrypt
a page with the same LSN, and of course write-based attacks.

> > Do we think knowing which 16-byte blocks on an 8k page change would leak
> > useful information?  If so, we should use the LSN and just accept that
> > some cases might leak as described above.  If we don't care, then we can
> > skip the use of the LSN and simplify the patch.
> 
> While there may not be an active attack against PG that leverages such a
> leak, I have a hard time seeing why we would intentionally design this
> in when we have a great option that's directly available to us and
> doesn't cause such a leak with nearly such regularity as not using the
> LSN would, and also follows recommendations of using XTS from NIST.

Agreed.
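
To make the leak concrete, here is a rough sketch, not taken from any
patch.  It assumes a toy tweakable block cipher (a 4-round Feistel
network built on HMAC-SHA256) as a stand-in for AES-XTS, with the tweak
derived from the page LSN and the 16-byte block index.  All names
(encrypt_block, encrypt_page, changed_blocks) are made up for
illustration.  It only shows the property under discussion: with the
same key and LSN, an observer comparing two ciphertexts of a page sees
exactly which 16-byte blocks changed; with a new LSN, every block
changes.

```python
import hashlib
import hmac
import struct

BLOCK = 16        # cipher block size in bytes, same as AES
PAGE_SIZE = 8192  # default PostgreSQL page size


def _round(key: bytes, tweak: bytes, rnd: int, half: bytes) -> bytes:
    """Keyed round function: HMAC-SHA256 truncated to 8 bytes."""
    return hmac.new(key, tweak + bytes([rnd]) + half, hashlib.sha256).digest()[:8]


def encrypt_block(key: bytes, tweak: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation over one 16-byte block.

    This is NOT AES-XTS; it is only a deterministic, tweak-dependent
    permutation, which is all the demonstration needs.
    """
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = _round(key, tweak, rnd, right)
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def encrypt_page(key: bytes, lsn: int, page: bytes) -> bytes:
    """Encrypt each 16-byte block with a tweak of (LSN, block index)."""
    out = bytearray()
    for off in range(0, len(page), BLOCK):
        tweak = struct.pack(">QI", lsn, off // BLOCK)
        out += encrypt_block(key, tweak, page[off:off + BLOCK])
    return bytes(out)


def changed_blocks(a: bytes, b: bytes) -> list:
    """Indexes of the 16-byte blocks that differ between two ciphertexts."""
    return [off // BLOCK for off in range(0, len(a), BLOCK)
            if a[off:off + BLOCK] != b[off:off + BLOCK]]


key = b"k" * 32                 # hypothetical key, for illustration only
page_v1 = bytes(PAGE_SIZE)      # an all-zero page
page_v2 = bytearray(page_v1)
page_v2[100] = 0xFF             # modify one byte; it falls in block 100 // 16 = 6
page_v2 = bytes(page_v2)

# Same key and same LSN: only the modified block differs in the ciphertext.
same_lsn = changed_blocks(encrypt_page(key, 42, page_v1),
                          encrypt_page(key, 42, page_v2))

# Same key but a new LSN: every block of the ciphertext differs.
new_lsn = changed_blocks(encrypt_page(key, 42, page_v1),
                         encrypt_page(key, 43, page_v2))

print(same_lsn)       # [6]
print(len(new_lsn))   # 512
```

The first result is what an attacker who can read successive versions of
a page would learn when the LSN is not bumped; the second is why using
the LSN in the tweak hides it in the common case.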

> > I consider this a checkbox feature and making it too complex will cause
> > it to be rightly rejected.
> 
> Presuming that 'checkbox feature' here means "we need it to please
> $someone but no one will ever use it" or something along those lines,
> this is very clearly not the case and therefore we shouldn't be
> describing it or treating it as such.  Even if the meaning here is
> "there's other ways people could get this capability" the reality is
> that those other methods are simply not always available and in those
> cases, people will choose to not use PostgreSQL.  Nearly every other
> database system which we might compare ourselves to has a solution in
> this area and people actively use those solutions in a lot of
> deployments.

I think people will use this feature, but I called it a 'checkbox
feature' because they usually are not looking for a complex or flexible
feature, but rather something that is simple to set up and effective.

> > And if PostgreSQL is using XTS, there is no difference from dm-crypt.
> > The user can use dm-crypt directly.
> 
> dm-crypt is not always an option and it doesn't actually address the
> threat-model that Tomas brought up here anyway, as it would be below the
> level that the low-privileged OS user would be looking at.  That's not
> the only threat model to consider, but it is one which could potentially
> be addressed by either XTS or AES-GCM-SIV.  There are threat models
> which dm-crypt would address, of course, such as data-at-rest (hard
> drive theft, improper disposal of storage media, backups which don't
> have their own encryption, etc), but, again, dm-crypt isn't always an
> option that is available and so I don't agree that we should throw this
> out just because dm-crypt exists and may be useable in some cases.

I actually think a Postgres integrity-check feature would need to create
an abstraction layer on top of all writes to PGDATA and tablespaces so
the filesystem would look unencrypted to Postgres.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.



