Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers

From Sehrope Sarkuni
Subject Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Date
Msg-id CAH7T-aoF_-fkLEhaCqvJYXyEmbdq60W7QuoX2M07PksndxVj5g@mail.gmail.com
In response to Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Sat, Jul 20, 2019 at 1:30 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Forbid checksums? I don't see how that could be acceptable. We either have
> to accept the limitations of the current design (having to decrypt
> everything before checking the checksums) or change the design.
>
> I personally think we should do the latter - not just because of this
> "decrypt-then-verify" issue, but consider how much work we've done to
> allow enabling checksums on-line (it's still not there, but it's likely
> doable in PG13). How are we going to do that with encryption? ISTM we
> should design it so that we can enable encryption on-line too - maybe not
> in v1, but it should be possible. So how are we going to do that? With
> checksums it's (fairly) easy because we can "not verify" the page before
> we know all pages have a checksum, but with encryption that's not
> possible. And if the whole page is encrypted, then what?
>
> Of course, maybe we don't need such capability for the use-case we're
> trying to solve with encryption. I can imagine that someone is running a
> large system, has issues with data corruption, and decides to enable
> checksums to remedy that. Maybe there's no such scenario in the privacy
> case? But we can probably come up with scenarios where a new company
> policy forces people to enable encryption on all systems, or something
> like that.
>
> That being said, I don't know how to solve this, but it seems to me that
> any system where we can't easily decide whether a page is encrypted or not
> (because everything including the page header) is encrypted has this
> exact issue. Maybe we could keep some part of the header unencrypted
> (likely an information leak, does not solve decrypt-then-verify). Or maybe
> we need to store some additional information on each page (which breaks
> on-disk format).

How about storing the CRC of the encrypted pages? It would not leak
any additional information and serves the same purpose as a
non-encrypted one, namely I/O corruption detection. I took a look at
pg_checksums and, besides checking for empty pages, the checksum
validation path does not interpret any other fields to calculate the
checksum. I think even the offline checksum enabling path looks like
it may work out of the box. Checksums of encrypted data are not a
replacement for a MAC, but this would allow that validation to run
without any knowledge of keys.
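
To make the idea concrete, here is a minimal sketch of checksumming an encrypted page without touching any key. It is not the actual PostgreSQL checksum algorithm: the truncated CRC-32, the 8192-byte page size, the checksum field at byte offset 8, and the function names are all assumptions for illustration, and random bytes stand in for ciphertext.

```python
import os
import zlib

PAGE_SIZE = 8192       # assumed page size
CHECKSUM_OFFSET = 8    # assumed offset of the 2-byte checksum field


def page_crc16(page: bytes) -> int:
    """CRC-32 of the page with the checksum field zeroed, truncated to 16 bits."""
    buf = bytearray(page)
    buf[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2] = b"\x00\x00"
    return zlib.crc32(bytes(buf)) & 0xFFFF


def stamp_checksum(page: bytes) -> bytes:
    """Return a copy of the page with the checksum field filled in."""
    buf = bytearray(page)
    buf[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2] = page_crc16(page).to_bytes(2, "little")
    return bytes(buf)


def verify_checksum(page: bytes) -> bool:
    """Compare the stored checksum with one recomputed from the raw bytes."""
    stored = int.from_bytes(page[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2], "little")
    return stored == page_crc16(page)


# Random bytes stand in for an encrypted page; no key appears anywhere.
encrypted_page = stamp_checksum(os.urandom(PAGE_SIZE))
print(verify_checksum(encrypted_page))
```

The verifier only ever sees opaque bytes, which is the property that would let an offline tool run without key access. It detects accidental corruption only; it offers no protection against deliberate tampering, which is the job of a MAC.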

Related, I think CTR mode should be considered for pages. It has
performance advantages at 8K data sizes, but even better, allows for
arbitrary bytes of the ciphertext to be replaced. For example, after
encrypting a block you can replace the two checksum bytes with the CRC
of the ciphertext, whereas in CBC mode that would cause corruption to
cascade forward. The same could be used for leaving things like
pd_pagesize_version in plaintext at its current offset. For anything
deemed non-sensitive, having it readable without having to decrypt the
page is useful.
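
A rough sketch of the byte replacement described above, under loud assumptions: the keystream is a toy built from SHA-256 over a counter, used here only as a dependency-free stand-in for AES-CTR and not something to deploy; the offset 8 for the checksum field, the truncated CRC-32, and all function names are hypothetical.

```python
import hashlib
import zlib

PAGE_SIZE = 8192
CHECKSUM_OFFSET = 8  # assumed position of the 2-byte checksum field


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream: SHA-256(key || nonce || counter) blocks.
    A stand-in for AES-CTR so the sketch needs only the standard library."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR the data with the keystream."""
    ks = keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))


def crc16_of(page: bytes) -> int:
    """CRC-32 with the checksum field zeroed, truncated to 16 bits."""
    buf = bytearray(page)
    buf[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2] = b"\x00\x00"
    return zlib.crc32(bytes(buf)) & 0xFFFF


def seal_page(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt the page, then overwrite two ciphertext bytes with the CRC."""
    sealed = bytearray(ctr_xor(key, nonce, plaintext))
    crc = crc16_of(bytes(sealed))
    sealed[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2] = crc.to_bytes(2, "little")
    return bytes(sealed)


def checksum_ok(page: bytes) -> bool:
    stored = int.from_bytes(page[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2], "little")
    return stored == crc16_of(page)


plaintext = bytes(i % 251 for i in range(PAGE_SIZE))
# In real CTR use, the nonce must never repeat under the same key.
key, nonce = b"k" * 32, b"page-0001"

sealed = seal_page(key, nonce, plaintext)
opened = ctr_xor(key, nonce, sealed)

# Only the two overwritten bytes differ after decryption.
print(opened[:CHECKSUM_OFFSET] == plaintext[:CHECKSUM_OFFSET])
print(opened[CHECKSUM_OFFSET + 2:] == plaintext[CHECKSUM_OFFSET + 2:])
```

Because each ciphertext byte depends only on the matching keystream byte, overwriting the checksum field leaves every other byte decryptable. In CBC the same overwrite would garble the containing block and alter the next one on decryption.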

It does not have to be full bytes either. CTR mode operates as a
stream of bits, so it's possible to replace nibbles or even individual
bits. It can be something as small as one bit for an "is_encrypted" flag
or a handful of bits used to infer a derived key. For example, with
two bits you could have 00 represent unencrypted, 01/10 represent
old/new key, and 11 be reserved for future use. Something like that could
facilitate online key rotation.
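
As an illustration of the two-bit scheme, a small sketch of how a reader might pick a key from plaintext flag bits. The flag location (FLAG_OFFSET, low two bits of one byte) is an arbitrary placeholder and not a proposal for where the bits should live; the enum and function names are likewise made up.

```python
from enum import IntEnum
from typing import Optional


class KeyState(IntEnum):
    UNENCRYPTED = 0b00
    OLD_KEY = 0b01
    NEW_KEY = 0b10
    RESERVED = 0b11   # left for future use


FLAG_OFFSET = 0       # hypothetical byte holding the flag bits
FLAG_MASK = 0b11      # low two bits of that byte


def set_key_state(page: bytearray, state: KeyState) -> None:
    """Overwrite only the two flag bits, leaving the other six untouched."""
    page[FLAG_OFFSET] = (page[FLAG_OFFSET] & ~FLAG_MASK & 0xFF) | int(state)


def get_key_state(page: bytes) -> KeyState:
    return KeyState(page[FLAG_OFFSET] & FLAG_MASK)


def select_key(page: bytes, old_key: bytes, new_key: bytes) -> Optional[bytes]:
    """Return the key to decrypt with, or None if the page is plaintext."""
    state = get_key_state(page)
    if state is KeyState.UNENCRYPTED:
        return None
    if state is KeyState.OLD_KEY:
        return old_key
    if state is KeyState.NEW_KEY:
        return new_key
    raise ValueError("reserved key state")


page = bytearray(b"\xff" * 16)          # toy page contents
set_key_state(page, KeyState.NEW_KEY)
print(get_key_state(page).name)
```

During a rotation, pages would be rewritten one at a time from OLD_KEY to NEW_KEY, and readers could handle a mix of both states in the meantime.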

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
