Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Date
Msg-id 20190807173856.jz5g3esknlovunxw@momjian.us
In response to Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Sehrope Sarkuni <sehrope@jackdb.com>)
Responses Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
List pgsql-hackers
On Wed, Aug  7, 2019 at 11:41:51AM -0400, Sehrope Sarkuni wrote:
> On Wed, Aug 7, 2019 at 7:19 AM Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Wed, Aug  7, 2019 at 05:13:31PM +0900, Masahiko Sawada wrote:
>     > I understood. IIUC in your approach postgres processes encrypt WAL
>     > records when inserting to the WAL buffer. So WAL data is encrypted
>     > even on the WAL buffer.
> 
> 
> I was originally thinking of not encrypting the shared WAL buffers but that may
> have issues. If the buffers are already encrypted and contiguous in shared
> memory, it's possible to write out many via a single pg_pwrite(...) call as is
> currently done in XLogWrite(...).

The shared buffers will not be encrypted --- they are encrypted only
when being written to storage.  We felt encrypting shared buffers will
be too much overhead, for little gain.  I don't know if we will encrypt
while writing to the WAL buffers or while writing the WAL buffers to
the file system.

> If they're not encrypted you'd need to do more work in that critical section.
> That'd involve allocating a commensurate amount of memory to hold the encrypted
> pages and then encrypting them all prior to the single pg_pwrite(...) call.
> Reusing one buffer is possible but it would require encrypting and writing the
> pages one by one. Both of those seem like a bad idea.

Well, right now the 8k pages are part of the WAL stream, so I don't know
that it would be any more overhead than other WAL writes.  I am hoping we
can generate the encryption bit stream in chunks earlier so we can just do
the XOR as we are writing the data to the WAL buffers.

> Better to pay the encryption cost at the time of WAL record creation and keep
> the writing process as fast and simple as possible.

Yes, I don't think we know at the time of WAL record creation what
_offset_ the records will have when they are written to WAL, so I am
thinking we need to do it later, and as I said, I am hoping we can
generate the encryption bit stream earlier.

>     > It works but I think the implementation might be complex; For example
>     > using openssl, we would use EVP functions to encrypt data by
>     > AES-256-CTR. We would need to make IV and pass it to them and these
>     > functions, however, don't manage the counter value of the nonce as far
>     > as I can tell. That is, we need to calculate the correct counter value
>     > for each encryption and pass it to EVP functions. Suppose we encrypt
>     > 20 bytes of WAL. The first 16 bytes is encrypted with nonce of
>     > (segment_number, 0) and the next 4 bytes is encrypted with nonce of
>     > (segment_number, 1). After that suppose we encrypt 12 bytes of WAL. We
>     > cannot use nonce of (segment_number, 2) but should use nonce of
>     > (segment_number, 1). Therefore we would need 4 bytes of padding, to
>     > encrypt it, and then to throw those 4 bytes away.
> 
>     Since we want to have per-byte control over encryption, for both
>     heap/index pages (skip LSN and CRC), and WAL (encrypt to the last byte),
>     I assumed we would need to generate a bit stream of a specified size and
>     do the XOR ourselves against the data.  I assume ssh does this, so we
>     would have to study the method.
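
A rough sketch of the per-byte control described above: generate a bit
stream the size of the page, XOR it against the data, and leave selected
byte ranges untouched.  Everything named here is hypothetical.  The
keystream function is a SHA-256 stand-in, not AES-CTR, and the skipped
ranges assume the LSN occupies bytes 0-7 and the checksum bytes 8-9 of the
page header.

```python
import hashlib

PAGE_SIZE = 8192
# Assumed layout: LSN in bytes 0-7, checksum in bytes 8-9 of the page header.
SKIP_RANGES = [(0, 8), (8, 10)]

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Hypothetical stand-in for an AES-CTR bit stream (SHA-256 based, NOT AES).
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_page(page: bytes, key: bytes, nonce: bytes) -> bytes:
    # XOR the whole page against the bit stream, then restore the skipped
    # ranges so they stay in the clear.  Applying it twice undoes it.
    ks = keystream(key, nonce, len(page))
    out = bytearray(a ^ b for a, b in zip(page, ks))
    for start, end in SKIP_RANGES:
        out[start:end] = page[start:end]
    return bytes(out)
```

Because the skipped bytes are copied through unchanged, the same function
serves for both encryption and decryption.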
> 
> 
> The lower level non-EVP OpenSSL functions allow specifying the offset within
> the 16-byte AES block from which the encrypt/decrypt should proceed. It's the
> "num" parameter of their encrypt/decrypt functions. For a continuous encrypted
> stream such as a WAL file, a "pread(...)" of a possibly non-16-byte aligned
> section would involve determining the 16-byte counter (byte_offset / 16) and
> the intra-block offset (byte_offset % 16). I'm not sure how one handles
> initializing the internal encrypted counter and that might be one more step
> that would need to be done. But it's definitely possible to read / write less than
> a block via those APIs (not the EVP ones).
> 
> I don't think the EVP functions have parameters for the intra-block offset but
> you can mimic it by initializing the IV/block counter and then skipping over
> the intra-block offset by either reading or writing a dummy partial block. The
> EVP read and write functions both deal with individual bytes so once you've
> seeked to your desired offset you can read or write the real individual bytes.
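
To make the offset arithmetic above concrete, here is a minimal sketch that
maps a byte offset to a 16-byte block counter (byte_offset / 16) and an
intra-block offset (byte_offset % 16), then skips the leading part of the
first keystream block.  The keystream_block function is a hypothetical
SHA-256 stand-in, not AES and not an OpenSSL call; only the arithmetic is
the point.

```python
import hashlib

BLOCK = 16  # AES block size in bytes

def keystream_block(key: bytes, counter: int) -> bytes:
    # Hypothetical stand-in for one AES-CTR keystream block (NOT real AES).
    return hashlib.sha256(key + counter.to_bytes(8, "big")).digest()[:BLOCK]

def split_offset(byte_offset: int) -> tuple[int, int]:
    # Block counter and intra-block offset for a byte position in the stream.
    return byte_offset // BLOCK, byte_offset % BLOCK

def xor_at(key: bytes, byte_offset: int, data: bytes) -> bytes:
    # Encrypt or decrypt data that starts at an arbitrary byte offset.
    counter, skip = split_offset(byte_offset)
    out = bytearray()
    pos = 0
    while pos < len(data):
        ks = keystream_block(key, counter)[skip:]  # drop bytes already used
        chunk = data[pos:pos + len(ks)]
        out.extend(a ^ b for a, b in zip(chunk, ks))
        pos += len(chunk)
        counter += 1
        skip = 0  # only the first block can start mid-block
    return bytes(out)
```

With this, writing 20 bytes at offset 0 and then 12 bytes at offset 20 gives
the same result as writing all 32 bytes at once: the second write resumes at
counter 1 with an intra-block offset of 4, which matches the 20-byte and
12-byte example quoted earlier.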

Can we generate the bit stream in 1MB chunks or something and just XOR
as needed?
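
A minimal sketch of what that could look like, assuming the bit stream
depends only on the key, the segment number, and the byte position, so a
chunk can be generated ahead of time and reused for any write that lands in
it.  KeystreamCache and keystream_chunk are hypothetical names, and the
keystream is a SHA-256 stand-in rather than AES-CTR.

```python
import hashlib

BLOCK = 16            # AES block size in bytes
CHUNK = 1024 * 1024   # generate the bit stream 1MB at a time

def keystream_chunk(key: bytes, segment: int, chunk_no: int, size: int = CHUNK) -> bytes:
    # Hypothetical stand-in (SHA-256, NOT AES-CTR) for the keystream covering
    # one chunk-aligned region of a segment; size must be a multiple of BLOCK.
    first = chunk_no * (size // BLOCK)
    prefix = key + segment.to_bytes(8, "big")
    return b"".join(
        hashlib.sha256(prefix + (first + i).to_bytes(8, "big")).digest()[:BLOCK]
        for i in range(size // BLOCK)
    )

class KeystreamCache:
    def __init__(self, key: bytes, segment: int, size: int = CHUNK):
        self.key, self.segment, self.size = key, segment, size
        self.chunk_no = None
        self.chunk = b""

    def xor(self, offset: int, data: bytes) -> bytes:
        # XOR data against the bit stream starting at a byte offset in the
        # segment, generating a new chunk only when the cached one is missed.
        out = bytearray()
        while data:
            chunk_no, within = divmod(offset, self.size)
            if chunk_no != self.chunk_no:
                self.chunk = keystream_chunk(self.key, self.segment, chunk_no, self.size)
                self.chunk_no = chunk_no
            ks = self.chunk[within:within + len(data)]
            out.extend(a ^ b for a, b in zip(data, ks))
            offset += len(ks)
            data = data[len(ks):]
        return bytes(out)
```

The XOR itself needs no block alignment, so records of any length can be
handled once their offset is known, and the expensive part (producing the
chunk) can happen before then.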

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


