Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Date
Msg-id CAD21AoCvoBhJVXP-CBX7v+au-825bBw25PdPLkdd6Wp8D80f2g@mail.gmail.com
In response to Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Bruce Momjian <bruce@momjian.us>)
Responses Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
List pgsql-hackers
On Tue, Aug 6, 2019 at 9:42 AM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Wed, Jul 31, 2019 at 04:58:49PM +0900, Masahiko Sawada wrote:
> > On Wed, Jul 31, 2019 at 3:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > >
> > > For WAL encryption,  before flushing WAL we encrypt whole 8k WAL page
> > > and then write only the encrypted data of the new WAL record using
> > > pg_pwrite() rather than write whole encrypted page. So each time we
> > > encrypt 8k WAL page we end up with encrypting different data with the
> > > same key+nonce but since we don't write to the disk other than space
> > > where we actually wrote WAL records it's not a problem. Is that right?
> >
> > Hmm that's incorrect. We always write an entire 8k WAL page even if we
> > write a few WAL records into a page. It's bad because we encrypt
> > different pages with the same key+IV, but we cannot change IV for each
> > WAL write as we end up with changing also
> > already-flushed-WAL-records. So we might need to change the WAL write
> > so that it writes only WAL records we actually wrote.
>
> Uh, I don't understand.  We use the LSN to write the 8k page, and we use
> a different nonce scheme for the WAL.  The LSN changes each time the
> page is modified. The 8k page in the WAL is encrypted just like the rest
> of the WAL.

My understanding of WAL encryption is that WAL records in the WAL
buffers are not encrypted. When writing to disk, we copy the contents
of the 8k WAL page to a temporary buffer, encrypt it, and then write
it out. And with the current behavior, every WAL write is done in
units of whole 8k WAL pages rather than individual WAL records.

The nonce for WAL encryption is {segment number, counter}. Suppose we
write 100 bytes of WAL at the beginning of the first 8k WAL page in WAL
segment 50. We encrypt the entire 8k WAL page with the nonce starting
from {50, 0} and write it to disk. After that, suppose we append 200
bytes of WAL to the same WAL page. We again encrypt the entire 8k WAL
page with the nonce starting from {50, 0} and write it to disk. The
two 8k WAL pages we wrote to disk have different contents but were
encrypted with the same nonce, which I think is bad.
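
To make the concern concrete, here is a minimal sketch of the scenario
above. It is not PostgreSQL code. It assumes a counter-mode style stream
cipher, and uses a SHA-256 based keystream as a stand-in for AES-CTR so
that it runs with only the Python standard library. The names
keystream() and encrypt_page() are hypothetical, and the unused tail of
the page is assumed to be zero-filled.

```python
import hashlib

PAGE_SIZE = 8192  # 8k WAL page


def keystream(key: bytes, segno: int, counter: int, length: int) -> bytes:
    """Stand-in for an AES-CTR keystream (assumption, not the real cipher).

    Each 32-byte block is derived from the key and the nonce
    {segment number, counter}, with the counter incremented per block.
    """
    out = bytearray()
    block = counter
    while len(out) < length:
        nonce = segno.to_bytes(8, "big") + block.to_bytes(8, "big")
        out += hashlib.sha256(key + nonce).digest()
        block += 1
    return bytes(out[:length])


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_page(key: bytes, segno: int, page: bytes) -> bytes:
    """Encrypt a whole page with the nonce starting from {segno, 0}."""
    return xor(page, keystream(key, segno, 0, len(page)))


key = b"\x01" * 32              # hypothetical WAL encryption key
first_record = b"A" * 100       # 100 bytes of WAL
second_record = b"B" * 200      # 200 bytes appended later

# The same 8k page in segment 50, before and after the append.
page_v1 = first_record.ljust(PAGE_SIZE, b"\0")
page_v2 = (first_record + second_record).ljust(PAGE_SIZE, b"\0")

# Both writes encrypt the whole page with the nonce starting from {50, 0}.
ct_v1 = encrypt_page(key, 50, page_v1)
ct_v2 = encrypt_page(key, 50, page_v2)

# Someone who sees both on-disk versions can cancel the keystream
# without knowing the key.
leak = xor(ct_v1, ct_v2)
print(leak == xor(page_v1, page_v2))
print(leak[100:300] == second_record)
```

In this sketch the XOR of the two ciphertexts equals the XOR of the two
plaintext pages, and because the first version was zero-filled after
byte 100, the appended 200 bytes are exposed directly.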

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


