Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS) - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Date
Msg-id CAD21AoA2WrgqDVxZxykihx7ovPfLabgeS2YyPTmsK30NqySnkg@mail.gmail.com
Whole thread Raw
In response to Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Bruce Momjian <bruce@momjian.us>)
Responses Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Masahiko Sawada <sawada.mshk@gmail.com>)
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Mon, Jul 29, 2019 at 11:33 AM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Thu, Jul 25, 2019 at 01:03:06PM -0400, Bruce Momjian wrote:
> > On Tue, Jul 16, 2019 at 01:24:54PM +0900, Masahiko Sawada wrote:
> > > On Sat, Jul 13, 2019 at 12:33 AM Bruce Momjian <bruce@momjian.us> wrote:
> > > > then each row change gets its own LSN.  You are asking if an update that
> > > > just expires one row and adds it to a new page gets the same LSN.  I
> > > > don't know.
> > >
> > > The following scripts can reproduce that different two pages have the same LSN.
> > >
> > > =# create table test (a int);
> > > CREATE TABLE
> > > =# insert into test select generate_series(1, 226);
> > > INSERT 0 226
> > > =# update test set a = a where a = 1;
> > > UPDATE 1
> > > =# select lsn from page_header(get_raw_page('test', 0));
> > >     lsn
> > > -----------
> > >  0/1690488
> > > (1 row)
> > >
> > > =# select lsn from page_header(get_raw_page('test', 1));
> > >     lsn
> > > -----------
> > >  0/1690488
> > > (1 row)
> > >
> > > So I think it's better to use LSN and page number to create IV. If we
> > > modify different tables by single WAL we also would need OID or
> > > relfilenode but I don't think currently we have such operations.
> >
> > OK, good to know, thanks.
>
> I did some more research on which cases use a single LSN to modify
> multiple 8k pages.  The normal program flow is:
>
>         XLogBeginInsert();
>         ...
>         XLogRegisterBuffer(0, meta, ...
> -->     recptr = XLogInsert(RM_BRIN_ID, XLOG_...
>
>         page = BufferGetPage(meta);
>         PageSetLSN(page, recptr);
>
> XLogInsert() calls BufferGetTag(), which fills in the buffer's
> RelFileNode (which internally is the tablespace, database, and
> pg_class.relfilenode).  So, to use the LSN and page-number for the IV,
> we need to make sure that there is no other encryption of those values
> in a different relation.  What I did was to find cases where
> XLogRegisterBuffer/PageSetLSN are called more than once for a single
> LSN.  I found cases in:
>
>         brin_doupdate
>         brin_doinsert
>         brinRevmapDesummarizeRange
>         revmap_physical_extend
>         GenericXLogFinish
>         ginPlaceToPage
>         shiftList
>         ginDeletePage
>         gistXLogSplit
>         gistXLogPageDelete
>         gistXLogUpdate
>         hashbucketcleanup
>         _hash_doinsert
>         _hash_vacuum_one_page
>         _hash_addovflpage
>         _hash_freeovflpage
>         _hash_squeezebucket
>         _hash_init
>         _hash_expandtable
>         _hash_splitbucket
>         log_heap_visible
>         log_heap_update
>         _bt_insertonpg
>         _bt_split
>         _bt_newroot
>         _bt_getroot
>         _bt_mark_page_halfdead
>         _bt_unlink_halfdead_page
>         addLeafTuple
>         moveLeafs
>         doPickSplit
>         spgSplitNodeAction
>         log_newpage_range
>
> Most of these are either updating the different pages in the same
> relation (so the page-number for the IV would be different), or are
> modifying other types of files, like vm.  (We have not discussed if we
> are going to encrypt vm or fsm.  I am guessing we are not.)
>
> You might say, well, is it terrible if we reuse the LSN in a different
> relation with the same page number?  Yes.  The way CTR works, it
> generates a stream of bits using the key and IV (which will be LSN and
> page number).  It then XORs it with the page contents to encrypt it.  If
> we encrypt an all-zero gap in a page, or a place in the page where the
> page format is known, a user can XOR that with the encrypted data and
> get the bit stream at that point.  They can then go to another page that
> uses the same key and IV and XOR that to get the decrypted data.  CBC
> mode is slightly better because it mixes the user data into the future
> 16-byte blocks, but lots of our early-byte page format is known, so it
> isn't great.
>
> You might say, wow, that is a lot of places to make sure we don't reuse
> the LSN in a different relation with the same page number --- let's mix
> the relfilenode in the IV so we are sure the IV is not reused.
>
> Well, the pg_class.relfilenode is only unique within the
> tablespace/database, i.e., from src/include/storage/relfilenode.h:
>
>  * relNode identifies the specific relation.  relNode corresponds to
>  * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
>  * to assign new physical files to relations in some situations).
>  * Notice that relNode is only unique within a database in a particular
>  * tablespace.
>
> So, we would need to mix the tablespace, database oid, and relfilenode
> into the IV to be unique.  We would then need to have pg_upgrade
> preserve the relfilenode, change CREATE DATABASE to decrypt/encrypt when
> creating a new database, and no longer allow files to be moved between
> tablespaces without decryption/encryption.
>
> There are just a whole host of complexities we add to encryption if we
> add the requirement of preserving the refilenode, tablespace, and
> database to decrypt each page.  I just don't think we want go there
> unless we have a valid reason.
>
> I am thinking of writing some Assert() code that checks that all buffers
> using a single LSN are from the same relation (and therefore different
> page numbers).  I would do it by creating a static array, clearing it on
> XLogBeginInsert(), adding to it for each  XLogInsert(), then checking on
> PageSetLSN() that everything in the array is from the same file.  Does
> that make sense?

I had the same concern before. We could have BKPBLOCK_SAME_REL flag in
XLogRecordBlockHeader, which indicates that the relation of the block
is the same as the previous block and therefore we skip to write
RelFileNode. At first glance I thought it's possible that one WAL
record can contain different RelFileNodes but I didn't find any code
attempting to do that.

Checking that all buffers using a single LSN are from the same
relation would be a good idea but I think it's hard to test it and
regard the test result as okay. Even if we passed 'make checkworld',
it might still be possible to happen. And even assertion failures
don't happen in production environment. So I guess it would be better
to have IV so that we never reuse in different relation with the same
page. An idea I came up with is that we make  IV from (PageLSN,
PageNumber, relNode) and have the encryption keys per tablespace.
That way, we never reuse IV in a different relation with the same page
number because relNode is unique within a database in a particular
tablespace as you mentioned.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Sehrope Sarkuni
Date:
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Next
From: Masahiko Sawada
Date:
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)