Re: Key management with tests - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Key management with tests
Date
Msg-id 20210112193431.2edcz776qjen7kao@alap3.anarazel.de
Whole thread Raw
In response to Re: Key management with tests  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi,

On 2021-01-11 20:12:00 +0900, Masahiko Sawada wrote:

> diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
> index 32b5d62e1f..d474af753c 100644
> --- a/contrib/bloom/blinsert.c
> +++ b/contrib/bloom/blinsert.c
> @@ -177,6 +177,7 @@ blbuildempty(Relation index)
>       * XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record.  Therefore, we need
>       * this even when wal_level=minimal.
>       */
> +    PageEncryptInplace(metapage, INIT_FORKNUM, BLOOM_METAPAGE_BLKNO);
>      PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
>      smgrwrite(index->rd_smgr, INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
>                (char *) metapage, true);

There's quite a few places doing encryption + checksum + smgwrite now. I
strongly suggest splitting that off into a helper routine in a
preparatory patch.


> @@ -528,6 +529,8 @@ BootstrapModeMain(void)
>  
>      InitPostgres(NULL, InvalidOid, NULL, InvalidOid, NULL, false);
>  
> +    InitializeBufferEncryption();
> +
>      /* Initialize stuff for bootstrap-file processing */
>      for (i = 0; i < MAXATTR; i++)
>      {

Why are we initializing this here instead of postmaster? As far as I can
tell that just leads to redundant work instead of doing it once?


> +/*-------------------------------------------------------------------------
> + * We use both page LSN and page number to create a nonce for each page. Page
> + * LSN is 8 byte, page number is 4 byte, and the maximum required counter for
> + * AES-CTR is 2048, which fits in 3 byte. Since the length of IV is 16 byte
> + * it's fine. Using the LSN and page number as part of the nonce has
> + * three benefits:
> + *
> + * 1. We don't need to decrypt/re-encrypt during CREATE DATABASE since the page
> + * contents are the same in both places, and once one database changes its pages,
> + * it gets a new LSN, and hence a new nonce.
> + * 2. For each change of an 8k page, we get a new nonce, so we are not encrypting
> + * different data with the same nonce/IV.
> + * 3. We avoid requiring pg_upgrade to preserve database oids, tablespace oids,
> + * relfilenodes.

I think 3) also has a few minor downsides - by not including information
identifying a relation a potential attacker with access to the data
directory has more chances to get the database to decrypt data by
e.g. switching relation files around.



> @@ -2792,12 +2793,15 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
>       */
>      bufBlock = BufHdrGetBlock(buf);
>  
> +    bufToWrite = PageEncryptCopy((Page) bufBlock, buf->tag.forkNum,
> +                                 buf->tag.blockNum);
> +
>      /*
>       * Update page checksum if desired.  Since we have only shared lock on the
>       * buffer, other processes might be updating hint bits in it, so we must
>       * copy the page to private storage if we do checksumming.
>       */
> -    bufToWrite = PageSetChecksumCopy((Page) bufBlock, buf->tag.blockNum);
> +    bufToWrite = PageSetChecksumCopy((Page) bufToWrite, buf->tag.blockNum);
>  
>      if (track_io_timing)
>          INSTR_TIME_SET_CURRENT(io_start);

So now we copy the page twice, not just once, if both checksums and
encryption is enabled? That doesn't seem right.


> @@ -3677,6 +3683,21 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
>          {
>              dirtied = true;        /* Means "will be dirtied by this action" */
>  
> +            /*
> +             * We will dirty the page but the page lsn is not changed if we
> +             * doesn't write a backup block. We don't want to encrypt the
> +             * different bits stream with the same combination of nonce and key
> +             * since in buffer encryption the page lsn is a part of nonce.
> +             * Therefore we WAL-log no-op record just to move page lsn forward if
> +             * we doesn't write a backup block, even when this is not the first
> +             * modification in this checkpoint round.
> +             */
> +            if (XLogRecPtrIsInvalid(lsn) && DataEncryptionEnabled())
> +            {
> +                lsn = log_noop();
> +                Assert(!XLogRecPtrIsInvalid(lsn));
> +            }
> +

Aren't you doing a WAL record while holding the buffer header lock here?
You can't do things like WAL insertions while holding a spinlock.


I don't see how it is safe / correct to use a noop record here. A noop
record isn't associated with the page, so WAL replay isn't going to
perform the same LSN modification.

Also, why is it OK to modify the LSN without, if necessary, logging an FPI?



> +char *
> +PageEncryptCopy(Page page, ForkNumber forknum, BlockNumber blkno)
> +{
> +    static char *pageCopy = NULL;
> +
> +    /* If we don't need a checksum, just return the passed-in data */
> +    if (PageIsNew(page) || !PageNeedsToBeEncrypted(forknum))
> +        return (char *) page;

Why is it OK to not encrypt new pages?


> +#define PageEncryptOffset    offsetof(PageHeaderData, pd_special)
> +#define SizeOfPageEncryption (BLCKSZ - PageEncryptOffset)

I think you need a detailed explanation somewhere about what you're
doing here, and why it's a good idea.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Key management with tests
Next
From: Tom Lane
Date:
Subject: pg_dump INDEX ATTACH versus --clean option