Thread: [PATCH] buffile: ensure start offset is aligned with BLCKSZ

[PATCH] buffile: ensure start offset is aligned with BLCKSZ

From
Sasasu
Date:
Hi hackers,

there are a very long discuss about TDE, and we agreed on that if the
temporary file I/O can be aligned to some fixed size, it will be easier
to use some kind of encryption algorithm.

discuss:
https://www.postgresql.org/message-id/20211025155814.GD20998%40tamriel.snowman.net

This patch adjust file->curOffset and file->pos before the real IO to
ensure the start offset is aligned.
Attachment

Re: [PATCH] buffile: ensure start offset is aligned with BLCKSZ

From
Antonin Houska
Date:
Sasasu <i@sasa.su> wrote:

> Hi hackers,
>
> there are a very long discuss about TDE, and we agreed on that if the
> temporary file I/O can be aligned to some fixed size, it will be easier
> to use some kind of encryption algorithm.
>
> discuss:
> https://www.postgresql.org/message-id/20211025155814.GD20998%40tamriel.snowman.net
>
> This patch adjust file->curOffset and file->pos before the real IO to
> ensure the start offset is aligned.

Does this test really pass regression tests? In BufFileRead(), I would
understand if you did

+            file->pos = offsetInBlock;
+            file->curOffset -= offsetInBlock;

rather than

+            file->pos += offsetInBlock;
+            file->curOffset -= offsetInBlock;

Anyway, BufFileDumpBuffer() does not seem to enforce curOffset to end up at
block boundary, not to mention BufFileSeek().

When I was implementing this for our fork [1], I concluded that the encryption
code path is too specific, so I left the existing code for the unecrypted data
and added separate functions for the encrypted data.

One specific thing is that if you encrypt and write n bytes, but only need
part of it later, you need to read and decrypt exactly those n bytes anyway,
otherwise the decryption won't work. So I decided not only to keep curOffset
at BLCKSZ boundary, but also to read / write BLCKSZ bytes at a time. This also
makes sense if the scope of the initialization vector (IV) is BLCKSZ bytes.

Another problem is that you might want to store the IV somewhere in between
the data. In short, the encryption makes the buffered IO rather different and
the specific code should be kept aside, although the same API is used to
invoke it.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

[1] https://github.com/cybertec-postgresql/postgres/tree/PG_14_TDE_1_1



Re: [PATCH] buffile: ensure start offset is aligned with BLCKSZ

From
Sasasu
Date:
On 2021/11/29 18:05, Antonin Houska wrote:
> Does this test really pass regression tests? In BufFileRead(), I would
> understand if you did
>
> +            file->pos = offsetInBlock;
> +            file->curOffset -= offsetInBlock;
>
> rather than
>
> +            file->pos += offsetInBlock;
> +            file->curOffset -= offsetInBlock;

It pass all regression tests. this patch is compatible with
BufFileSeek().

to generate a correct alignment, we need to make sure
  pos_new + offset_new = pos_old + offset_old
            offset_new = offset_old - offset_old % BLCKSZ
it means
  pos_new = pos_old + offset_old % BLCKSZ
          = pos_old + "offsetInBlock"

with your code, backend will read a wrong buffile at the end of buffile
reading. for example: physical file size = 20 and pos = 10, off = 10,
read start at 20. after the '=' code: pos = 10, off = 0, read start at
10, which is wrong.

> Anyway, BufFileDumpBuffer() does not seem to enforce curOffset to end up at
> block boundary, not to mention BufFileSeek().
>
> When I was implementing this for our fork [1], I concluded that the encryption
> code path is too specific, so I left the existing code for the unecrypted data
> and added separate functions for the encrypted data.
>
> One specific thing is that if you encrypt and write n bytes, but only need
> part of it later, you need to read and decrypt exactly those n bytes anyway,
> otherwise the decryption won't work. So I decided not only to keep curOffset
> at BLCKSZ boundary, but also to read / write BLCKSZ bytes at a time. This also
> makes sense if the scope of the initialization vector (IV) is BLCKSZ bytes.
>
> Another problem is that you might want to store the IV somewhere in between
> the data. In short, the encryption makes the buffered IO rather different and
> the specific code should be kept aside, although the same API is used to
> invoke it.
>
but I want to make less change on existed code. with this path. the only
code added to critical code path is this:

diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c
index 3be08eb723..ceae85584b 100644
--- a/src/backend/storage/file/buffile.c
+++ b/src/backend/storage/file/buffile.c
@@ -512,6 +512,9 @@ BufFileDumpBuffer(BufFile *file)
         /* and the buffer is aligned with BLCKSZ */
         Assert(file->curOffset % BLCKSZ == 0);

+        /* encrypt before write */
+        TBD_ENC(file->buffer.data + wpos /* buffer */, bytestowrite /* size */, file->curOffset /* context to find IV
*/);
+
         thisfile = file->files[file->curFile];
         bytestowrite = FileWrite(thisfile,
                                  file->buffer.data + wpos,
@@ -582,6 +585,9 @@ BufFileRead(BufFile *file, void *ptr, size_t size)
             BufFileLoadBuffer(file);
             if (file->nbytes <= 0 || (file->nbytes == file->pos && file->nbytes != BLCKSZ))
                 break;            /* no more data available */
+
+            /* decrypt after read */
+            TBD_DEC(file->buffer /* buffer */, file->nbytes /* size */, file->curOffset /* context to find IV */);
         }

         nthistime = file->nbytes - file->pos;

those change will allow TDE to use any encryption algorithm (read offset
and write offset are matched) and implement on-the-fly IV generation.

Attachment