Re: [PATCH] buffile: ensure start offset is aligned with BLCKSZ - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: [PATCH] buffile: ensure start offset is aligned with BLCKSZ
Date
Msg-id 20253.1638180300@antos
In response to [PATCH] buffile: ensure start offset is aligned with BLCKSZ  (Sasasu <i@sasa.su>)
Responses Re: [PATCH] buffile: ensure start offset is aligned with BLCKSZ
List pgsql-hackers
Sasasu <i@sasa.su> wrote:

> Hi hackers,
>
> there has been a very long discussion about TDE, and we agreed that if
> temporary file I/O can be aligned to some fixed size, it will be easier
> to use some kind of encryption algorithm.
>
> discussion:
> https://www.postgresql.org/message-id/20211025155814.GD20998%40tamriel.snowman.net
>
> This patch adjusts file->curOffset and file->pos before the real I/O to
> ensure the start offset is aligned.

Does this patch really pass the regression tests? In BufFileRead(), I would
understand if you did

+            file->pos = offsetInBlock;
+            file->curOffset -= offsetInBlock;

rather than

+            file->pos += offsetInBlock;
+            file->curOffset -= offsetInBlock;

Anyway, BufFileDumpBuffer() does not seem to ensure that curOffset ends up at
a block boundary, not to mention BufFileSeek().
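
To make the difference between the two variants concrete, here is a minimal
sketch. It assumes the goal is to move curOffset back to the previous BLCKSZ
boundary while the logical position (curOffset + pos) stays the same.
ToyBufFile and both function names are hypothetical stand-ins, not the real
BufFile struct or anything taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/* Hypothetical stand-in for the two BufFile fields discussed here. */
typedef struct ToyBufFile
{
    int64_t     curOffset;      /* file offset at which the buffer starts */
    int         pos;            /* next read/write position in the buffer */
} ToyBufFile;

/* Logical position in the file: buffer start plus position in the buffer. */
static int64_t
toy_logical_position(const ToyBufFile *file)
{
    return file->curOffset + file->pos;
}

/*
 * Move curOffset back to the previous BLCKSZ boundary and set pos so that
 * the logical position is unchanged.  When pos is 0 on entry, this reduces
 * to "pos = offsetInBlock; curOffset -= offsetInBlock".
 */
static void
toy_align_start(ToyBufFile *file)
{
    int64_t     logical = toy_logical_position(file);
    int         offsetInBlock = (int) (logical % BLCKSZ);

    assert(file->curOffset >= 0 && file->pos >= 0);

    file->curOffset = logical - offsetInBlock;
    file->pos = offsetInBlock;

    /* Invariants: aligned start, unchanged logical position. */
    assert(file->curOffset % BLCKSZ == 0);
    assert(toy_logical_position(file) == logical);
}
```

With curOffset = 20000 and pos = 0, this yields curOffset = 16384 and
pos = 3616, so curOffset + pos is still 20000.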

When I was implementing this for our fork [1], I concluded that the encryption
code path is too specific, so I left the existing code for the unencrypted data
and added separate functions for the encrypted data.

One specific thing is that if you encrypt and write n bytes, but only need
part of them later, you need to read and decrypt exactly those n bytes anyway,
otherwise the decryption won't work. So I decided not only to keep curOffset
at a BLCKSZ boundary, but also to read / write BLCKSZ bytes at a time. This
also makes sense if the scope of the initialization vector (IV) is BLCKSZ bytes.
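
As an illustration of the block-at-a-time rule, the following sketch computes
which whole BLCKSZ blocks have to be read and decrypted to serve an arbitrary
byte range. BlockSpan and blocks_for_request() are hypothetical names used
only for this example; they are not taken from the fork.

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/* Hypothetical description of the whole blocks covering a byte range. */
typedef struct BlockSpan
{
    int64_t     firstBlock;     /* index of the first block to read */
    int64_t     nblocks;        /* number of whole blocks to read and decrypt */
    int         skip;           /* bytes to discard at the start of the
                                 * first decrypted block */
} BlockSpan;

/*
 * Map a request for len bytes at byte offset "offset" onto whole blocks.
 * Every block in the span is read and decrypted in full, even if only a
 * few of its bytes are actually wanted.
 */
static BlockSpan
blocks_for_request(int64_t offset, int64_t len)
{
    BlockSpan   span;
    int64_t     lastBlock;

    assert(offset >= 0 && len > 0);

    span.firstBlock = offset / BLCKSZ;
    lastBlock = (offset + len - 1) / BLCKSZ;
    span.nblocks = lastBlock - span.firstBlock + 1;
    span.skip = (int) (offset % BLCKSZ);

    return span;
}
```

For example, a 400-byte request at offset 8000 straddles a block boundary, so
two full blocks (16384 bytes) have to be read and decrypted to return 400
bytes.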

Another problem is that you might want to store the IV somewhere in between
the data. In short, encryption makes the buffered I/O rather different, and
the encryption-specific code should be kept separate, although the same API is
used to invoke it.
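
To show why an IV stored in between the data changes the offset arithmetic,
here is a sketch under one assumed layout: every BLCKSZ-sized block on disk
starts with a 16-byte IV, followed by the encrypted payload. The layout, the
IV size and all TOY_/toy_ names are assumptions made for this example only,
not a description of what the fork does.

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ      8192                    /* PostgreSQL's default block size */
#define TOY_IV_SIZE 16                      /* assumed IV size in bytes */
#define TOY_PAYLOAD (BLCKSZ - TOY_IV_SIZE)  /* user data bytes per block */

/*
 * Translate a logical offset (what the BufFile caller sees) into the
 * physical offset in the file, under the assumed layout
 *
 *     [IV][payload][IV][payload]...
 *
 * where each [IV][payload] pair occupies exactly BLCKSZ bytes.
 */
static int64_t
toy_physical_offset(int64_t logical)
{
    int64_t     block;
    int64_t     within;

    assert(logical >= 0);

    block = logical / TOY_PAYLOAD;      /* which block holds the byte */
    within = logical % TOY_PAYLOAD;     /* position inside that payload */

    return block * BLCKSZ + TOY_IV_SIZE + within;
}
```

Under this layout, logical and physical offsets no longer coincide: logical
offset 0 lives at physical offset 16, and the first byte of the second
payload (logical offset 8176) lives at physical offset 8208. Seek and tell
would have to apply this translation, which the unencrypted code path never
needs.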

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

[1] https://github.com/cybertec-postgresql/postgres/tree/PG_14_TDE_1_1
