Re: XTS cipher mode for cluster file encryption - Mailing list pgsql-hackers

From Sasasu
Subject Re: XTS cipher mode for cluster file encryption
Msg-id d85f5865-1f5a-618e-7d92-92914a0f26b1@sasa.su
In response to Re: XTS cipher mode for cluster file encryption  (Stephen Frost <sfrost@snowman.net>)
Responses Re: XTS cipher mode for cluster file encryption  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On 2021/10/20 20:24, Stephen Frost wrote:
 > PG does have a block-based IO API, it's just not exposed as hooks.  In
 > particular, take a look at md.c, though perhaps you'd be more interested
 > in the higher level bufmgr.c routines.  For the specific places where
 > encryption may hook in, looking at the DataChecksumsEnabled() call sites
 > may be informative (there aren't that many of them).

md.c is great and easy to understand, but PG does not have a unified I/O API.
There are many unexpected pread()/pwrite() calls in many corners. md.c is only
for heap tables, and bufmgr.c is only for buffered heap tables.

e.g. XLogWrite() looks like a block API, but it is a range write, equivalent
to an append.
e.g. ALTER DATABASE SET TABLESPACE, through the movedb() call, uses copy_file()
on heap tables, which is just pread()/pwrite() with an 8*BLCKSZ buffer.
e.g. all front-end tools use pread() to read heap tables. In particular,
pg_rewind writes heap tables by offset.
e.g. in contrib, pg_standby uses system("cp") to copy WAL.
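
To make the block-versus-range distinction concrete, here is a minimal sketch in Python (not PostgreSQL code). The helper name blocks_touched is hypothetical, and BLCKSZ is assumed to be the default 8192 bytes. It shows that a page-sized write maps onto exactly one block, while a range write at an arbitrary offset can straddle a block boundary, so a per-block encryption hook would not see it as a whole block.

```python
BLCKSZ = 8192  # assumed default PostgreSQL page size; it is a build-time option


def blocks_touched(offset: int, length: int, blcksz: int = BLCKSZ):
    """Return (first_block, last_block, aligned) for a write of `length` bytes at `offset`.

    `aligned` is True only when the write starts on a block boundary and
    covers a whole number of blocks, i.e. when it behaves like a block API.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // blcksz
    last = (offset + length - 1) // blcksz
    aligned = offset % blcksz == 0 and length % blcksz == 0
    return first, last, aligned


# A block-style write of one full page touches exactly one block.
print(blocks_touched(3 * BLCKSZ, BLCKSZ))  # (3, 3, True)
# A range write of 100 bytes near the end of block 0 spills into block 1.
print(blocks_touched(8190, 100))           # (0, 1, False)
```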

On 2021/10/20 20:24, Stephen Frost wrote:
 > Breaking our ondisk format explicitly means that pg_upgrade won't work
 > any longer and folks won't be able to do in-place upgrades. That's a
 > pretty huge deal and it's something we've not done in over a decade.
 > I doubt that's going to fly.

I completely agree.

On 2021/10/20 20:24, Stephen Frost wrote:
 > Yes, using another fork for this is something that's been considered but
 > it's not without its own drawbacks, in particular having to do another
 > write and later fsync when a page changes.
 >
 > Further, none of this is necessary for XTS, but only for GCM. This is
 > why it was put forward that GCM involves a lot more changes to the
 > system and means that we won't be able to do things like binary
 > replication to switch from an unencrypted to encrypted cluster. Those
 > are good reasons to consider an XTS implementation first and then later,
 > perhaps, implement GCM.

Same as Robert Haas: I wish PG would build some infrastructure first and add
more abstraction layers like md.c (maybe a block-based API with an on-disk
format version field), so that people can dive in without understanding the
things that are isolated by the abstraction layer.

On 2021/10/20 20:24, Stephen Frost wrote:
 > What's the point of using GCM if we aren't going to actually verify the
 > tag? Also, the Cybertec patch didn't add an extra reserved field to the
 > page format, and it used CTR anyway..

Oh, I was wrong. The Cybertec patch cannot use XTS, because WAL may not be
aligned to 16 bytes. WAL needs a stream cipher, so the CTR implementation is
still correct.
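
A rough sketch of why a stream cipher fits unaligned appends, in Python with only the standard library. This is not the Cybertec patch. The stdlib has no AES, so keystream_block uses truncated HMAC-SHA256 as a stand-in PRF (an assumption made purely for illustration); the names keystream_block and ctr_xor are hypothetical. The point is only the structure: the keystream depends on the byte offset, so chunks of any length can be encrypted as they are appended and give the same result as encrypting the whole stream at once.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes


def keystream_block(key: bytes, counter: int) -> bytes:
    # Stand-in PRF (assumption): a real implementation would use AES(key, counter).
    msg = counter.to_bytes(16, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:BLOCK]


def ctr_xor(key: bytes, offset: int, data: bytes) -> bytes:
    """XOR `data` with the keystream for stream bytes offset .. offset+len(data)-1.

    Encryption and decryption are the same operation. The offset does not
    need to be a multiple of BLOCK, and neither does len(data).
    """
    out = bytearray()
    pos, end = offset, offset + len(data)
    while pos < end:
        ks = keystream_block(key, pos // BLOCK)
        start = pos % BLOCK
        take = min(BLOCK - start, end - pos)
        chunk = data[pos - offset:pos - offset + take]
        out += bytes(a ^ b for a, b in zip(chunk, ks[start:start + take]))
        pos += take
    return bytes(out)


key = b"k" * 32
records = [b"record-1|", b"record-2-longer|", b"r3"]  # 9, 16 and 2 bytes

# Encrypt each record as it is appended, at whatever offset it lands on.
encrypted, offset = b"", 0
for rec in records:
    encrypted += ctr_xor(key, offset, rec)
    offset += len(rec)

# Same ciphertext as encrypting the whole stream in one call, and it round-trips.
assert encrypted == ctr_xor(key, 0, b"".join(records))
assert ctr_xor(key, 0, encrypted) == b"".join(records)
```

Note that this only holds up if a given offset is never rewritten with different content under the same key, since that would reuse keystream.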

CTR with hash(offset) as the IV is basically equivalent to XTS. If another
AES key is used to encrypt hash(offset), and the block size is 16 bytes, it is XTS.
The point is that a random IV for WAL cannot be stored without adding a
reserved field, whether GCM or CTR is used.

Because WAL is only ever appended to, using CTR for WAL is safer than
using XTS for heap tables. If you want more security for WAL encryption,
add HKDF[1].

[1]: https://en.wikipedia.org/wiki/HKDF
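
For reference, a minimal HKDF sketch (extract-then-expand as in RFC 5869, with HMAC-SHA256) using only the Python standard library. How HKDF would be wired into WAL encryption is not specified above, so the master key and the "wal" / "heap" info labels below are assumptions for illustration only. The idea shown is deriving independent subkeys from one master key, so that WAL and heap data are never encrypted under the same key.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: condense the input keying material into a pseudorandom key."""
    if not salt:
        salt = b"\x00" * HASH_LEN  # RFC 5869 default when no salt is given
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: stretch `prk` into `length` bytes bound to the `info` label."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    okm, t, counter = b"", b"", 1
    while len(okm) < length:
        t = hmac.new(prk, t + info + bytes([counter]), hashlib.sha256).digest()
        okm += t
        counter += 1
    return okm[:length]


# Hypothetical usage: one master key, separate subkeys per purpose.
master_key = bytes(range(32))            # placeholder, not a real key
prk = hkdf_extract(b"", master_key)
wal_key = hkdf_expand(prk, b"wal", 32)   # "wal" and "heap" are made-up labels
heap_key = hkdf_expand(prk, b"heap", 32)
assert wal_key != heap_key
```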
