Re: XTS cipher mode for cluster file encryption - Mailing list pgsql-hackers
From | Stephen Frost |
---|---|
Subject | Re: XTS cipher mode for cluster file encryption |
Date | |
Msg-id | 20211020122407.GW20998@tamriel.snowman.net |
In response to | Re: XTS cipher mode for cluster file encryption (Sasasu <i@sasa.su>) |
Responses | Re: XTS cipher mode for cluster file encryption |
List | pgsql-hackers |
Greetings,

* Sasasu (i@sasa.su) wrote:
> But If PG has a clear block-based IO API, TDE is much easier to understand.

PG does have a block-based IO API, it's just not exposed as hooks. In
particular, take a look at md.c, though perhaps you'd be more interested
in the higher level bufmgr.c routines. For the specific places where
encryption may hook in, looking at the DataChecksumsEnabled() call sites
may be informative (there aren't that many of them).

> security people may lack database knowledge but they can understand block
> IO.
> This will allow more people to join PG community.

We'd certainly welcome them. I don't think we're going to try to
redesign our entire IO subsystem in the hopes that they'll show up
though.

> On 2021/10/20 02:54, Stephen Frost wrote:
> > Where would you store the tag for GCM without changes in core?
>
> If can add 32bit reserve field (in CGM is 28bits) will be best.

That's the idea that's been discussed, but the approach put forward is
to do it in a manner that allows the same binaries to work with a
TDE-enabled cluster and a non-TDE cluster which means two different
formats on disk. This is still a pretty big deal and would require
logical replication or pg_dump/restore to go from unencrypted to
encrypted.

> data file size will increase 0.048% (if BLCKSZ = 8KiB), I think it is
> acceptable even for the user who does not use TDE. but need ondisk format
> change.

Breaking our ondisk format explicitly means that pg_upgrade won't work
any longer and folks won't be able to do in-place upgrades. That's a
pretty huge deal and it's something we've not done in over a decade.
I doubt that's going to fly.

> If without of modify anything in core and doing GCM, the under-layer can
> write out a key fork, fsync(2) key fork with the same strategy for main
> fork. this is crash-safe. The consistency is ensured by WAL. (means
> wal_log_hints need set to on)
> Or the underlayer can re-struct the IO request. insert one IV block per
> 2730(=BLKSZ/IV_SIZE) data blocks. this method like the _mdfd_getseg() in
> md.c which split file by 1GiB. No perception in the upper layers.
> Both of them can use cache to reduce performance downgrade.

Yes, using another fork for this is something that's been considered but
it's not without its own drawbacks, in particular having to do another
write and later fsync when a page changes.

Further, none of this is necessary for XTS, but only for GCM. This is
why it was put forward that GCM involves a lot more changes to the
system and means that we won't be able to do things like binary
replication to switch from an unencrypted to encrypted cluster. Those
are good reasons to consider an XTS implementation first and then later,
perhaps, implement GCM.

> for WAL encryption, the CybertecDB implement is correct. we can not write
> any extra data without adding a reserved field in core. because can not
> guarantee consistency. If use GCM for WAL encryption must disable HMAC
> verification.

What's the point of using GCM if we aren't going to actually verify the
tag? Also, the Cybertec patch didn't add an extra reserved field to the
page format, and it used CTR anyway..

Thanks,

Stephen
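The interleaved layout quoted above (one IV block per 2730 data blocks, hidden from the upper layers much as md.c hides segment files) comes down to a small address translation. What follows is only a rough sketch of that arithmetic in Python, not code from PostgreSQL or from any patch in this thread. The function names, the choice to put the IV block at the start of each group, and the 4-byte reserved field used for the overhead figure are all assumptions made for illustration.

```python
BLCKSZ = 8192            # default PostgreSQL block size, in bytes
DATA_PER_GROUP = 2730    # data blocks per IV block, figure taken from the quoted message


def overhead_percent(reserved_bytes, block_size=BLCKSZ):
    """Relative size increase from reserving `reserved_bytes` in every block."""
    return 100.0 * reserved_bytes / block_size


def iv_block(logical_block, data_per_group=DATA_PER_GROUP):
    """On-disk block number of the IV block covering `logical_block`.

    Assumed layout (hypothetical): each group is one IV block followed by
    `data_per_group` data blocks.
    """
    group = logical_block // data_per_group
    return group * (data_per_group + 1)


def physical_block(logical_block, data_per_group=DATA_PER_GROUP):
    """On-disk block number of a data block under the same assumed layout."""
    group = logical_block // data_per_group
    # Skip one IV block for every earlier group, plus this group's own IV block.
    return logical_block + group + 1


if __name__ == "__main__":
    # A 32-bit (4-byte) reserved field per 8 KiB block.
    print(f"{overhead_percent(4):.4f}% larger data files")
    for blk in (0, 2729, 2730):
        print(blk, "-> IV block", iv_block(blk), ", data block", physical_block(blk))
```

Under these assumptions every read or write of a data block also touches its IV block, which is the extra write and later fsync raised above as a drawback. XTS needs neither the mapping nor the reserved bytes.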