Re: "WIP: Data at rest encryption" patch and, PostgreSQL 11-beta3 - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: "WIP: Data at rest encryption" patch and, PostgreSQL 11-beta3 |
Date | |
Msg-id | 27465.1553168779@localhost |
In response to | Re: "WIP: Data at rest encryption" patch and, PostgreSQL 11-beta3 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: "WIP: Data at rest encryption" patch and, PostgreSQL 11-beta3 |
List | pgsql-hackers |
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
> > If the WAL *is* encrypted, the state at this point is that the block
> > is unreadable, because the first 4kB of the block is the first half of
> > the bits that resulted from encrypting 8kB of data that includes the
> > new record, and the second 4kB of the block is the second half of the
> > bits that resulted from encrypting 8kB of data that did not include
> > the new record, and since the new record perturbed every bit on the
> > page with probability ~50%, what that means is you now have garbage.
> > That means that not only did we lose the new record, but we also lost
> > the 3.5kB of good data which the page previously contained. That's
> > NOT ok. Some of the changes associated with those WAL records may
> > have been flushed to disk already, or there may be commits in there
> > that were acknowledged to the client, and we can't just lose them.
>
> ISTM that this is only a problem if you choose the wrong encryption
> method. One not-wrong encryption method is to use a stream cipher
> --- maybe that's not the exact right technical term, but anyway
> I'm talking about a method which notionally XOR's the cleartext
> data with a random bit stream generated from the encryption key
> (probably along with other knowable inputs such as the block number).
> In such a method, corruption of individual on-disk bytes doesn't
> prevent you from getting the correct decryption of on-disk bytes
> that aren't corrupted.

We actually use a block cipher (with a block size of 16 bytes), as opposed to
a stream cipher. It's true that a partial write is a problem, because if a
single bit of the ciphertext changes, decryption will produce 16 bytes of
garbage. However, I'm not sure whether a partial write can affect a unit as
small as 16 bytes.
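To make the difference between the two cipher families concrete, here is a minimal sketch. It assumes a toy XOR keystream and a toy 16-byte Feistel block cipher, both built from SHA-256 purely for illustration. Neither is the cipher or mode used by the patch, and all names (`stream_xor`, `block_encrypt`, `flip_bit`, and so on) are hypothetical. It flips one ciphertext bit and reports which plaintext bytes come back wrong.

```python
import hashlib

BLOCK = 16  # cipher block size in bytes, matching the figure in the message


def _prf(key: bytes, label: bytes, data: bytes, n: int) -> bytes:
    """Toy pseudo-random function built from SHA-256 (illustration only)."""
    return hashlib.sha256(key + label + data).digest()[:n]


def stream_xor(key: bytes, data: bytes) -> bytes:
    """Stream-style cipher: XOR with a keystream. Encrypts and decrypts."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = _prf(key, b"stream", i.to_bytes(8, "big"), 32)
        out += bytes(a ^ b for a, b in zip(data[i:i + 32], pad))
    return bytes(out)


def _feistel(key: bytes, block: bytes, rounds) -> bytes:
    """Four-round Feistel network over one 16-byte block (toy, not AES)."""
    left, right = block[:8], block[8:]
    for r in rounds:
        f = _prf(key, bytes([r]), right, 8)
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return right + left  # final swap: decryption is the same loop, rounds reversed


def block_encrypt(key: bytes, data: bytes) -> bytes:
    assert len(data) % BLOCK == 0
    return b"".join(_feistel(key, data[i:i + BLOCK], (0, 1, 2, 3))
                    for i in range(0, len(data), BLOCK))


def block_decrypt(key: bytes, data: bytes) -> bytes:
    return b"".join(_feistel(key, data[i:i + BLOCK], (3, 2, 1, 0))
                    for i in range(0, len(data), BLOCK))


def flip_bit(data: bytes, offset: int) -> bytes:
    """Simulate on-disk corruption of a single bit."""
    buf = bytearray(data)
    buf[offset] ^= 0x01
    return bytes(buf)


def damaged_offsets(expected: bytes, actual: bytes) -> list:
    return [i for i, (a, b) in enumerate(zip(expected, actual)) if a != b]


key = b"demo key"
plain = bytes(range(64))

stream_damage = damaged_offsets(
    plain, stream_xor(key, flip_bit(stream_xor(key, plain), 20)))
block_damage = damaged_offsets(
    plain, block_decrypt(key, flip_bit(block_encrypt(key, plain), 20)))

print(stream_damage)  # [20]: only the corrupted byte is wrong
print(block_damage)   # offsets confined to the 16-byte block holding byte 20
```

With the keystream variant the damage stays in the corrupted byte. With the block variant it spreads across the enclosing 16-byte block but does not leak into neighbouring blocks, which is the unit of damage the message refers to.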
Nevertheless, with the current version of our patch, PG should be resistant
to such a partial write anyway, because we chose to align XLOG records to 16
bytes (as long as encryption is enabled) for the following reason:

If one XLOG record ends and the following one starts in the same encryption
block, both records can get corrupted during streaming replication. The
scenario looks like this:

1) the first record is written on master (the unused part of the block
   contains zeroes),

2) the block is encrypted and its initial part (i.e. the number of bytes
   occupied by the first record in the plain text) is streamed to slave,

3) the second record is written on master,

4) the containing encryption block is encrypted again and the trailing part
   (i.e. the number of bytes occupied by the second record) is streamed,

5) decryption of the block on slave will produce garbage and thus corrupt
   both records. This is because the trailing part of the block was filled
   with zeroes when the initial part was encrypted, but it contains different
   data at decryption time.

An alternative approach to this replication problem is that walsender
decrypts the stream and walreceiver encrypts it again. While this would give
us the advantage of having master and slave encrypted with different keys, it
brings some additional complexity. For example, pg_basebackup would need to
deal with encryption.

This design decision can be changed, but there's one more thing to consider:
if the XLOG stream is decrypted, the decryption cannot be disabled unless the
XLOG records are aligned to 16 bytes (and in turn, the XLOG alignment cannot
be enabled without initdb).

--
Antonin Houska
https://www.cybertec-postgresql.com
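The five-step replication scenario described in the message can be reproduced with a small simulation. This is only a sketch under stated assumptions: the cipher is a toy 16-byte Feistel construction built from SHA-256 standing in for the real block cipher, the record contents are made up, and `encrypt_block`, `decrypt_block` and `align_up` are hypothetical names, not functions from the patch or from PostgreSQL.

```python
import hashlib

BLOCK = 16  # encryption block size in bytes


def _feistel(key: bytes, block: bytes, rounds) -> bytes:
    """Toy four-round Feistel network over one 16-byte block (not AES)."""
    left, right = block[:8], block[8:]
    for r in rounds:
        f = hashlib.sha256(key + bytes([r]) + right).digest()[:8]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return right + left


def encrypt_block(key: bytes, block: bytes) -> bytes:
    return _feistel(key, block, (0, 1, 2, 3))


def decrypt_block(key: bytes, block: bytes) -> bytes:
    return _feistel(key, block, (3, 2, 1, 0))


def align_up(length: int, boundary: int = BLOCK) -> int:
    """Round a record length up to the next multiple of the block size."""
    return (length + boundary - 1) // boundary * boundary


key = b"demo key"
rec1 = b"RECORD-ONE"  # 10 bytes, hypothetical record contents
rec2 = b"REC-02"      # 6 bytes, shares the encryption block with rec1

# Unaligned case: two records share one encryption block.
page = rec1.ljust(BLOCK, b"\0")                # step 1: rest of block is zeroes
sent = encrypt_block(key, page)[:len(rec1)]    # step 2: stream the initial part
page = rec1 + rec2                             # step 3: second record written
sent += encrypt_block(key, page)[len(rec1):]   # step 4: stream the trailing part
unaligned = decrypt_block(key, sent)           # step 5: receiver decrypts

# Aligned case: each record is padded to a 16-byte boundary, so no
# encryption block is ever encrypted twice with different contents.
stream = b"".join(
    encrypt_block(key, rec.ljust(align_up(len(rec)), b"\0"))
    for rec in (rec1, rec2))
aligned = b"".join(
    decrypt_block(key, stream[i:i + BLOCK])
    for i in range(0, len(stream), BLOCK))

print(unaligned == rec1 + rec2)                            # False: both records lost
print(aligned[:len(rec1)] == rec1,
      aligned[BLOCK:BLOCK + len(rec2)] == rec2)            # True True
```

In the unaligned case the receiver ends up with a block stitched together from two different ciphertexts, so neither record survives. Padding each record to the block size trades some space for the guarantee that every streamed block is a complete, self-consistent ciphertext.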