On Tue, Jul 09, 2019 at 10:06:33PM -0400, Stephen Frost wrote:
>Greetings,
>
>* Ryan Lambert (ryan@rustprooflabs.com) wrote:
>> > What I think Tomas is getting at here is that we don't write a page only
>> > once.
>>

Yes, that's what I meant.

>> > A nonce of tableoid+pagenum will only be unique the first time we write
>> > out that page. Seems unlikely that we're only going to be writing these
>> > pages once though- what we need is a nonce that's unique for *every
>> > write* of the 8k page, isn't it? As every write of the page is going to
>> > be encrypting something new.
>>
>> > With sufficient randomness, we can at least be more likely to have a
>> > unique nonce for each 8K write. Including the LSN seems like it'd be a
>> > possible alternative.
>>
>> Agreed. I know little of the inner details about the LSN but what I read
>> in [1] sounds encouraging in addition to tableoid + pagenum.
>>
>> [1] https://www.postgresql.org/docs/current/datatype-pg-lsn.html
>
>Yes, but it's still something that we'd have to store somewhere- the
>actual LSN of the page is going to be in the 8K block.
>
>Unless we decide that we can pull the LSN *out* of the 8K block and
>store it unencrypted, and then store the *rest* of the block
>encrypted... That might also allow things like backup software to work
>on these encrypted data files for page-level backups without needing
>access to the key and that'd be pretty neat.
>
>Of course, as with anything, the more data you expose, the higher the
>overall risk that someone can figure out some meaning from it. Still,
>if the idea was that we'd use the LSN in this way, then it'd need to be
>stored unencrypted regardless...
>
Elsewhere in this thread I've already proposed to leave a bit of space at
the end of a page unencrypted, with page-level encryption metadata. That
might be the nonce (no matter how we end up computing it), key ID used to
encrypt this page, etc.
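
To illustrate the layout I have in mind, here is a rough sketch in Python
(not actual PostgreSQL code). The 16-byte nonce, the 32-bit key ID, and the
names build_page/split_page are all assumptions made for the example; only
the 8kB page size is a PostgreSQL default. The body is whatever the cipher
produced, and the trailer stays in plaintext so that it can be read without
the key.

```python
import struct

PAGE_SIZE = 8192                 # PostgreSQL's default block size
NONCE_LEN = 16                   # assumed nonce size, not decided in this thread
TRAILER_FMT = ">16sI"            # assumed trailer: 16-byte nonce + 32-bit key ID
TRAILER_LEN = struct.calcsize(TRAILER_FMT)   # 20 bytes, no padding with ">"
BODY_LEN = PAGE_SIZE - TRAILER_LEN           # bytes left for encrypted data


def build_page(encrypted_body: bytes, nonce: bytes, key_id: int) -> bytes:
    """Append the plaintext metadata trailer to an already-encrypted body."""
    if len(encrypted_body) != BODY_LEN:
        raise ValueError("encrypted body must be exactly BODY_LEN bytes")
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be exactly NONCE_LEN bytes")
    return encrypted_body + struct.pack(TRAILER_FMT, nonce, key_id)


def split_page(page: bytes):
    """Return (encrypted_body, nonce, key_id) without needing the key."""
    if len(page) != PAGE_SIZE:
        raise ValueError("page must be exactly PAGE_SIZE bytes")
    nonce, key_id = struct.unpack(TRAILER_FMT, page[BODY_LEN:])
    return page[:BODY_LEN], nonce, key_id
```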

I don't think we need to put the whole LSN into the nonce in plaintext.
What I was imagining was instead using something like

    sha2(LSN, oid, blockno, random())

or something like that.
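
A minimal sketch of that derivation in Python, just to make the idea
concrete. The choice of SHA-256, the 16-byte truncation, and the name
page_nonce are assumptions for the example; the field widths follow
PostgreSQL's 64-bit LSN and 32-bit OID and block number. Note that because
of the random component the nonce cannot be recomputed later, so it has to
be stored with the page (for example in the unencrypted space mentioned
above).

```python
import hashlib
import os
import struct
from typing import Optional

NONCE_LEN = 16  # assumed nonce size in bytes, not decided in this thread


def page_nonce(lsn: int, rel_oid: int, blockno: int,
               salt: Optional[bytes] = None) -> bytes:
    """Derive a nonce for one write of one page.

    The salt plays the role of random(); pass it explicitly only for
    testing, otherwise 16 fresh random bytes are drawn for every call.
    """
    if salt is None:
        salt = os.urandom(16)
    # Fixed-width big-endian encoding: 64-bit LSN, 32-bit OID, 32-bit block
    # number, so different field combinations cannot collide by concatenation.
    packed = struct.pack(">QII", lsn, rel_oid, blockno)
    return hashlib.sha256(packed + salt).digest()[:NONCE_LEN]
```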

Of course, having the LSN (and other stuff like page checksum) unencrypted
would be pretty useful - as you note.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services