Re: better page-level checksums - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: better page-level checksums |
Date | |
Msg-id | CA+TgmoZ+DcjmvCm6_Su0ECYui5ZdJz0wHFp2QMJ9ORjE3vkxsQ@mail.gmail.com |
In response to | Re: better page-level checksums (Stephen Frost <sfrost@snowman.net>) |
Responses | Re: better page-level checksums |
List | pgsql-hackers |
On Fri, Jun 10, 2022 at 12:08 PM Stephen Frost <sfrost@snowman.net> wrote:
> So, it's not quite as simple as use X or use Y, we need to be
> considering the use case too. In particular, the amount of data that's
> being hash'd is relevant when it comes to making a decision about what
> hash or checksum to use. When you're talking about (potentially) 1G
> segment files, you'll want to use something different (like SHA) vs.
> when you're talking about an 8K block (not that you couldn't use SHA,
> but it may very well be overkill for it).

Interesting. I expected you to be cheerleading for SHA like a madman.

> In terms of TDE, that's yet a different use-case and you'd want to use
> AE (authenticated encryption) + AAD (additional authenticated data) and
> the result of that operation is a block which has some amount of
> unencrypted data (eg: LSN, potentially used as the IV), some amount of
> encrypted data (eg: everything else), and then space to store the tag
> (which can be thought of, but is *distinct* from, a hash of the
> encrypted data + the additional unencrypted data, where the latter would
> include the unencrypted data on the block, like the LSN, plus other
> information that we want to include like the qualified path+filename of
> the file as relevant to the PGDATA root). If our goal is
> cryptographically authenticated and encrypted data pages (which I believe
> is at least one of our goals) then we're talking about encryption
> methods like AES GCM which handle production of the tag for us and with
> that tag we would *not* need to have any independent hash or checksum for
> the block (though we could, but that should really be included in the
> *encrypted* section, as hashing unencrypted data and then storing that
> hash unencrypted could potentially leak information that we'd rather
> not).

Yeah, and I feel like there was discussion of how much space AES-GCM-SIV
would need per page, but I can't find that discussion now. Pretty sure
it was a fairly meaty number of bytes, and I assume it's also not that
cheap to compute. (There's a rough sketch of that kind of scheme below.)

> Note that NIST has put out information regarding how big a tag is
> appropriate for how much data is being encrypted with a given
> authenticated encryption method such as AES GCM. I recall Robert
> finding similar information for hashing/checksumming of unencrypted
> data from a similar source and that'd make sense to consider when
> talking about *just* adding a hash/checksum for unencrypted data blocks.
>
> This is the relevant discussion from NIST on this subject:
>
> https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-38d.pdf
>
> Note particularly Appendix C: Requirements and Guidelines for Using
> Short Tags (though, really, the whole thing is good to read..).

I don't see that as very relevant. That's talking about using 32-bit or
64-bit tags for things like VOIP packets, where a single compromised
packet wouldn't reveal a whole lot. I think we have to take the view
that a single compromised disk block is a big deal.

> In the thread about checksum/hashes for the backup manifest, I was
> pretty sure you found some information regarding the amount of data
> being hashed vs. the size you want the hash/checksum to be and that
> seems like it'd be particularly relevant for this discussion (as it was
> for backups, at least as I recall..).

Hopefully we can go find that.
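To make the AE + AAD idea above a little more concrete, here is a minimal
sketch of encrypting one 8kB page with AES-256-GCM through OpenSSL's EVP
interface. The function name, the IV construction from the LSN, and the
use of a PGDATA-relative path plus block number as AAD are illustrative
assumptions, not a worked-out design and not anything PostgreSQL does
today:

```c
/*
 * Hypothetical illustration only -- not PostgreSQL code.  Encrypt one 8kB
 * page with AES-256-GCM, authenticating (but not encrypting) the page's
 * identity via AAD, and producing a 16-byte tag that would be stored on
 * the page in place of a plain checksum.  Link with -lcrypto.
 */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <openssl/evp.h>

#define PAGE_SIZE   8192
#define GCM_IV_LEN  12      /* 96-bit IV, the size GCM is designed for */
#define GCM_TAG_LEN 16      /* 128-bit authentication tag */

/*
 * 'aad' would be something like the file's path relative to PGDATA plus
 * the block number; 'lsn' stands in for the page LSN used to build the IV.
 * Returns true on success.
 */
bool
encrypt_page_gcm(const unsigned char key[32], uint64_t lsn,
                 const unsigned char *aad, int aad_len,
                 const unsigned char *plaintext, unsigned char *ciphertext,
                 unsigned char tag[GCM_TAG_LEN])
{
    EVP_CIPHER_CTX *ctx;
    unsigned char   iv[GCM_IV_LEN] = {0};
    int             len;
    bool            ok = false;

    /* Toy IV: the LSN in the last 8 bytes.  A real design must ensure a
     * (key, IV) pair is never reused for different page contents. */
    memcpy(iv + GCM_IV_LEN - sizeof(lsn), &lsn, sizeof(lsn));

    if ((ctx = EVP_CIPHER_CTX_new()) == NULL)
        return false;

    if (EVP_EncryptInit_ex(ctx, EVP_aes_256_gcm(), NULL, NULL, NULL) == 1 &&
        EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_SET_IVLEN, GCM_IV_LEN, NULL) == 1 &&
        EVP_EncryptInit_ex(ctx, NULL, NULL, key, iv) == 1 &&
        /* AAD: covered by the tag but stored unencrypted on disk */
        EVP_EncryptUpdate(ctx, NULL, &len, aad, aad_len) == 1 &&
        /* Encrypt the page image; GCM leaves the length unchanged */
        EVP_EncryptUpdate(ctx, ciphertext, &len, plaintext, PAGE_SIZE) == 1 &&
        EVP_EncryptFinal_ex(ctx, ciphertext + len, &len) == 1 &&
        /* The tag is what detects tampering or corruption on decrypt */
        EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_GET_TAG, GCM_TAG_LEN, tag) == 1)
        ok = true;

    EVP_CIPHER_CTX_free(ctx);
    return ok;
}
```

Decryption would supply the same IV and AAD; if the ciphertext, the AAD
(i.e. the page's claimed identity), or the tag doesn't match,
EVP_DecryptFinal_ex fails, which is what lets the tag stand in for a
separate page checksum.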
I went back and looked and found that I had written this:

https://www.postgresql.org/message-id/CA+TgmoYOKC_8o-AR1jTQs0mOrFx=_Rcy5udor1m-LjyJNiSWPQ@mail.gmail.com

I think that gets us a whole lot of nowhere, honestly. I think this
email from Andres is more on point:

http://postgr.es/m/20200327195624.xthhd4xuwabvd3ou@alap3.anarazel.de

I don't really understand all the details of the smhasher pages to
which he links, but I understand that they're measuring the quality of
the bit-mixing, which does matter. So does speed, because people like
their database to be fast even if it's using checksums (or TDE, if we
had that). And I think the output size is also a relevant
consideration, because more output bits means both more chance of
detecting errors (assuming good bit-mixing, at least) and also more
wasted space that isn't being used to store your actual data.

I haven't really seen anything that makes me believe there's a
particularly strong relationship between block size and ideal checksum
size. There's some relationship, for sure. For instance, you want the
checksum to be small relative to the size of the input, so as not to
waste a whole bunch of storage space. I wouldn't propose to hash
32-byte blocks with SHA-256, because my checksums would be as big as
the original data. But I don't really think such considerations are
relevant here. An 8kB block is big enough that any checksum algorithm
in common use today is going to produce output that is well under 1%
of the page size, so you're not going to be wasting tons of storage.

You might be wasting your time, though. One big knock on the
Fletcher-16 approach we're using today is that the chance of an
accidental hash collision is noticeably more than 0. Generally, I
think we're right to think that's acceptable, because your chances of
noticing even a single corrupted block are very high. However, if
you're operating tens or hundreds of thousands of PostgreSQL clusters
containing terabytes or petabytes of data, it's quite likely that
there will be instances of corruption which you fail to detect because
the checksum collided. Maybe you care about that. If you do, you
probably need at least a 64-bit checksum before the risk of missing
even a single instance of corruption due to a checksum collision
becomes negligible. Maybe even slightly larger if the amount of data
you're managing is truly vast.

So I think there's probably a good argument that if you're just
concerned about detecting corruption due to bugs, operator error,
hardware failure, etc. - that is, if the only purpose is to detect
random bit flips - then something like a 512-bit checksum is overkill.
I think the time when you need more bits is when you have some goal
beyond being really likely to detect a random error - e.g. if you want
100% guaranteed detection of every single-bit error, or if you want
error correction, or if you want to foil an adversary who is trying to
construct checksums for maliciously modified blocks. But that is also
true for the backup manifest case, and we allowed SHA-n as an option
there. I feel like there are bound to be some people who want
something like SHA-n just because it sounds good, regardless of
whether they really need it. We can tell them "no," though.

--
Robert Haas
EDB: http://www.enterprisedb.com
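For concreteness, here is the back-of-the-envelope arithmetic behind the
collision point above, as a small standalone C program. The fleet size
and corruption rate are invented purely for illustration, and it assumes
a corrupted block checksums to an effectively random value:

```c
/*
 * Illustrative arithmetic only: probability that at least one corrupted
 * block slips past the checksum, for 16-bit vs. 64-bit checksums.
 * Build with -lm.
 */
#include <math.h>
#include <stdio.h>

int
main(void)
{
    /* Invented fleet: 100,000 clusters of 1TB each, with one 8kB block
     * in a million corrupted by bugs, operator error, hardware, etc. */
    double blocks_per_cluster = 1e12 / 8192.0;
    double corrupted = 100000.0 * blocks_per_cluster * 1e-6;   /* ~12 million */

    /* P(miss one corrupted block) = 2^-bits, so
     * P(miss at least one of n) = 1 - (1 - 2^-bits)^n,
     * computed here in a numerically stable way. */
    double miss16 = -expm1(corrupted * log1p(-pow(2.0, -16)));
    double miss64 = -expm1(corrupted * log1p(-pow(2.0, -64)));

    printf("corrupted blocks fleet-wide: %.0f\n", corrupted);
    printf("P(at least one undetected), 16-bit checksum: %f\n", miss16);
    printf("P(at least one undetected), 64-bit checksum: %e\n", miss64);
    return 0;
}
```

Under those made-up inputs, a 16-bit checksum misses at least one
corrupted block with probability essentially 1, while a 64-bit checksum's
miss probability comes out below one in a trillion, which is the sense in
which 64 bits makes the risk negligible.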