Re: better page-level checksums - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: better page-level checksums |
Date | |
Msg-id | CA+Tgmobu2sUKDCiYKtgs-6XeGzaXaQR3DXgf1AB=suZpGCHnNQ@mail.gmail.com |
In response to | Re: better page-level checksums (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses |
Re: better page-level checksums
Re: better page-level checksums |
List | pgsql-hackers |
On Tue, Jun 14, 2022 at 11:08 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
> I agree with the premise of one only needing one such blob on the
> page, yet I don't think that putting it on the exact end of the page
> is the best option.
>
> PageGetSpecialPointer is much simpler when you can rely on the
> location of the special area. As special areas can be accessed N times
> each time a buffer is loaded from disk, and yet the 'storage system
> extra blob' only twice (once read, once write), I think the special
> area should have priority when handing out page space.

Hmm, but on the other hand, if you imagine a scenario in which the "storage system extra blob" is actually a nonce for TDE, you need to be able to find it before you've decrypted the rest of the page. If pd_checksum gives you the offset of that data, you need to exclude it from what gets encrypted, which means that you need to encrypt three separate non-contiguous areas of the page whose combined size is unlikely to be a multiple of the encryption algorithm's block size. That kind of sucks (and putting it at the end of the page makes it way better).

That said, I certainly agree that finding the special space needs to be fast. The question in my mind is HOW fast it needs to be, and what techniques we might be able to use to dodge the problem. For instance, suppose that, during the startup sequence, we look at the control file, figure out the size of the 'storage system extra blob', and based on that each AM figures out the byte-offset of its special space and caches that in a global variable. Then, instead of PageGetSpecialSpace(page) it does PageGetBtreeSpecialSpace(page) or whatever, where the implementation is ((char*) page) + the_afformentioned_global_variable. Is that going to be too slow?
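To make the proposal concrete, here is a minimal sketch of the cached-offset idea, not actual PostgreSQL code. It assumes the extra blob sits at the very end of the page with the AM's special space immediately before it, and that both sizes are already suitably aligned. The names init_special_offsets and btree_special_offset are hypothetical; a real version would read the blob size from the control file at startup.

```c
#include <assert.h>
#include <stddef.h>

#define BLCKSZ 8192             /* PostgreSQL's default page size */

/* Hypothetical global: byte offset of the btree special space in a page. */
static size_t btree_special_offset;

/*
 * Hypothetical startup hook. extra_blob_size stands in for the value read
 * from the control file; btree_special_size is the size of the btree AM's
 * special space. Assumed layout: [ ... | special space | extra blob ].
 */
static void
init_special_offsets(size_t extra_blob_size, size_t btree_special_size)
{
    assert(extra_blob_size + btree_special_size <= BLCKSZ);
    btree_special_offset = BLCKSZ - extra_blob_size - btree_special_size;
}

/* AM-specific accessor: one load of a global plus a pointer addition. */
static inline char *
PageGetBtreeSpecialSpace(char *page)
{
    return page + btree_special_offset;
}
```

With a 32-byte blob and a 16-byte special space, for example, the offset would come out as 8192 - 32 - 16 = 8144. The per-access cost is a load of a global and an add, where today the location can be relied on without that indirection.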
If it is, then I think this whole effort may be in more trouble than I can get it out of, because it's not just the location of the special space that is an issue here, and indeed from what I can see that's not even the most important issue. There's tons of constants that are computed based on the amount of usable space in the page, and I don't have a better idea than turning those constants into global variables that are computed once ... well, perhaps in some cases we could multiply compile hot bits of code, once per possible value of the compile-time constant, but I'm pretty sure we don't want to do that for the entire index AM.

There's going to have to be some compromise here. On the one hand you're going to have people who want to be able to do run-time conversions between page formats even at the cost of extra runtime overhead on top of what the basic feature necessarily implies. On the other hand you're going to have people who don't think any overhead at all is acceptable, even if it's purely nominal and only visible on a microbenchmark. Such arguments can easily become holy wars. I think we should take a pragmatic approach: big slowdowns are categorically unacceptable, and every effort must be made to minimize overhead, but if the only permissible amount of overhead is exactly zero, then there's no hope of ever implementing any of these kinds of features. I don't think that's actually what most people want.

--
Robert Haas
EDB: http://www.enterprisedb.com