Re: HOT chain validation in verify_heapam() - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: HOT chain validation in verify_heapam() |
Date | |
Msg-id | CA+TgmoYuOHg8ijjYZk7VtvxyYr0Sp0ZeJRguoDOsFhhWnxuHdA@mail.gmail.com Whole thread Raw |
In response to | Re: HOT chain validation in verify_heapam() (Aleksander Alekseev <aleksander@timescale.com>) |
Responses |
Re: HOT chain validation in verify_heapam()
|
List | pgsql-hackers |
On Thu, Jan 19, 2023 at 8:55 AM Aleksander Alekseev <aleksander@timescale.com> wrote: > I noticed that this patch stuck a little and decided to take another look. > > It seems to be well written, covered with tests and my understanding > is that all the previous feedback was accounted for. To your knowledge > is there anything that prevents us from moving it to "Ready for > Committer"? Thanks for taking a look, and for pinging the thread. I think that the handling of lp_valid[] in the loop that begins with "Loop over offset and populate predecessor array from all entries that are present in successor array" is very confusing. I think that lp_valid[] should be answering the question "is the line pointer basically sane?". That is, if it's a redirect, it needs to point to something within the line pointer array (and we also check that it must be an entry in the line pointer array that is used, which seems fine). If it's not a redirect, it needs to point to space that's entirely within the block, properly aligned, and big enough to contain a tuple. We determine the answers to all of these questions in the first loop, the one that starts with /* Perform tuple checks */. Nothing that happens in the second loop, where we populate the predecessor array, can reverse our previous conclusion that the line pointer is valid, so this loop shouldn't be resetting entries in lp_valid[] to false. The reason that it's doing so seems to be that it wants to use lp_valid[] to control the behavior of the third loop, where we perform checks against things that have entries in the predecessor array. As written, the code ensures that we always set lp_valid[nextoffnum] to false unless we set predecessor[nextoffnum] to a value other than InvalidOffsetNumber. But that is needlessly complex: the third loop doesn't need to look at lp_valid[] at all. It can just check whether predecessor[currentoffnum] is valid. If it is, perform checks. Otherwise, skip it. It seems to me that this would be significantly simpler. To put the above complaint another way, a variable shouldn't mean two different things depending on where you are in the function. Right now, at the end of the first loop, lp_valid[x] answers the question "is line pointer x basically valid?". But by the end of the second loop, it answers the question "is line pointer x valid and does it also have a valid predecessor?". That kind of definitional change is something to be avoided. The test if (pred_in_progress || TransactionIdDidCommit(curr_xmin)) seems wrong to me. Shouldn't it be &&? Has this code been tested at all? It doesn't seem to have a test case. Some of these other errors don't, either. Maybe there's some that we can't easily test in an automated way, but we should test what we can. I guess maybe casual testing wouldn't reveal the problem here because of the recheck, but it's worrying to find logic that doesn't look right with no corresponding comments or test cases. Some error message kibitizing: psprintf("redirected tuple at line pointer offset %u is not heap only tuple", It seems to me that this should say "redirected line pointer pointing to a non-heap-only tuple at offset %u". There is no such thing as a redirected tuple -- and even if there were, what we have here is clearly a redirected line pointer. psprintf("redirected tuple at line pointer offset %u is not heap only tuple", And I think for the same reasons this one should say something like "redirected line pointer pointing to a non-heap-only tuple at offset %u". psprintf("redirected tuple at line pointer offset %u is not heap updated tuple", Possibly all of these would sound better with "points" rather than "pointing" -- if so, we'd need to change an existing message in the same way. And this one should say something like "redirected line pointer pointing to a tuple not produced by an update at offset %u". psprintf("tuple is root of chain but it is marked as heap-only tuple")); I think this would sound better if you deleted the word "it". I don't know whether it's worth arguing about -- it feels like we've argued too much already about this sort of thing -- but I am not very convinced by initializers like OffsetNumber predecessor[MaxOffsetNumber] = {InvalidOffsetNumber}. That style is only correct because InvalidOffsetNumber happens to be zero. If it were up to me, I'd use memset to clear the predecessor array. I would not bulk initialize sucessor and lp_valid but make sure that the first loop always sets them, possibly by having the top of the loop set them to InvalidOffsetNumber and false initially and then letting code later in the loop change the value, or possibly in some other way. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: