Re: HOT chain validation in verify_heapam() - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: HOT chain validation in verify_heapam() |
Date | |
Msg-id | 20221114220252.wahp6lp5hid5ty5k@awork3.anarazel.de Whole thread Raw |
In response to | Re: HOT chain validation in verify_heapam() (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: HOT chain validation in verify_heapam()
Re: HOT chain validation in verify_heapam() |
List | pgsql-hackers |
Hi, On 2022-11-14 14:27:54 -0500, Robert Haas wrote: > On Wed, Nov 9, 2022 at 5:08 PM Andres Freund <andres@anarazel.de> wrote: > > I don't really understand this logic - why can't we populate the predecessor > > array, if we can construct a successor entry? > > This whole thing was my idea, so let me try to explain. I think the > naming and comments need work, but I believe the fundamental idea may > be sound. > > successor[x] = y means that when we looked at line pointer x, we saw > that it was either a redirect to line pointer y, or else it had > storage and the associated tuple's CTID pointed to line pointer y. > At this point, we do not have any idea whether y is at all sane, nor we do > we know anything about which of x and y is larger. What do you mean with "larger" here? > Furthermore, it is > possible that successor[x] = successor[x'] since the page might be corrupted > and we haven't checked otherwise. > > predecessor[y] = x means that successor[x] = y but in addition we've > checked that y is sane, and that x.xmax=y.xmin. If there are multiple > tuples for which these conditions hold, we've issued complaints about > all but one and entered the last into the predecessor array. As shown by the isolationtester test I just posted, this doesn't quite work right now. Probably fixable. I don't think we can follow non-HOT ctid chains if they're older than the xmin horizon, including all cases of xmin being frozen. There's just nothing guaranteeing that the tuples are actually "related". It seems like we should do a bit more validation within a chain of tuples. E.g. that no live tuple can follow an !DidCommit xmin? > > I'm doubtful it's a good idea to try to validate the 9.4 case. The likelihood > > of getting that right seems low and I don't see us gaining much by even trying. > > I agree with Peter. We have to try to get that case right. If we can > eventually eliminate it as a valid case by some mechanism, hooray. But > in the meantime we have to deal with it as best we can. I now think that the 9.4 specific reasoning is bogus in the first place. The patch says: * Add a line pointer offset to the predecessor array if xmax is * matching with xmin of next tuple (reaching via its t_ctid). * Prior to PostgreSQL 9.4, we actually changed the xmin to * FrozenTransactionId so we must add offset to predecessor * array(irrespective of xmax-xmin matching) if updated tuple xmin * is frozen, so that we can later do validation related to frozen * xmin. Raise corruption if we have two tuples having the same * predecessor. but it's simply not correct to iterate through xmin=FrozenTransactionId - as shown in the isolationtester test. And that's unrelated to 9.4, because we couldn't rely on the raw xmin value either, because even if they match, they could be from different epochs. Greetings, Andres Freund
pgsql-hackers by date: