Re: pg_amcheck contrib application - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: pg_amcheck contrib application |
Date | |
Msg-id | CA+TgmobA_x0A1jh+V=vD1mHB+pzFaR2wTzng4yoKzgr_hvNOcg@mail.gmail.com Whole thread Raw |
In response to | Re: pg_amcheck contrib application (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: pg_amcheck contrib application
|
List | pgsql-hackers |
On Wed, Mar 24, 2021 at 9:12 AM Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Mar 24, 2021 at 2:13 AM Mark Dilger > <mark.dilger@enterprisedb.com> wrote: > > The visibility rules fix is different in v11, relying on a visibility check which more closely follows the implementationof HeapTupleSatisfiesVacuumHorizon. > > Hmm. The header comment you wrote says "If a tuple might not be > visible to any running transaction, then we must not check it." But, I > don't find that statement very clear: does it mean "if there could be > even one transaction to which this tuple is not visible, we must not > check it"? Or does it mean "if the number of transactions that can see > this tuple could potentially be zero, then we must not check it"? I > don't think either of those is actually what we care about. I think > what we should be saying is "if the tuple could have been inserted by > a transaction that also added a column to the table, but which > ultimately did not commit, then the table's current TupleDesc might > differ from the one used to construct this tuple, so we must not check > it." Hit send too soon. And I was wrong, too. Wahoo. Thinking about the buildfarm failure, I realized that there's a second danger here, unrelated to the possibility of different TupleDescs, which we talked about before: if the tuple is dead, we can't safely follow any TOAST pointers, because the TOAST chunks might disappear at any time. So technically we could split the return value up into something three-way: if the inserted is known to have committed, we can check the tuple itself, because the TupleDesc has to be reasonable. And, if the tuple is known not to be dead already, and known not to be in a state where it could become dead while we're doing stuff, we can follow the TOAST pointer. I'm not sure whether it's worth trying to be that fancy or not. If we were only concerned about the mismatched-TupleDesc problem, this function could return true in a lot more cases. Once we get to the comment that says "Okay, the inserter committed..." we could just return true. Similarly, the HEAP_MOVED_IN and HEAP_MOVED_OFF cases could just skip all the interior test and return true, because if the tuple is being moved, the original inserter has to have committed. Conversely, however, the !HeapTupleHeaderXminCommitted -> TransactionIdIsCurrentTransactionId case probably ought to always return false. One could argue otherwise: if we're the inserter, then the only in-progress transaction that might have changed the TupleDesc is us, so we could just consider this case to be a true return value also, regardless of what's going on with xmax. In that case, we're not asking "did the inserter definitely commit?" but "are the inserter's possible DDL changes definitely visible to us?" which might be an OK definition too. However, the could-the-TOAST-data-disappear problem is another story. I don't see how we can answer that question correctly with the logic you've got here, because you have no XID threshold. Consider the case where we reach this code: + if (!(tuphdr->t_infomask & HEAP_XMAX_COMMITTED)) + { + if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmax(tuphdr))) + return true; /* HEAPTUPLE_DELETE_IN_PROGRESS */ + else if (!TransactionIdDidCommit(HeapTupleHeaderGetRawXmax(tuphdr))) + + /* + * Not in Progress, Not Committed, so either Aborted or crashed + */ + return true; /* HEAPTUPLE_LIVE */ + + /* At this point the xmax is known committed */ + } If we reach the case where the code comment says HEAPTUPLE_DELETE_IN_PROGRESS, we know that the tuple isn't dead right now, and so the TOAST tuples aren't dead either. But, by the time we go try to look at the TOAST tuples, they might have become dead and been pruned away, because the deleting transaction can commit at any time, and after that pruning can happen at any time. Our only guarantee that that won't happen is if the deleting XID is new enough that it's invisible to some snapshot that our backend has registered. That's approximately why HeapTupleSatisfiesVacuumHorizon needs to set *dead_after in this case and one other, and I think you have the same requirement. I just noticed that this whole thing has another, related problem: check_tuple_header_and_visibilty() and check_tuple_attribute() are called from within check_tuple(), which is called while we hold a buffer lock on the heap page. We should not be going and doing complex operations that might take their own buffer locks - like TOAST index checks - while we're holding an lwlock. That's going to have to be changed so that the TOAST pointer checking happens after UnlockReleaseBuffer(); in other words, we'll need to remember the TOAST pointers to go look up and actually look them up after UnlockReleaseBuffer(). But, when we do that, then the HEAPTUPLE_LIVE case above has the same race condition that is already present in the HEAPTUPLE_DELETE_IN_PROGRESS case: after we release the buffer pin, some other transaction might delete the tuple and the associated TOAST tuples, and they might then commit, and the tuple might become dead and get pruned away before we check the TOAST table. On a related note, I notice that your latest patch removes all the logic that complains about XIDs being out of bounds. I don't think that's good. Those seem like important checks. They're important for finding problems with the relation, and I think we probably also need them because of the XID-horizon issue mentioned above. One possible way of looking at it is to say that the XID_BOUNDS_OK case has two sub-cases: either the XID is within bounds and is one that cannot become all-visible concurrently because it's not visible to all of our backend's registered snapshots, or it's within bounds but does have the possibility of becoming all-visible. In the former case, if it appears as XMAX we can safely follow TOAST pointers found within the tuple; in the latter case, we can't. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: