Re: HOT chain validation in verify_heapam() - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: HOT chain validation in verify_heapam() |
Date | |
Msg-id | CA+TgmoYD-mx2=2KhW8815p1rU=2Ftndn2O4UEk87JbPN7qrwtQ@mail.gmail.com Whole thread Raw |
In response to | Re: HOT chain validation in verify_heapam() (Himanshu Upadhyaya <upadhyaya.himanshu@gmail.com>) |
Responses |
Re: HOT chain validation in verify_heapam()
Re: HOT chain validation in verify_heapam() |
List | pgsql-hackers |
On Tue, Sep 20, 2022 at 5:00 AM Himanshu Upadhyaya <upadhyaya.himanshu@gmail.com> wrote: > Please find it attached. This patch still has no test cases. Just as we have test cases for the existing corruption checks, we should have test cases for these new corruption checks, showing cases where they actually fire. I think I would be inclined to set lp_valid[x] = true in both the redirected and non-redirected case, and then have the very first thing that the second loop does be if (nextoffnum == 0 || !lp_valid[ctx.offnum]) continue. I think that would be more clear about the intent to ignore line pointers that failed validation. Also, if you did it that way, then you could catch the case of a redirected line pointer pointing to another redirected line pointer, which is a corruption condition that the current code does not appear to check. + /* + * Validation via the predecessor array. 1) If the predecessor's + * xmin is aborted or in progress, the current tuples xmin should + * be aborted or in progress respectively. Also both xmin's must + * be equal. 2) If the predecessor's xmin is not frozen, then + * current tuple's shouldn't be either. 3) If the predecessor's + * xmin is equal to the current tuple's xmin, the current tuple's + * cmin should be greater than the predecessor's cmin. 4) If the + * current tuple is not HOT then its predecessor's tuple must not + * be HEAP_HOT_UPDATED. 5) If the current tuple is HOT then its + * predecessor's tuple must be HEAP_HOT_UPDATED. + */ This comment needs to be split up into pieces and the pieces need to be moved closer to the tests to which they correspond. + psprintf("unfrozen tuple was updated to produce a tuple at offset %u which is not frozen", Shouldn't this say "which is frozen"? + * Not a corruption if current tuple is updated/deleted by a + * different transaction, means t_cid will point to cmax (that is + * command id of deleting transaction) and cid of predecessor not + * necessarily will be smaller than cid of current tuple. t_cid I think that the next person who reads this code is likely to understand that the CIDs of different transactions are numerically unrelated. What's less obvious is that if the XID is the same, the newer update must have a higher CID. + * can hold combo command id but we are not worrying here since + * combo command id of the next updated tuple (if present) must be + * greater than combo command id of the current tuple. So here we + * are not checking HEAP_COMBOCID flag and simply doing t_cid + * comparison. I disapprove of ignoring the HEAP_COMBOCID flag. Emitting a message claiming that the CID has a certain value when that's actually a combo CID is misleading, so at least a different message wording is needed in such cases. But it's also not clear to me that the newer update has to have a higher combo CID, because combo CIDs can be reused. If you have multiple cursors open in the same transaction, the updates can be interleaved, and it seems to me that it might be possible for an older CID to have created a certain combo CID after a newer CID, and then both cursors could update the same page in succession and end up with combo CIDs out of numerical order. Unless somebody can make a convincing argument that this isn't possible, I think we should just skip this check for cases where either tuple has a combo CID. + if (TransactionIdEquals(pred_xmin, curr_xmin) && + (TransactionIdEquals(curr_xmin, curr_xmax) || + !TransactionIdIsValid(curr_xmax)) && pred_cmin >= curr_cmin) I don't understand the reason for the middle part of this condition -- TransactionIdEquals(curr_xmin, curr_xmax) || !TransactionIdIsValid(curr_xmax). I suppose the comment is meant to explain this, but I still don't get it. If a tuple with XMIN 12345 CMIN 2 is updated to produce a tuple with XMIN 12345 CMIN 1, that's corruption, regardless of what the XMAX of the second tuple may happen to be. + if (HeapTupleHeaderIsHeapOnly(curr_htup) && + !HeapTupleHeaderIsHotUpdated(pred_htup)) + if (!HeapTupleHeaderIsHeapOnly(curr_htup) && + HeapTupleHeaderIsHotUpdated(pred_htup)) I think it would be slightly clearer to write these tests the other way around i.e. check the previous tuple's state first. + if (!TransactionIdIsValid(curr_xmax) && HeapTupleHeaderIsHotUpdated(tuphdr)) + { + report_corruption(ctx, + psprintf("tuple has been updated, but xmax is 0")); + result = false; + } I guess this message needs to say "tuple has been HOT updated, but xmax is 0" or something like that. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: