Re: HOT chain validation in verify_heapam() - Mailing list pgsql-hackers

From Himanshu Upadhyaya
Subject Re: HOT chain validation in verify_heapam()
Date
Msg-id CAPF61jCfHvhoaiYhmD3DHP_LGnzVo5qbAYefb3BVHnShFTG32A@mail.gmail.com
Whole thread Raw
In response to Re: HOT chain validation in verify_heapam()  (Andres Freund <andres@anarazel.de>)
Responses Re: HOT chain validation in verify_heapam()
List pgsql-hackers


On Thu, Nov 10, 2022 at 3:38 AM Andres Freund <andres@anarazel.de> wrote:

> +             }
> +
> +             /*
> +              * Loop over offset and populate predecessor array from all entries
> +              * that are present in successor array.
> +              */
> +             ctx.attnum = -1;
> +             for (ctx.offnum = FirstOffsetNumber; ctx.offnum <= maxoff;
> +                      ctx.offnum = OffsetNumberNext(ctx.offnum))
> +             {
> +                     ItemId          curr_lp;
> +                     ItemId          next_lp;
> +                     HeapTupleHeader curr_htup;
> +                     HeapTupleHeader next_htup;
> +                     TransactionId curr_xmax;
> +                     TransactionId next_xmin;
> +
> +                     OffsetNumber nextoffnum = successor[ctx.offnum];
> +
> +                     curr_lp = PageGetItemId(ctx.page, ctx.offnum);

Why do we get the item when nextoffnum is 0?

Fixed by moving  PageGetItemId() call after the 'if' check.


> +                     if (nextoffnum == 0 || !lp_valid[ctx.offnum] || !lp_valid[nextoffnum])
> +                     {
> +                             /*
> +                              * This is either the last updated tuple in the chain or a
> +                              * corruption raised for this tuple.
> +                              */

"or a corruption raised" isn't quite right grammatically.
 
done.

> +                             continue;
> +                     }
> +                     if (ItemIdIsRedirected(curr_lp))
> +                     {
> +                             next_lp = PageGetItemId(ctx.page, nextoffnum);
> +                             if (ItemIdIsRedirected(next_lp))
> +                             {
> +                                     report_corruption(&ctx,
> +                                                                       psprintf("redirected line pointer pointing to another redirected line pointer at offset %u",
> +                                                                                        (unsigned) nextoffnum));
> +                                     continue;
> +                             }
> +                             next_htup = (HeapTupleHeader) PageGetItem(ctx.page, next_lp);
> +                             if (!HeapTupleHeaderIsHeapOnly(next_htup))
> +                             {
> +                                     report_corruption(&ctx,
> +                                                                       psprintf("redirected tuple at line pointer offset %u is not heap only tuple",
> +                                                                                        (unsigned) nextoffnum));
> +                             }
> +                             if ((next_htup->t_infomask & HEAP_UPDATED) == 0)
> +                             {
> +                                     report_corruption(&ctx,
> +                                                                       psprintf("redirected tuple at line pointer offset %u is not heap updated tuple",
> +                                                                                        (unsigned) nextoffnum));
> +                             }
> +                             continue;
> +                     }
> +
> +                     /*
> +                      * Add a line pointer offset to the predecessor array if xmax is
> +                      * matching with xmin of next tuple (reaching via its t_ctid).
> +                      * Prior to PostgreSQL 9.4, we actually changed the xmin to
> +                      * FrozenTransactionId

I'm doubtful it's a good idea to try to validate the 9.4 case. The likelihood
of getting that right seems low and I don't see us gaining much by even trying.



removed code with regards to frozen tuple checks.

> so we must add offset to predecessor
> +                      * array(irrespective of xmax-xmin matching) if updated tuple xmin
> +                      * is frozen, so that we can later do validation related to frozen
> +                      * xmin. Raise corruption if we have two tuples having the same
> +                      * predecessor.
> +                      * We add the offset to the predecessor array irrespective of the
> +                      * transaction (t_xmin) status. We will do validation related to
> +                      * the transaction status (and also all other validations) when we
> +                      * loop over the predecessor array.
> +                      */
> +                     curr_htup = (HeapTupleHeader) PageGetItem(ctx.page, curr_lp);
> +                     curr_xmax = HeapTupleHeaderGetUpdateXid(curr_htup);
> +                     next_lp = PageGetItemId(ctx.page, nextoffnum);
> +                     next_htup = (HeapTupleHeader) PageGetItem(ctx.page, next_lp);
> +                     next_xmin = HeapTupleHeaderGetXmin(next_htup);
> +                     if (TransactionIdIsValid(curr_xmax) &&
> +                             (TransactionIdEquals(curr_xmax, next_xmin) ||
> +                              next_xmin == FrozenTransactionId))
> +                     {
> +                             if (predecessor[nextoffnum] != 0)
> +                             {
> +                                     report_corruption(&ctx,
> +                                                                       psprintf("updated version at offset %u is also the updated version of tuple at offset %u",
> +                                                                                        (unsigned) nextoffnum, (unsigned) predecessor[nextoffnum]));
> +                                     continue;

I doubt it is correct to enter this path with next_xmin ==
FrozenTransactionId. This is following a ctid chain that we normally wouldn't
follow, because it doesn't satisfy the t_self->xmax == t_ctid->xmin condition.

removed this frozen check.

> +             }
> +
> +             /* Loop over offsets and validate the data in the predecessor array. */
> +             for (OffsetNumber currentoffnum = FirstOffsetNumber; currentoffnum <= maxoff;
> +                      currentoffnum = OffsetNumberNext(currentoffnum))
> +             {
> +                     HeapTupleHeader pred_htup;
> +                     HeapTupleHeader curr_htup;
> +                     TransactionId pred_xmin;
> +                     TransactionId curr_xmin;
> +                     ItemId          pred_lp;
> +                     ItemId          curr_lp;
> +
> +                     ctx.offnum = predecessor[currentoffnum];
> +                     ctx.attnum = -1;
> +
> +                     if (ctx.offnum == 0)
> +                     {
> +                             /*
> +                              * Either the root of the chain or an xmin-aborted tuple from
> +                              * an abandoned portion of the HOT chain.
> +                              */

Hm - couldn't we check that the tuple could conceivably be at the root of a
chain? I.e. isn't HEAP_HOT_UPDATED? Or alternatively has an aborted xmin?

Done, I have added code to identify cases of missing offset in the predecessor[] array and added validation that root of the chain must not be HEAP_ONLY_TUPLE.

> +                             continue;
> +                     }
> +
> +                     curr_lp = PageGetItemId(ctx.page, currentoffnum);
> +                     curr_htup = (HeapTupleHeader) PageGetItem(ctx.page, curr_lp);
> +                     curr_xmin = HeapTupleHeaderGetXmin(curr_htup);
> +
> +                     ctx.itemid = pred_lp = PageGetItemId(ctx.page, ctx.offnum);
> +                     pred_htup = (HeapTupleHeader) PageGetItem(ctx.page, pred_lp);
> +                     pred_xmin = HeapTupleHeaderGetXmin(pred_htup);
> +
> +                     /*
> +                      * If the predecessor's xmin is aborted or in progress, the
> +                      * current tuples xmin should be aborted or in progress
> +                      * respectively. Also both xmin's must be equal.
> +                      */
> +                     if (!TransactionIdEquals(pred_xmin, curr_xmin) &&
> +                             !TransactionIdDidCommit(pred_xmin))
> +                     {
> +                             report_corruption(&ctx,
> +                                                               psprintf("tuple with uncommitted xmin %u was updated to produce a tuple at offset %u with differing xmin %u",
> +                                                                                (unsigned) pred_xmin, (unsigned) currentoffnum, (unsigned) curr_xmin));

Is this necessarily true? What about a tuple that was inserted in a
subtransaction and then updated in another subtransaction of the same toplevel
transaction?

 
patch has been updated to handle cases of sub-transaction. 


--
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com
Attachment

pgsql-hackers by date:

Previous
From: "Fujii.Yuki@df.MitsubishiElectric.co.jp"
Date:
Subject: RE: Partial aggregates pushdown
Next
From: Amit Kapila
Date:
Subject: Re: Perform streaming logical transactions by background workers and parallel apply