Hi,
On 2023-03-23 11:41:52 -0400, Robert Haas wrote:
> On Wed, Mar 22, 2023 at 5:56 PM Andres Freund <andres@anarazel.de> wrote:
> > I also think it's not quite right that some of checks inside if
> > (ItemIdIsRedirected()) continue in case of corruption, others don't. While
> > there's a later continue, that means the corrupt tuples get added to the
> > predecessor array. Similarly, in the non-redirect portion, the successor
> > array gets filled with corrupt tuples, which doesn't seem quite right to me.
>
> I'm not entirely sure I agree with this. I mean, we should skip
> further checks of the data is so corrupt that further checks are not
> sensible e.g. if the line pointer is just gibberish we can't validate
> anything about the tuple, because we can't even find the tuple.
> However, we don't want to skip further checks as soon as we see any
> kind of a problem whatsoever. For instance, even if the tuple data is
> utter gibberish, that should not and does not keep us from checking
> whether the update chain looks sane. If the tuple header is garbage
> (e.g. xmin and xmax are outside the bounds of clog) then at least some
> of the update-chain checks are not possible, because we can't know the
> commit status of the tuples, but garbage in the tuple data isn't
> really a problem for update chain validation per se.
The cases I was complaining about where metadata that's important. We
shouldn't enter a tuple into lp_valid[] or successor[] if it failed validity
checks - the subsequent error reports that that generates won't be helpful -
or we'll crash.
E.g. continuing after:
rditem = PageGetItemId(ctx.page, rdoffnum);
if (!ItemIdIsUsed(rditem))
report_corruption(&ctx,
psprintf("line pointer redirection to unused item at offset %u",
(unsigned) rdoffnum));
means we'll look into the tuple in the "update chain validation" loop for
unused items. Where it afaict will lead to a crash or bogus results, because:
/* Can only redirect to a HOT tuple. */
next_htup = (HeapTupleHeader) PageGetItem(ctx.page, next_lp);
if (!HeapTupleHeaderIsHeapOnly(next_htup))
{
report_corruption(&ctx,
psprintf("redirected line pointer points to a non-heap-only tuple at offset %u",
(unsigned) nextoffnum));
}
will just dereference the unused item.
> - * Redirects are created by updates, so successor should be
> - * the result of an update.
> + * Redirects are created by HOT updates, so successor should
> + * be the result of an HOT update.
> + *
> + * XXX: HeapTupleHeaderIsHeapOnly() should always imply
> + * HEAP_UPDATED. This should be checked even when the tuple
> + * isn't a target of a redirect.
>
> Hmm, OK. So the question is where to put this check. Maybe inside
> check_tuple_header(), making it independent of the update chain
> validation stuff?
Yes, check_tuple_header sounds sensible to me.
Greetings,
Andres Freund