Re: HOT chain validation in verify_heapam() - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: HOT chain validation in verify_heapam()
Date
Msg-id CAH2-Wzmsa0yMS-JsP5_778VNG1VLAL5xO-EgxhLtBJ9KZ=gJmA@mail.gmail.com
In response to Re: HOT chain validation in verify_heapam()  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Nov 14, 2022 at 11:28 AM Robert Haas <robertmhaas@gmail.com> wrote:
> Part of the motivation here is also driven by trying to figure out how
> to word the complaints. We have a dedicated field in the amcheck that
> can hold one tuple offset or the other, but if we're checking the
> relationships between tuples, what do we put there? I feel it will be
> easiest to understand if we put the offset of the older tuple in that
> field and then phrase the complaint as the patch does, e.g.:

That makes a lot of sense to me, and reminds me of how things work in
verify_nbtree.c.
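
To make the reporting convention concrete, here is a minimal sketch in Python. It is a toy model, not the patch: the names `HeapTuple`, `Report`, and `check_chain_link` are hypothetical, and the message wording is invented for illustration rather than taken from the patch. It assumes only that a report carries a single tuple offset field, and it checks one well-known relationship (the older tuple's xmax should match the xmin of its successor).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeapTuple:
    """Toy stand-in for a heap tuple header (hypothetical, for illustration)."""
    offnum: int
    xmin: int
    xmax: int


@dataclass
class Report:
    """Toy corruption report with exactly one tuple offset field."""
    blkno: int
    offnum: int
    msg: str


def check_chain_link(blkno: int, older: HeapTuple, newer: HeapTuple) -> Optional[Report]:
    """Check one predecessor/successor pair in an update chain.

    The complaint is always filed under the OLDER tuple's offset; the
    newer tuple is only mentioned inside the message text.
    """
    if older.xmax != newer.xmin:
        return Report(
            blkno=blkno,
            offnum=older.offnum,  # the single offset field holds the older tuple
            msg=(f"tuple xmax {older.xmax} does not match xmin {newer.xmin} "
                 f"of successor tuple at offset {newer.offnum}"),
        )
    return None
```

With this shape, a reader who sorts reports by block and offset sees every chain problem filed under the tuple where the chain link starts, and the message says which other tuple was involved.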

At a high level, verify_nbtree.c works by doing a breadth-first
traversal of the tree. The search makes each distinct page the "target
page" exactly once. The target page is the clear focal point for
everything -- almost every complaint about corruption frames the
problem as a problem in the target page. We consistently describe
things in terms of their relationship with the target page, so under
this scheme everybody is...on the same page (ahem).
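
The scheme described above can be sketched roughly as follows. This is a toy model in Python, not the real verify_nbtree.c code: the `Page` class, `check_tree`, and the message text are all hypothetical, and the only check performed is a simple ordering check against the right sibling. The point is just the shape: visit the tree level by level, make each page the target exactly once, and word every complaint from the target's point of view.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Page:
    """Toy B-Tree page (hypothetical, for illustration only)."""
    blkno: int
    keys: List[int]
    right: Optional[int] = None                          # right sibling block, if any
    children: List[int] = field(default_factory=list)    # empty on leaf pages


def check_tree(pages: Dict[int, Page], root: int) -> Tuple[List[int], List[str]]:
    """Breadth-first check: each page becomes the target exactly once."""
    visited: List[int] = []
    reports: List[str] = []
    leftmost: Optional[int] = root
    while leftmost is not None:
        first = pages[leftmost]
        # Remember where the next level starts before walking this one.
        next_level = first.children[0] if first.children else None
        blkno: Optional[int] = leftmost
        while blkno is not None:
            target = pages[blkno]
            visited.append(target.blkno)
            if target.right is not None:
                sibling = pages[target.right]
                if target.keys and sibling.keys and target.keys[-1] > sibling.keys[0]:
                    # The sibling is involved, but the report is framed
                    # as a problem seen from the target page.
                    reports.append(
                        f"target block {target.blkno}: last item {target.keys[-1]} "
                        f"is greater than first item {sibling.keys[0]} "
                        f"of right sibling block {sibling.blkno}"
                    )
            blkno = target.right
        leftmost = next_level
    return visited, reports
```

Note that the sibling page never gets a complaint "of its own" for this check; it is only ever described relative to whichever page is currently the target.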

Being very deliberate about that probably had some small downsides.
Maybe it would have made a little more sense to word certain
corruption report messages so that they blamed the "ancillary" page
itself (a sibling or child page) rather than the target page. This
still seems like the right trade-off -- the control flow can be broken
up into understandable parts once you understand that the target page
is the thing that we use to describe every other page.

> > I'm doubtful it's a good idea to try to validate the 9.4 case. The likelihood
> > of getting that right seems low and I don't see us gaining much by even trying.
>
> I agree with Peter. We have to try to get that case right. If we can
> eventually eliminate it as a valid case by some mechanism, hooray. But
> in the meantime we have to deal with it as best we can.

Practiced intellectual humility seems like the way to go here. On some
level I suspect that we'll have problems in exactly the places that we
don't look for them.

-- 
Peter Geoghegan


