On Mon, Oct 23, 2023 at 04:46:23PM -0700, Peter Geoghegan wrote:
> On Fri, Oct 20, 2023 at 8:55 PM Noah Misch <noah@leadboat.com> wrote:
> > > > I lean toward fixing this by
> > > > having amcheck scan left; if left links reach only half-dead or deleted pages,
> > > > that's as good as the present child block being P_LEFTMOST.
> > >
> > > Also my preference.
> >
> > Done mostly that way, except I didn't accept deleted pages. Making this work
> > on !readonly would take more than that, and readonly shouldn't need that.
>
> That makes sense to me. I believe that it's not possible to have a
> string of consecutive sibling pages that are all half-dead (regardless
> of the BlockNumber order of sibling pages, even). But I'd probably
> have written the fix in roughly the same way. Although...maybe you
> should try to detect a string of half-dead pages? Hard to say if it's
> worth the trouble.
I imagined a string of half-dead siblings could arise in structure like this:
* 1
* / | \
* 4 <-> 2 <-> 3
With events like this:
- DELETE renders blk 4 deletable.
- Crash with concurrent VACUUM, leaving 4 half-dead after having visited 1-4.
- DELETE renders blk 2 deletable.
- Crash with concurrent VACUUM, leaving 2 half-dead after having visited 1-2.
I didn't try to reproduce that, and something may well prevent it.
> Suggest adding a CHECK_FOR_INTERRUPTS() call to the loop, too, just
> for good luck.
Added. That gave me the idea to check for circular links, like other parts of
amcheck do. Net diff:
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -949,11 +949,16 @@ bt_leftmost_ignoring_half_dead(BtreeCheckState *state,
Page page = palloc_btree_page(state, reached);
BTPageOpaque reached_opaque = BTPageGetOpaque(page);
+ CHECK_FOR_INTERRUPTS();
+
/*
- * _bt_unlink_halfdead_page() writes that side-links will continue to
- * point to the siblings. We can easily check btpo_next.
+ * Try to detect btpo_prev circular links. _bt_unlink_halfdead_page()
+ * writes that side-links will continue to point to the siblings.
+ * Check btpo_next for that property.
*/
- all_half_dead = P_ISHALFDEAD(reached_opaque) &&
+ all_half_dead = P_ISHALFDEAD(reached_opaque);
+ reached != start &&
+ reached != reached_from &&
reached_opaque->btpo_next == reached_from;
if (all_half_dead)
{