I wrote:
> Right, but _bt_getstackbuf is working from a search stack created by
> a standard search for the victim page's high key. If that search
> descended through a page to the right of the victim page's actual
> parent, _bt_getstackbuf isn't able to recover.
What I'm tempted to do, at least in the back branches, is simply adjust
_bt_pagedel to be able to recover from _bt_getstackbuf failure in this
scenario. It could use the same method that _bt_insert_parent does in
the concurrent-root-split case, ie (untested):
      ItemPointerSet(&(stack->bts_btentry.t_tid), target, P_HIKEY);
      pbuf = _bt_getstackbuf(rel, stack, BT_WRITE);
      if (pbuf == InvalidBuffer)
+     {
+         /* Find the leftmost page at the next level up */
+         pbuf = _bt_get_endpoint(rel, opaque->btpo.level + 1, false);
+         stack->bts_blkno = BufferGetBlockNumber(pbuf);
+         stack->bts_offset = InvalidOffsetNumber;
+         _bt_relbuf(rel, pbuf);
+         /* and repeat search from there */
+         pbuf = _bt_getstackbuf(rel, stack, BT_WRITE);
+         if (pbuf == InvalidBuffer)
              elog(ERROR, "failed to re-find parent key in \"%s\"",
                   RelationGetRelationName(rel));
+     }
      parent = stack->bts_blkno;
      poffset = stack->bts_offset;
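
To see the shape of that retry in isolation, here is a toy model in plain C. It is only a sketch and not PostgreSQL code: ToyPage, find_parent_from and find_parent_with_retry are hypothetical names invented for illustration, one tree level is modeled as an array of pages chained by right-links, and there is no locking or buffer management at all. The only point it makes is that a search which can move right but never left will miss a parent lying to the left of the remembered position, and that restarting from the leftmost page of the level recovers from that.

```c
#include <assert.h>
#include <stddef.h>

#define NO_PAGE (-1)

/* Hypothetical toy page: one entry in an array that models a tree level. */
typedef struct ToyPage
{
    int right;          /* index of the right sibling, or NO_PAGE */
    int nitems;         /* number of valid entries in downlinks[] */
    int downlinks[4];   /* child block numbers this page points to */
} ToyPage;

/*
 * Walk right from 'start' looking for a downlink to 'child'.
 * Returns the index of the page holding it, or NO_PAGE if none is found.
 * The walk never moves left, which is the limitation being modeled.
 */
static int
find_parent_from(const ToyPage *level, int start, int child)
{
    for (int p = start; p != NO_PAGE; p = level[p].right)
    {
        for (int i = 0; i < level[p].nitems; i++)
        {
            if (level[p].downlinks[i] == child)
                return p;
        }
    }
    return NO_PAGE;
}

/*
 * Try the remembered position first; if that fails, repeat the search
 * from the leftmost page of the level. Returns NO_PAGE only if the
 * downlink is not present anywhere on the level.
 */
static int
find_parent_with_retry(const ToyPage *level, int leftmost,
                       int remembered, int child)
{
    assert(level != NULL);

    int p = find_parent_from(level, remembered, child);

    if (p == NO_PAGE)
        p = find_parent_from(level, leftmost, child);
    return p;
}
```

In this model the second search costs a scan of the whole level, but it is reached only after the first search has already failed, so the common path is unchanged.
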
The question is whether we want a cleaner answer for future development,
and if so what that answer ought to look like. It seems like the
alternatives we've been discussing may not end up any simpler/shorter
than the current code, and it seems hard to justify giving up some
concurrency in the name of a simplification that doesn't simplify much.
Thoughts?
regards, tom lane