Re:Re:Re:Re: backup server core when redo btree_xlog_insert that type is XLOG_BTREE_INSERT_POST - Mailing list pgsql-hackers
From | yuansong |
---|---|
Subject | Re:Re:Re:Re: backup server core when redo btree_xlog_insert that type is XLOG_BTREE_INSERT_POST |
Date | |
Msg-id | 39b023e9.1ed0.19382575335.Coremail.yyuansong@126.com |
In response to | backup server core when redo btree_xlog_insert that type is XLOG_BTREE_INSERT_POST (yuansong <yyuansong@126.com>) |
Responses | Re: Re:Re:Re: backup server core when redo btree_xlog_insert that type is XLOG_BTREE_INSERT_POST |
List | pgsql-hackers |
The _bt_binsrch_insert function always returns low, but during the posting list search, are there cases where low and mid are unequal?
If so, the wrong offset could be passed on to the subsequent _bt_insertonpg function.
Maybe we could fix it like this?
OffsetNumber
_bt_binsrch_insert(Relation rel, BTInsertState insertstate)
{
    ......
    while (high > low)
    {
        OffsetNumber mid = low + ((high - low) / 2);

        /*
         * If tuple at offset located by binary search is a posting list whose
         * TID range overlaps with caller's scantid, perform posting list
         * binary search to set postingoff for caller. Caller must split the
         * posting list when postingoff is set. This should happen
         * infrequently.
         */
        if (unlikely(result == 0 && key->scantid != NULL))
        {
            /*
             * postingoff should never be set more than once per leaf page
             * binary search. That would mean that there are duplicate table
             * TIDs in the index, which is never okay. Check for that here.
             */
            if (insertstate->postingoff != 0)
                ereport(ERROR,
                        (errcode(ERRCODE_INDEX_CORRUPTED),
                         errmsg_internal("table tid from new index tuple (%u,%u) cannot find insert offset between offsets %u and %u of block %u in index \"%s\"",
                                         ItemPointerGetBlockNumber(key->scantid),
                                         ItemPointerGetOffsetNumber(key->scantid),
                                         low, stricthigh,
                                         BufferGetBlockNumber(insertstate->buf),
                                         RelationGetRelationName(rel))));

            insertstate->postingoff = _bt_binsrch_posting(key, page, mid);

            // Here, will low and mid ever be unequal? If low is returned in
            // such cases, would it result in an error? Maybe we fix it like this?
            // low = mid;
            // break;
        }
    }
    ........
    return low;
}
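To make the question concrete, here is a small toy model of the search loop. It is only a sketch, not PostgreSQL code: ToyItem, toy_compare, toy_binsrch_insert and toy_count_mismatches are hypothetical names, each item is modeled as a posting list covering a contiguous TID range, and the low/high update is assumed to follow the usual lower-bound pattern (the excerpt above elides that part of the loop). Under those assumptions, and only when the items are sorted and their ranges do not overlap, it checks whether the returned low can differ from the mid at which the comparison returned 0.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a leaf-page item: a posting list covering TIDs [first, last]. */
typedef struct ToyItem
{
    int first;
    int last;
} ToyItem;

/* Toy stand-in for a comparator: <0, 0, >0 when key is before, inside, after the item. */
int
toy_compare(int key, const ToyItem *item)
{
    if (key < item->first)
        return -1;
    if (key > item->last)
        return 1;
    return 0;
}

/*
 * Lower-bound binary search shaped like the loop quoted above (assumed
 * comparison threshold of 1). Returns the final low. *eq_mid receives the
 * mid at which the comparison returned 0, or -1 if that never happened.
 */
int
toy_binsrch_insert(const ToyItem *items, int nitems, int key, int *eq_mid)
{
    int low = 0;
    int high = nitems;

    *eq_mid = -1;
    while (high > low)
    {
        int mid = low + (high - low) / 2;
        int result = toy_compare(key, &items[mid]);

        if (result >= 1)
            low = mid + 1;      /* key sorts after items[mid] */
        else
            high = mid;         /* key sorts before or inside items[mid] */

        if (result == 0)
        {
            /* Mirrors the "set more than once" corruption check. */
            assert(*eq_mid == -1);
            *eq_mid = mid;
        }
    }
    return low;
}

/* Count keys in [min_key, max_key] where a match was seen but low != eq_mid. */
int
toy_count_mismatches(const ToyItem *items, int nitems, int min_key, int max_key)
{
    int mismatches = 0;

    for (int key = min_key; key <= max_key; key++)
    {
        int eq_mid;
        int low = toy_binsrch_insert(items, nitems, key, &eq_mid);

        if (eq_mid != -1 && low != eq_mid)
            mismatches++;
    }
    return mismatches;
}
```

In this toy model, once the comparison returns 0 at mid, high becomes mid and every later probe lands on an item that sorts strictly before the key, so low climbs until it equals that mid. A mismatch would require unsorted or overlapping items, which in a real index would correspond to corruption rather than to normal operation. The toy model says nothing about what a corrupted page or a bad WAL record would do.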
At 2024-11-27 18:53:20, "yuansong" <yyuansong@126.com> wrote:
We have identified the cause of the crash: the XLOG_BTREE_INSERT_POST WAL record carried an OffsetNumber offnum that was one less than the offset stored in the index. I experimented with adding +1, and the index data remained normal in both cases. This issue is likely caused by concurrent operations on the B-tree: reviewing the corresponding WAL records, we found SPLIT_L and INSERT_LEAF operations on the same block before the crash. This might be a bug. I'm not sure whether there is a related fix.
At 2024-11-21 23:58:03, "Peter Geoghegan" <pg@bowt.ie> wrote:
>On Thu, Nov 21, 2024 at 10:03 AM yuansong <yyuansong@126.com> wrote:
>> Should nhtids be less than or equal to IndexTupleSize(oposting)?
>> Why is nhtids larger than IndexTupleSize(oposting) ? I think there should be an error in the master host writing the wal log.
>> Does anyone know when this will happen?
>
>It'll happen whenever there is a certain kind of data corruption.
>
>There were complaints about issues like this in the past. But those
>complaints seem to have gone away when more hardening was added to the
>code that runs during original execution (not the REDO routine code,
>which can only do what it is told to do by the WAL record).
>
>You're using PostgreSQL 13.2, which is a very old point release that
>lacks this hardening -- the current 13 point release is 13.18, so
>you're missing a lot. Had you been on a later point release you'd very
>probably have still had the issue with corruption (which could be from
>bad hardware), but you likely would have avoided the problem with the
>REDO routine crashing like this.
>
>--
>Peter Geoghegan