Thread: backup server core when redo btree_xlog_insert that type is XLOG_BTREE_INSERT_POST
backup server core when redo btree_xlog_insert that type is XLOG_BTREE_INSERT_POST
My database version is PostgreSQL 13.2. The backup server dumps core while replaying a btree_xlog_insert record of type XLOG_BTREE_INSERT_POST.
#0 0x00002aab66695d86 in __memmove_ssse3 from /lib64/libc.so.6
#1 0x00000000004f5574 in _bt_swap_posting (newitem=0x125d998, oposting=0x2aabcd55dcc8, postingoff=13) at nbtdedup.c:796
#2 0x000000000050cf08 in btree_xlog_insert (isleaf=true, ismeta=false, posting=true, record=0x122d7b8) at nbtxlog.c:224
#3 0x000000000050ed53 in btree_redo (record=0x122d7b8) at nbtxlog.c:969
#4 0x00000000005487b5 in StartupXLOG () at xlog.c:7320
#5 0x00000000008292ab in StartupProcessMain () at startup.c:204
#6 0x000000000055c916 in AuxiliaryProcessMain (argc=2, argv=0x7fff071f8a00) at bootstrap.c:443
#7 0x000000000082820d in StartChildProcess (type=StartupProcess) at postmaster.c:5492
#8 0x00000000008236ce in PostmasterMain (argc=1, argv=0x11fb2e0) at postmaster.c:1404
#9 0x00000000007398f7 in main (argc=1, argv=0x11fb2e0) at main.c:210
(gdb) f 1
#1 0x00000000004f5574 in _bt_swap_posting (newitem=0x125d998, oposting=0x2aabcd55dcc8, postingoff=13) at nbtdedup.c:796
796
(gdb) p ((Size) ((oposting)->t_info & 0x1FFF))    // oposting index tuple size
$12 = 16
(gdb) p nhtids
$13 = 17
(gdb) p postingoff
$14 = 13
(gdb) p nmovebytes
$15 = 18
IndexTuple
_bt_swap_posting(IndexTuple newitem, IndexTuple oposting, int postingoff)
{
    int         nhtids;
    char       *replacepos;
    char       *replaceposright;
    Size        nmovebytes;
    IndexTuple  nposting;

    nhtids = BTreeTupleGetNPosting(oposting);
=== nhtids = 17, postingoff = 13
    Assert(_bt_posting_valid(oposting));
    Assert(postingoff > 0 && postingoff < nhtids);

    /*
     * Move item pointers in posting list to make a gap for the new item's
     * heap TID. We shift TIDs one place to the right, losing original
     * rightmost TID. (nmovebytes must not include TIDs to the left of
     * postingoff, nor the existing rightmost/max TID that gets overwritten.)
     */
    nposting = CopyIndexTuple(oposting);
=== IndexTupleSize(oposting) = 16, so nposting is allocated with 16 bytes
    replacepos = (char *) BTreeTupleGetPostingN(nposting, postingoff);
    replaceposright = (char *) BTreeTupleGetPostingN(nposting, postingoff + 1);
    nmovebytes = (nhtids - postingoff - 1) * sizeof(ItemPointerData);
    memmove(replaceposright, replacepos, nmovebytes);
=== core dumps here: nmovebytes = 18 but nposting is only 16 bytes, so the memmove runs past the end of the allocation
}
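The numbers from the gdb session can be checked outside the server. The following is a minimal sketch, not PostgreSQL code: TID_SIZE stands in for sizeof(ItemPointerData) (a 4-byte block number plus a 2-byte offset, 6 bytes), and both function names are hypothetical helpers introduced only for illustration.

```c
#include <stddef.h>

/* Stand-in for sizeof(ItemPointerData): 4-byte block id + 2-byte offset. */
#define TID_SIZE 6

/* Same arithmetic as the nmovebytes assignment in _bt_swap_posting(). */
size_t
swap_move_bytes(int nhtids, int postingoff)
{
    return (size_t) (nhtids - postingoff - 1) * TID_SIZE;
}

/* Smallest number of bytes that a posting list holding nhtids TIDs needs. */
size_t
min_posting_list_bytes(int nhtids)
{
    return (size_t) nhtids * TID_SIZE;
}
```

With nhtids = 17 and postingoff = 13, swap_move_bytes() gives 18, matching the gdb output, while min_posting_list_bytes(17) gives 102. A tuple whose recorded size is 16 bytes cannot hold 17 TIDs at all, so the tuple header itself looks inconsistent before the memmove is ever reached.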
Re: backup server core when redo btree_xlog_insert that type is XLOG_BTREE_INSERT_POST
On Thu, Nov 21, 2024 at 10:03 AM yuansong <yyuansong@126.com> wrote:
> Should nhtids be less than or equal to IndexTupleSize(oposting)?
> Why is nhtids larger than IndexTupleSize(oposting) ? I think there should be an error in the master host writing the WAL log.
> Does anyone know when this will happen?

It'll happen whenever there is a certain kind of data corruption.

There were complaints about issues like this in the past. But those complaints seem to have gone away when more hardening was added to the code that runs during original execution (not the REDO routine code, which can only do what it is told to do by the WAL record).

You're using PostgreSQL 13.2, which is a very old point release that lacks this hardening -- the current 13 point release is 13.18, so you're missing a lot. Had you been on a later point release you'd very probably have still had the issue with corruption (which could be from bad hardware), but you likely would have avoided the problem with the REDO routine crashing like this.

--
Peter Geoghegan
Re: Re: backup server core when redo btree_xlog_insert that type is XLOG_BTREE_INSERT_POST
My previous description may have been wrong. Where I wrote "Should nhtids be less than or equal to IndexTupleSize(oposting)? Why is nhtids larger than IndexTupleSize(oposting)?", nhtids should have been nmovebytes.
nhtids counts TIDs, not bytes, so it is normal for it to be either larger or smaller than IndexTupleSize(oposting).
The question I meant to ask is: should nmovebytes be smaller than IndexTupleSize(oposting)?
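One way to frame that question is to ask where the memmove destination ends relative to the tuple. The sketch below is an illustration under assumptions, not the PostgreSQL implementation: posting_move_in_bounds() is a hypothetical helper, TID_SIZE stands in for sizeof(ItemPointerData), and posting_start is the byte offset of the posting list inside the tuple, which the caller has to supply.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for sizeof(ItemPointerData). */
#define TID_SIZE 6

/*
 * Hypothetical check: does the memmove in _bt_swap_posting() stay inside a
 * tuple of tupsize bytes whose posting list begins at byte posting_start?
 */
bool
posting_move_in_bounds(size_t tupsize, size_t posting_start,
                       int nhtids, int postingoff)
{
    size_t      dest_start;
    size_t      nmovebytes;

    /* Same condition as the Assert on postingoff in the original function. */
    if (postingoff <= 0 || postingoff >= nhtids)
        return false;

    /* The destination is the slot one place to the right of postingoff. */
    dest_start = posting_start + (size_t) (postingoff + 1) * TID_SIZE;
    nmovebytes = (size_t) (nhtids - postingoff - 1) * TID_SIZE;

    /* The end of the move equals posting_start + nhtids * TID_SIZE. */
    return dest_start + nmovebytes <= tupsize;
}
```

Assuming the posting list starts at byte 8, posting_move_in_bounds(16, 8, 17, 13) is false. So is posting_move_in_bounds(16, 8, 3, 1), even though nmovebytes is only 6 there. Comparing nmovebytes against IndexTupleSize(oposting) is therefore not sufficient on its own; what has to fit inside the tuple is the whole posting list, that is, its start offset plus nhtids TIDs.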
At 2024-11-21 23:58:03, "Peter Geoghegan" <pg@bowt.ie> wrote:
>On Thu, Nov 21, 2024 at 10:03 AM yuansong <yyuansong@126.com> wrote:
>> Should nhtids be less than or equal to IndexTupleSize(oposting)?
>> Why is nhtids larger than IndexTupleSize(oposting) ? I think there should be an error in the master host writing the wal log.
>> Does anyone know when this will happen?
>
>It'll happen whenever there is a certain kind of data corruption.
>
>There were complaints about issues like this in the past. But those
>complaints seem to have gone away when more hardening was added to the
>code that runs during original execution (not the REDO routine code,
>which can only do what it is told to do by the WAL record).
>
>You're using PostgreSQL 13.2, which is a very old point release that
>lacks this hardening -- the current 13 point release is 13.18, so
>you're missing a lot. Had you been on a later point release you'd very
>probably have still had the issue with corruption (which could be from
>bad hardware), but you likely would have avoided the problem with the
>REDO routine crashing like this.
>
>--
>Peter Geoghegan