Re: PG17.6 wal apply bug (SIGSEGV) - Mailing list pgsql-bugs

From badfilez@gmail.com
Subject Re: PG17.6 wal apply bug (SIGSEGV)
Date
Msg-id bc81dce6-3e38-46ad-92e1-7783560bb9a2@gmail.com
Responses Re: PG17.6 wal apply bug (SIGSEGV)
List pgsql-bugs
Backtrace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000057eff2 in _bt_restore_page (page=0x7f6f48fd1000 "", from=0x7f6fe2eccd80 "", len=<optimized out>) at nbtxlog.c:63
63            itemsz = MAXALIGN(itemsz);
(gdb) bt full
#0  0x000000000057eff2 in _bt_restore_page (page=0x7f6f48fd1000 "", from=0x7f6fe2eccd80 "", len=<optimized out>) at nbtxlog.c:63
        itupdata = <optimized out>
        itemsz = 0
        end = 0x7f6fe2ecd8c0 "(\265/\375`\260\005\205\023"
        items = {0x0 <repeats 227 times>, 0x7f6f00000000 "\211\243\362hw\366\371\003\b", 0x7f6fe2eccd80 "" <repeats 180 times>}
        itemsizes = {24 <repeats 33 times>, 0 <repeats 375 times>}
        i = 1318
        nitems = <optimized out>
        __func__ = "_bt_restore_page"
        __errno_location = <optimized out>
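
One possible reading of these locals, offered as a sketch and not a confirmed diagnosis: `itemsz = 0` with `i = 1318` suggests the forward scan read a tuple header whose size field was zero, so the read position stopped advancing while the slot counter kept growing past the 408-entry arrays. The Python model below mimics such a scan on a synthetic buffer. The 8-byte header layout (6-byte TID followed by a 16-bit `t_info`), the `0x1FFF` size mask, little-endian byte order, and 8-byte alignment are assumptions about an x86-64 build; `make_tuple` and `scan_items` are hypothetical helpers, and the `max_steps` cap exists only so the model terminates.

```python
import struct

INDEX_SIZE_MASK = 0x1FFF   # low 13 bits of t_info carry the tuple size
ALIGN = 8                  # assumed MAXALIGN boundary on x86-64
MAX_ITEMS = 408            # array length implied by the gdb output (227 + 1 + 180)


def maxalign(n):
    """Round n up to the next multiple of ALIGN."""
    return (n + ALIGN - 1) & ~(ALIGN - 1)


def make_tuple(size):
    """Build a synthetic tuple: 6-byte TID, 16-bit t_info, zero padding."""
    header = struct.pack("<6sH", b"\x00" * 6, size)
    return header + b"\x00" * (size - len(header))


def scan_items(buf, max_steps=2000):
    """Walk packed tuples front to back and return the number of slots used.

    There is deliberately no bounds check on the slot counter, to model an
    unchecked loop; max_steps only stops the simulation.
    """
    pos = 0
    i = 0
    while pos < len(buf) and i < max_steps:
        (t_info,) = struct.unpack_from("<H", buf, pos + 6)
        itemsz = maxalign(t_info & INDEX_SIZE_MASK)
        i += 1
        pos += itemsz          # a zero itemsz leaves pos unchanged forever
    return i


good = b"".join(make_tuple(24) for _ in range(33))   # 33 tuples of 24 bytes
bad = good + b"\x00" * 64                            # same, then zeroed bytes

print(scan_items(good))              # well-formed data: one slot per tuple
print(scan_items(bad) > MAX_ITEMS)   # zero-size header: counter overruns the arrays
```

In this model the well-formed buffer uses exactly 33 slots, matching the 33 entries of 24 in `itemsizes`, while the buffer with a zeroed tail runs until the artificial cap. Whether the real record data actually contains such a zeroed header is not established by the backtrace alone.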


On 20/10/2025 13:58, badfilez@gmail.com wrote:
Hello,

I have a Postgres 17 cluster installed from the official repo on RHEL8 (master and 2 replicas).

On both replicas, I get:

2025-10-18 15:40:50.843 MSK [1448] LOG:  entering standby mode
2025-10-18 15:40:50.865 MSK [1448] LOG:  redo starts at 1F35/D08DE298
2025-10-18 15:41:14.553 MSK [1381] LOG:  startup process (PID 1448) was terminated by signal 11: Segmentation fault
2025-10-18 15:41:14.553 MSK [1381] LOG:  terminating any other active server processes
2025-10-18 15:41:14.555 MSK [1381] LOG:  shutting down due to startup process failure
2025-10-18 15:41:14.677 MSK [1381] LOG:  database system is shut down

After debugging, this is what I found:

Replica recovery creates a corrupted index file from WAL.
pg_waldump does not show any WAL corruption, and there are no prior I/O errors in the logs.
The master has not crashed and is working fine, with no errors in its log.

The segfault happens while replaying the following record (if I stop recovery at the previous record, the segfault is not triggered):

rmgr: Btree len (rec/tot): 3758/ 5774, tx: 1711720455, lsn: 1F36/30E3C7B8, prev 1F36/30E3C760, desc: SPLIT_L level: 0, firstrightoff: 140, newitemoff: 140, postingoff: 0, blkref #0: rel 1663/16385/151181595 blk 63203 FPW, blkref #1: rel 1663/16385/151181595 blk 112208, blkref #2: rel 1663/16385/151181595 blk 108144 FPW

The WAL segment containing this record is attached.
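
For anyone comparing this record with the on-disk index, the block references in the pg_waldump line above can be pulled apart mechanically. The sketch below is only an illustration: `block_refs` is a hypothetical helper, the regular expression is fitted to the one line quoted in this message and is not a general pg_waldump parser, and the field names in the result are my own labels for the `rel a/b/c` triple (tablespace, database, relfilenode).

```python
import re

# The record exactly as quoted in the message above.
RECORD = (
    "rmgr: Btree len (rec/tot): 3758/ 5774, tx: 1711720455, "
    "lsn: 1F36/30E3C7B8, prev 1F36/30E3C760, desc: SPLIT_L level: 0, "
    "firstrightoff: 140, newitemoff: 140, postingoff: 0, "
    "blkref #0: rel 1663/16385/151181595 blk 63203 FPW, "
    "blkref #1: rel 1663/16385/151181595 blk 112208, "
    "blkref #2: rel 1663/16385/151181595 blk 108144 FPW"
)

# Matches one "blkref #N: rel a/b/c blk N [FPW]" clause.
BLKREF = re.compile(r"blkref #(\d+): rel (\d+)/(\d+)/(\d+) blk (\d+)( FPW)?")


def block_refs(line):
    """Return one dict per block reference found in a pg_waldump line."""
    refs = []
    for m in BLKREF.finditer(line):
        refs.append({
            "id": int(m.group(1)),
            "tablespace": int(m.group(2)),
            "database": int(m.group(3)),
            "relfilenode": int(m.group(4)),
            "block": int(m.group(5)),
            "fpw": m.group(6) is not None,   # full-page image attached?
        })
    return refs


for ref in block_refs(RECORD):
    print(ref["id"], ref["block"], "FPW" if ref["fpw"] else "no FPW")
```

Of the three references, only block 112208 carries no full-page image, so it is the one whose contents would have to be rebuilt from the record's own data during replay. That seems consistent with the crash being in `_bt_restore_page`, but this is an inference from the record description and not something verified here.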


