Re: BUG #14180: Segmentation fault on replication slave - Mailing list pgsql-bugs

From Bo Ørsted Andresen
Subject Re: BUG #14180: Segmentation fault on replication slave
Date
Msg-id VI1PR04MB1488D19F84ADF821932A0218CB5E0@VI1PR04MB1488.eurprd04.prod.outlook.com
In response to Re: BUG #14180: Segmentation fault on replication slave  (Bo Ørsted Andresen <boa@neogrid.dk>)
List pgsql-bugs
> > On 2016-06-07 20:08, Tom Lane wrote:
> > I think the reason for the lack of useful backtrace info is that we've
> > smashed the stack.  Note that the original report shows i == 3324
> > which is much larger than the available length of the local items[]
> > array (408).
> > So presumably, the passed-in "len" was bogus (much too large).
> >
> > If you're prepared to build a custom version of Postgres, you could
> > try adding this to _bt_restore_page():
> >
> >         /* Need to copy tuple header due to alignment considerations */
> >         memcpy(&itupdata, from, sizeof(IndexTupleData));
> >         itemsz = IndexTupleDSize(itupdata);
> >         itemsz = MAXALIGN(itemsz);
> >
> > +        if (i >= lengthof(items))
> > +            elog(PANIC, "too many items on btree page");
> > +
> >         items[i] = (Item) from;
> >         itemsizes[i] = itemsz;
> >         i++;
> >
> >         from += itemsz;
> >
> > and then you should get a core dump before the stack is clobbered.
> >
> > I wonder whether we shouldn't add such a check to the regular sources...

Logged:

LOG:  started streaming WAL from primary at 631/7000000 on timeline 1
PANIC:  too many items on btree page
CONTEXT:  xlog redo Btree/SPLIT_R: level 0, firstright 139
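
For readers following along, here is a minimal standalone sketch of the kind of guard that produced the PANIC above. It is not the PostgreSQL source: TupleHeader, collect_items and the caller-supplied capacity are hypothetical stand-ins for IndexTupleData, _bt_restore_page and lengthof(items), and the -1 return stands in for elog(PANIC). The only point is that the index is checked before the fixed-size array is written, so an oversized "len" is reported instead of clobbering the stack.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical header; the real code reads an IndexTupleData instead. */
typedef struct
{
    unsigned short size;        /* total size of this tuple in bytes */
} TupleHeader;

/*
 * Walk "len" bytes of packed tuples starting at "from" and record a pointer
 * to each one in items[], which has room for "cap" entries.
 *
 * Returns the number of tuples found, or -1 if the data is inconsistent or
 * holds more tuples than items[] can store (the case where the patched
 * server raises PANIC).
 */
int
collect_items(const char *from, size_t len, const char **items, size_t cap)
{
    const char *end = from + len;
    size_t      i = 0;

    while (from < end)
    {
        TupleHeader hdr;

        /* A full header must still fit in the remaining bytes. */
        if ((size_t) (end - from) < sizeof(hdr))
            return -1;

        /* Copy the header out: the source buffer may be unaligned. */
        memcpy(&hdr, from, sizeof(hdr));

        /* Check the index BEFORE writing, so the array is never overrun. */
        if (i >= cap)
            return -1;

        /* Reject sizes that would loop forever or run past the buffer. */
        if (hdr.size < sizeof(hdr) || hdr.size > (size_t) (end - from))
            return -1;

        items[i] = from;
        i++;
        from += hdr.size;
    }

    return (int) i;
}
```

With a check like this in place, a bogus length is caught on the first out-of-range index rather than after thousands of writes past the end of the array, which is why the backtrace below is readable.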

Backtrace:

# gdb -p 10069
GNU gdb (Ubuntu 7.11-0ubuntu1) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 10069
Reading symbols from /usr/local/pgsql/bin/postgres...done.
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/librt-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/libdl-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/libm-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/libc-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...Reading symbols from
/usr/lib/debug/.build-id/b7/7847cc9cacbca3b5753d0d25a32e5795afe75b.debug...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/ld-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libnss_files.so.2...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/libnss_files-2.23.so...done.
done.
0x00007ffff73f3e70 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:84
84      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) set pagination off
(gdb) set logging file /tmp/debuglog-20160608-2.txt
(gdb) set logging on
Copying output to /tmp/debuglog-20160608-2.txt.
(gdb) handle SIGUSR1 nostop
Signal        Stop      Print   Pass to program Description
SIGUSR1       No        Yes     Yes             User defined signal 1
(gdb) handle SIGUSR1 noprint
Signal        Stop      Print   Pass to program Description
SIGUSR1       No        No      Yes             User defined signal 1
(gdb) cont
Continuing.
(gdb)
Program received signal SIGABRT, Aborted.
0x00007ffff732e418 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff732e418 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff733001a in __GI_abort () at abort.c:89
#2  0x000000000078ccaa in errfinish (dummy=dummy@entry=0) at elog.c:551
#3  0x000000000079074a in elog_finish (elevel=elevel@entry=22, fmt=fmt@entry=0x7cb187 "too many items on btree page") at elog.c:1368
#4  0x00000000004ae437 in _bt_restore_page (page=page@entry=0x7fffefa2cb40 "", from=<optimized out>, from@entry=0xc52e70 "\036", len=<optimized out>) at nbtxlog.c:58
#5  0x00000000004ae8a4 in btree_xlog_split (onleft=onleft@entry=0 '\000', isroot=isroot@entry=0 '\000', record=record@entry=0xc3b840) at nbtxlog.c:241
#6  0x00000000004aee1c in btree_redo (record=0xc3b840) at nbtxlog.c:984
#7  0x00000000004d5c2b in StartupXLOG () at xlog.c:6825
#8  0x000000000064e212 in StartupProcessMain () at startup.c:215
#9  0x00000000004e3168 in AuxiliaryProcessMain (argc=argc@entry=2, argv=argv@entry=0x7fffffffe3e0) at bootstrap.c:418
#10 0x000000000064b698 in StartChildProcess (type=StartupProcess) at postmaster.c:5199
#11 0x000000000064dc84 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0xc1b9f0) at postmaster.c:1284
#12 0x0000000000467950 in main (argc=3, argv=0xc1b9f0) at main.c:228

Regards,
Bo Ørsted Andresen
