Re: BUG #14180: Segmentation fault on replication slave - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #14180: Segmentation fault on replication slave
Date
Msg-id 20105.1465322852@sss.pgh.pa.us
In response to Re: BUG #14180: Segmentation fault on replication slave  (Bo Ørsted Andresen <boa@neogrid.dk>)
Responses Re: BUG #14180: Segmentation fault on replication slave  (Bo Ørsted Andresen <boa@neogrid.dk>)
List pgsql-bugs
Bo Ørsted Andresen <boa@neogrid.dk> writes:
>> On 2016-06-07 19:41, Andres Freund wrote:
>> Any chance the running version of postgres is out of date with the installed
>> binaries / debug symbols?

> You mean that I upgraded without restarting postgres before the segfault?

I think the reason for the lack of useful backtrace info is that we've
smashed the stack.  Note that the original report shows i == 3324 which is
much larger than the available length of the local items[] array (408).
So presumably, the passed-in "len" was bogus (much too large).

If you're prepared to build a custom version of Postgres, you could
try adding this to _bt_restore_page():
     /* Need to copy tuple header due to alignment considerations */
     memcpy(&itupdata, from, sizeof(IndexTupleData));
     itemsz = IndexTupleDSize(itupdata);
     itemsz = MAXALIGN(itemsz);
+    if (i >= lengthof(items))
+        elog(PANIC, "too many items on btree page");
     items[i] = (Item) from;
     itemsizes[i] = itemsz;
     i++;
     from += itemsz;

and then you should get a core dump before the stack is clobbered.

I wonder whether we shouldn't add such a check to the regular sources...
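
For readers who want to see the effect of such a check in isolation, here is a minimal standalone sketch. It is not PostgreSQL source. The names ToyTupleHeader, MAX_ITEMS, ALIGN8, LENGTHOF, build_buffer and collect_items_checked are hypothetical stand-ins for IndexTupleData, the size of the local items[] array, MAXALIGN, lengthof and the loop in _bt_restore_page(). It returns -1 where the patch above would call elog(PANIC), and it adds two length checks of its own so that the toy cannot read or step past the input buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ITEMS 8                                   /* hypothetical size of items[] */
#define ALIGN8(n) (((size_t) (n) + 7u) & ~(size_t) 7u) /* stand-in for MAXALIGN */
#define LENGTHOF(a) (sizeof(a) / sizeof((a)[0]))      /* stand-in for lengthof */

/* Simplified stand-in for IndexTupleData: just a total-size field. */
typedef struct
{
    uint16_t size;
} ToyTupleHeader;

/* Fill buf with ntuples packed 8-byte toy tuples; returns bytes written. */
static size_t
build_buffer(char *buf, size_t cap, size_t ntuples)
{
    ToyTupleHeader hdr = {8};
    size_t      off = 0;

    assert(ntuples * 8 <= cap);
    memset(buf, 0, cap);
    for (size_t k = 0; k < ntuples; k++)
    {
        memcpy(buf + off, &hdr, sizeof(hdr));
        off += 8;
    }
    return off;
}

/*
 * Walk the packed tuples in [from, from + len), recording each one in
 * fixed-size local arrays. Returns the item count, or -1 if the input
 * is malformed or describes more items than the arrays can hold.
 */
static int
collect_items_checked(const char *from, size_t len)
{
    const char *items[MAX_ITEMS];
    size_t      itemsizes[MAX_ITEMS];
    const char *end = from + len;
    size_t      i = 0;

    while (from < end)
    {
        ToyTupleHeader hdr;
        size_t      itemsz;

        /* Extra toy-only check: a full header must be available. */
        if ((size_t) (end - from) < sizeof(hdr))
            return -1;

        /* Copy the header out, since "from" may not be aligned. */
        memcpy(&hdr, from, sizeof(hdr));
        itemsz = ALIGN8(hdr.size);

        /* Extra toy-only check: the tuple must fit in the remaining input. */
        if (itemsz == 0 || itemsz > (size_t) (end - from))
            return -1;

        /* The check under discussion: stop before writing past the arrays. */
        if (i >= LENGTHOF(items))
            return -1;

        items[i] = from;
        itemsizes[i] = itemsz;
        i++;
        from += itemsz;
    }

    (void) items;
    (void) itemsizes;
    return (int) i;
}
```

With a buffer that holds MAX_ITEMS tuples or fewer, the function returns the item count. With one tuple more it returns -1 before anything is written past the end of the local arrays, which is the point at which the patched server would PANIC and dump core with the stack still intact.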
        regards, tom lane


