Re: stress test for parallel workers - Mailing list pgsql-hackers

From Tom Lane
Subject Re: stress test for parallel workers
Msg-id 17884.1570915552@sss.pgh.pa.us
In response to Re: stress test for parallel workers  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: stress test for parallel workers
List pgsql-hackers
I've now also been able to reproduce the "infinite_recurse" segfault
on wobbegong's host (or, since I was using a gcc build, I guess I
should say vulpes' host).  The first-order result is that it's the
same problem with the kernel not giving us as much stack space as
we expect: there are only 1179648 bytes (1152kB) in the stack segment in
the core dump, though we should certainly have been allowed at least 8MB.

The next interesting thing is that looking closely at the identified
spot of the SIGSEGV, there's nothing there that should be touching
the stack at all:

(gdb) x/4i $pc
=> 0x10201df0 <core_yylex+1072>:        ld      r9,0(r30)
   0x10201df4 <core_yylex+1076>:        ld      r8,128(r30)
   0x10201df8 <core_yylex+1080>:        ld      r10,152(r30)
   0x10201dfc <core_yylex+1084>:        ld      r9,0(r9)

(r30 is not pointing at the stack, but at a valid heap location.)
This code is the start of the switch case at scan.l:1064, so the
most recent successfully-executed instructions were the switch jump,
and they don't involve the stack either.

The reported sp,

(gdb) i reg sp
sp             0x7fffe6940890   140737061849232

is a good 2192 bytes above the bottom of the allocated stack space,
which is 0x7fffe6940000 according to gdb.  So we really ought to
have plenty of margin here.  What's going on?
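
For anyone who wants to check the arithmetic, here is a minimal sketch using only the two addresses from the gdb output above. The 1536-byte figure is my reading of "about 1.5kB" from the kernel comment and is an assumption, not a measured frame size.

```python
# Addresses copied from the gdb session quoted above.
sp = 0x7fffe6940890            # reported stack pointer
stack_bottom = 0x7fffe6940000  # lowest allocated stack address, per gdb

# Bytes of already-mapped stack remaining below sp.
margin = sp - stack_bottom

# "About 1.5kB", taken as exactly 1536 bytes for illustration (assumption).
claimed_signal_frame = 1536

# Slack left over if the kernel comment's figure were accurate.
slack = margin - claimed_signal_frame

print(margin, slack)
```

So if the 1.5kB figure were accurate, signal delivery would fit in the existing stack with several hundred bytes to spare.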

What I suspect, given the difficulty of reproducing this, is that
what really happened is that the kernel tried to deliver a SIGUSR1
signal to us just at this point.  The kernel source code that
Thomas pointed to comments that

     * The kernel signal delivery code writes up to about 1.5kB
     * below the stack pointer (r1) before decrementing it.

There's more than 1.5kB available below sp, but what if that comment
is a lie?  In particular, I'm wondering if that number dates to PPC32
and needs to be doubled, or nearly so, to describe PPC64 reality.
If that were the case, then the signal code would not have been
able to fit its requirement, and would probably have come here (that
is, to bad_stack_expansion()) to ask for more stack space, and the
hard-wired 2048 test a little further down would have decided that
that was a wild stack access.
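
To make that scenario concrete, here is a toy model of it, not the kernel's actual code. It assumes the 2048 test treats any fault more than 2048 bytes below sp as a wild access; the function name and the frame sizes tried are hypothetical and chosen only for illustration.

```python
# Addresses from the gdb output earlier in this message.
SP = 0x7fffe6940890
STACK_BOTTOM = 0x7fffe6940000

# Assumed meaning of the hard-wired test: a fault more than this many
# bytes below sp is treated as a wild access rather than stack growth.
WILD_ACCESS_LIMIT = 2048


def signal_delivery_outcome(frame_bytes):
    """Hypothetical model: what happens if the kernel writes frame_bytes below SP."""
    if SP - frame_bytes >= STACK_BOTTOM:
        # Everything lands in already-mapped stack, so no fault occurs.
        return "fits"
    # Otherwise some write faults below STACK_BOTTOM. Take the best case,
    # the unmapped byte nearest to sp.
    nearest_fault = STACK_BOTTOM - 1
    if SP - nearest_fault > WILD_ACCESS_LIMIT:
        return "SIGSEGV"
    return "expanded"


for size in (1536, 2192, 2193, 3072):
    print(size, signal_delivery_outcome(size))
```

Under these assumptions there is no frame size that both needs expansion and passes the test: sp sits 2192 bytes above the bottom of the mapping, so even the nearest unmapped byte is already more than 2048 bytes below sp.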

In short, my current belief is that Linux PPC64 fails when trying
to deliver a signal if there's right around 2KB of stack remaining,
even though it should be able to expand the stack and press on.

It may well be that the reason is just that this heuristic in
bad_stack_expansion() is out of date.  Or there might be a similarly
bogus value somewhere in the signal-delivery code.

            regards, tom lane


