Re: stress test for parallel workers - Mailing list pgsql-hackers

From Andres Freund
Subject Re: stress test for parallel workers
Date
Msg-id 20191011203141.xjhgln2vwhemzvra@alap3.anarazel.de
In response to Re: stress test for parallel workers  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: stress test for parallel workers
List pgsql-hackers
Hi,

On 2019-10-11 14:56:41 -0400, Tom Lane wrote:
> I still don't have a good explanation for why this only seems to
> happen in the pg_upgrade test sequence.  However, I did notice
> something very interesting: the postmaster crashes after consuming
> only about 1MB of stack space.  This is despite the prevailing
> setting of "ulimit -s" being 8192 (8MB).  I also confirmed that
> the value of max_stack_depth within the crashed process is 2048,
> which implies that get_stack_depth_rlimit got some value larger
> than 2MB from getrlimit(RLIMIT_STACK).  And yet, here we have
> a crash, and the process memory map confirms that only 1MB was
> allocated in the stack region.  So it's really hard to explain
> that as anything except a kernel bug: sometimes, the kernel
> doesn't give us as much stack as it promised it would.  And the
> machine is not loaded enough for there to be any rational
> resource-exhaustion excuse for that.

Linux expands stack space only on demand, so it's possible to run out
of stack space even though the rlimit says there ought to be more
available. Unfortunately the failure happens during a stack expansion,
which means there's no easy place to report it.  I've seen this be hit
in production on busy machines.
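To make the "promised vs. actually mapped" comparison concrete, here is a minimal sketch, assuming Linux and the usual "[stack]" tag in /proc/<pid>/maps. The function names (stack_rlimit_bytes, stack_mapping_bytes, mapped_stack_bytes) are made up for illustration and are not PostgreSQL code.

```c
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Soft RLIMIT_STACK in bytes, or -1 on error or when unlimited. */
long stack_rlimit_bytes(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return -1;
    return (long) rl.rlim_cur;
}

/*
 * Size in bytes of the region described by one maps line, if that line
 * is tagged "[stack]"; -1 otherwise.
 */
long stack_mapping_bytes(const char *maps_line)
{
    unsigned long start, end;

    if (strstr(maps_line, "[stack]") == NULL)
        return -1;
    if (sscanf(maps_line, "%lx-%lx", &start, &end) != 2 || end < start)
        return -1;
    return (long) (end - start);
}

/* Scan a maps file (e.g. "/proc/self/maps") for the "[stack]" region. */
long mapped_stack_bytes(const char *maps_path)
{
    char  line[512];
    long  result = -1;
    FILE *fp = fopen(maps_path, "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL)
    {
        long sz = stack_mapping_bytes(line);

        if (sz >= 0)
        {
            result = sz;
            break;
        }
    }
    fclose(fp);
    return result;
}
```

Comparing stack_rlimit_bytes() with mapped_stack_bytes("/proc/self/maps") shows how much of the limit has actually been mapped so far; the second number normally starts small and grows as the stack is touched.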

I wonder if the machine is configured with overcommit_memory=2,
i.e. don't overcommit.  cat /proc/sys/vm/overcommit_memory would tell.
What does grep -E '^(Mem|Commit)' /proc/meminfo show while it's
happening?
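The same two checks can be done programmatically. This is only a sketch, assuming the standard Linux /proc layout; the helper names (parse_meminfo_line, read_meminfo_kb, read_overcommit_mode) are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/meminfo line of the form "Key:   12345 kB".
 * Returns 1 and stores the number in *kb if the line is for `key`.
 */
int parse_meminfo_line(const char *line, const char *key, long *kb)
{
    size_t n = strlen(key);

    if (strncmp(line, key, n) != 0 || line[n] != ':')
        return 0;
    return sscanf(line + n + 1, "%ld", kb) == 1;
}

/* Value in kB of `key` from a meminfo-style file, or -1 if not found. */
long read_meminfo_kb(const char *path, const char *key)
{
    char  line[256];
    long  kb = -1;
    long  found = -1;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL)
    {
        if (parse_meminfo_line(line, key, &kb))
        {
            found = kb;
            break;
        }
    }
    fclose(fp);
    return found;
}

/* Contents of an overcommit_memory-style file (0, 1 or 2), or -1 on error. */
int read_overcommit_mode(const char *path)
{
    int   mode = -1;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", &mode) != 1)
        mode = -1;
    fclose(fp);
    return mode;
}
```

If read_overcommit_mode("/proc/sys/vm/overcommit_memory") returns 2, the interesting number is CommitLimit minus Committed_AS from /proc/meminfo: in that mode the kernel refuses new commitments once Committed_AS would exceed CommitLimit.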

What does the signal information say? You can see it in gdb with
p $_siginfo
after receiving the signal. A SIGSEGV here, I assume.

IIRC si_code and si_errno should indicate whether ENOMEM is the reason.


> This matches up with the intermittent infinite_recurse failures
> we've been seeing in the buildfarm.  Those are happening across
> a range of systems, but they're (almost) all Linux-based ppc64,
> suggesting that there's a longstanding arch-specific kernel bug
> involved.  For reference, I scraped the attached list of such
> failures in the last three months.  I wonder whether we can get
> the attention of any kernel hackers about that.

Most of them are operated by Mark, right? So it could also just be high
memory pressure on those.

> Anyway, as to what to do about it --- it occurred to me to wonder
> why we are relying on having the signal handlers block and unblock
> signals manually, when we could tell sigaction() that we'd like
> signals blocked.  It is reasonable to expect that the signal support
> is designed to not recursively consume stack space in the face of
> a series of signals, while the way we are doing it clearly opens
> us up to recursive space consumption.  The stack trace I showed
> before proves that the recursion happens at the points where the
> signal handlers unblock signals.

Yeah, that seems like it might be good. But we have to be careful too, as
there are some things where we do want to be interruptible from within a
signal handler. We start some processes from within one, after all...
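For reference, this is roughly what "telling sigaction() that we'd like signals blocked" looks like. It is a standalone sketch with made-up names, not the postmaster's handler setup: SIGUSR2 is put in sa_mask, and the handler records what the kernel actually blocked while it ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler: which signals were blocked while it ran? */
volatile sig_atomic_t saw_usr1_blocked = 0;
volatile sig_atomic_t saw_usr2_blocked = 0;

static void usr1_handler(int signo)
{
    sigset_t cur;

    (void) signo;
    /* A NULL `set` only queries the current mask; nothing is changed. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) == 0)
    {
        saw_usr1_blocked = (sigismember(&cur, SIGUSR1) == 1);
        saw_usr2_blocked = (sigismember(&cur, SIGUSR2) == 1);
    }
}

/*
 * Install usr1_handler for SIGUSR1 and ask the kernel to also block
 * SIGUSR2 for the duration of the handler.  Returns 0 on success.
 */
int install_blocking_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = usr1_handler;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGUSR2);
    return sigaction(SIGUSR1, &sa, NULL);
}
```

The kernel applies sa_mask (plus the delivered signal itself, absent SA_NODEFER) on handler entry and restores the old mask on return, so the handler never has to unblock anything by hand. The flip side is the caveat above: anything the handler must remain interruptible by has to be left out of sa_mask.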

Greetings,

Andres Freund


