Re: stress test for parallel workers - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: stress test for parallel workers
Date
Msg-id CA+hUKGKNgufn12Uh4iEh5y=JkEnUBWnLLmi8L4zwzMunFeKwSA@mail.gmail.com
In response to Re: stress test for parallel workers  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Oct 12, 2019 at 7:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This matches up with the intermittent infinite_recurse failures
> we've been seeing in the buildfarm.  Those are happening across
> a range of systems, but they're (almost) all Linux-based ppc64,
> suggesting that there's a longstanding arch-specific kernel bug
> involved.  For reference, I scraped the attached list of such
> failures in the last three months.  I wonder whether we can get
> the attention of any kernel hackers about that.

Yeah, I don't know anything about this stuff, but I was also beginning
to wonder if something is busted in the arch-specific fault.c code
that checks if stack expansion is valid[1], in a way that fails with a
rapidly growing stack, well-timed incoming signals, and perhaps
Docker/LXC (that's on Mark's systems IIUC, not sure about the ARM
boxes that failed or if it could be relevant here).  Perhaps the
arbitrary tolerances mentioned in that comment are relevant.

[1] https://github.com/torvalds/linux/blob/master/arch/powerpc/mm/fault.c#L244
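
To make the suspicion about tolerances concrete, here is a simplified userspace model of a stack-expansion check of the kind [1] performs. It is a sketch only, not the kernel's code: the function name, the parameters, and both constants are assumptions chosen for illustration, and the real check may look at more than this (the model ignores the faulting instruction entirely).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only (assumptions, not copied from the kernel). */
#define EXPANSION_WINDOW 0x100000u /* growth allowed without further checks */
#define SP_SLACK         2048u     /* tolerated distance below the stack pointer */

/*
 * Hypothetical model: returns true if a fault at `addr`, below the current
 * stack mapping, would be refused as a stack expansion.
 *
 *   stack_top  highest address of the stack mapping
 *   sp         user stack pointer at the time of the fault
 */
static bool model_bad_stack_expansion(uint64_t addr, uint64_t stack_top,
                                      uint64_t sp)
{
    /* Faults within the window below the top of the stack are always allowed. */
    if (addr + EXPANSION_WINDOW >= stack_top)
        return false;

    /* Beyond the window, the access must land close to the stack pointer. */
    if (addr + SP_SLACK >= sp)
        return false;

    /* Too far below the stack pointer: treated as a bad access. */
    return true;
}
```

In this model, a write that lands more than SP_SLACK bytes below the stack pointer once the stack has already grown past EXPANSION_WINDOW is refused. If something like a signal frame were written that far down at the wrong moment, the process would get a fault rather than a larger stack, which is the kind of failure speculated about above.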