Re: Strange failure on mamba - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Strange failure on mamba
Msg-id 20221118032523.26t4njvrqijapxff@awork3.anarazel.de
In response to Re: Strange failure on mamba  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Sending SIGABRT to child processes (was Re: Strange failure on mamba)
List pgsql-hackers

Hi,

On 2022-11-17 17:47:50 -0500, Tom Lane wrote:
> Yeah, that or some other NetBSD bug could be the explanation, too.
> Without a stack trace it's hard to have any confidence about it,
> but I've been unable to reproduce the problem outside the buildfarm.
> (Which is a familiar refrain.  I wonder what it is about the buildfarm
> environment that makes it act different from the exact same code running
> on the exact same machine.)
> 
> So I'd like to have some way to make the postmaster send SIGABRT instead
> of SIGKILL in the buildfarm environment.  The lowest-tech way would be
> to drive that off some #define or other.  We could scale it up to a GUC
> perhaps.  Adjacent to that, I also wonder whether SIGABRT wouldn't be
> more useful than SIGSTOP for the existing SendStop half-a-feature ---
> the idea that people should collect cores manually seems mighty
> last-century.

I suspect that having a GUC would be a good idea. I needed something similar
recently, debugging an occasional hang in the AIO patchset. I first tried
something like your #define approach and it did cause a problematic flood of
core files.
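
To make the idea concrete, here is a rough sketch of picking the signal from a
setting. It is not postmaster code: send_abort_for_kill, stuck_child_signal()
and kill_stuck_child() are hypothetical names, and the boolean merely stands in
for whatever #define or GUC would end up being used.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical setting; stands in for a #define or a GUC. */
bool send_abort_for_kill = false;

/* Choose the signal used to force a stuck child down. */
int
stuck_child_signal(void)
{
    return send_abort_for_kill ? SIGABRT : SIGKILL;
}

/*
 * Fork a child that only sleeps, signal it, and return the signal that
 * terminated it, or -1 on any error.
 */
int
kill_stuck_child(void)
{
    pid_t pid = fork();
    int   status;

    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        /* Child: wait until the parent's signal arrives. */
        for (;;)
            pause();
    }

    if (kill(pid, stuck_child_signal()) != 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

With the flag set, every child signalled this way may leave a core file if the
core size limit allows it, which is where the flood comes from once many
processes are involved.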

I ended up using libbacktrace to generate useful backtraces (vs what
backtrace_symbols() generates) when receiving SIGQUIT. I didn't do the legwork
to make it properly signal safe, but it'd be doable afaiu.
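
As a rough illustration of the shape of that, the sketch below prints a
backtrace from a SIGQUIT handler. Note the swap: it uses glibc's backtrace()
and backtrace_symbols_fd() from <execinfo.h> rather than libbacktrace, so the
output is the less useful kind, and it assumes a glibc system. The names
sigquit_backtrace_handler, install_sigquit_backtrace and sigquit_backtraces are
made up for the example.

```c
#define _POSIX_C_SOURCE 200809L

#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Counts how many times the handler ran; hypothetical, for the example. */
volatile sig_atomic_t sigquit_backtraces = 0;

static void
sigquit_backtrace_handler(int signo)
{
    void *frames[MAX_FRAMES];
    int   nframes;

    (void) signo;

    /* backtrace_symbols_fd() writes straight to the fd without malloc(). */
    nframes = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, nframes, STDERR_FILENO);

    sigquit_backtraces++;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int
install_sigquit_backtrace(void)
{
    struct sigaction sa;
    void  *warmup[1];

    /*
     * The first backtrace() call may load libgcc, which allocates memory.
     * Doing it once here keeps that out of the signal handler.
     */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigquit_backtrace_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;

    return sigaction(SIGQUIT, &sa, NULL);
}
```

After install_sigquit_backtrace(), sending SIGQUIT to the process prints raw
frame addresses (plus symbol names where they are exported) to stderr and
keeps running.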

Greetings,

Andres Freund


