Hi,

On 2022-11-17 17:47:50 -0500, Tom Lane wrote:
> Yeah, that or some other NetBSD bug could be the explanation, too.
> Without a stack trace it's hard to have any confidence about it,
> but I've been unable to reproduce the problem outside the buildfarm.
> (Which is a familiar refrain. I wonder what it is about the buildfarm
> environment that makes it act different from the exact same code running
> on the exact same machine.)
>
> So I'd like to have some way to make the postmaster send SIGABRT instead
> of SIGKILL in the buildfarm environment. The lowest-tech way would be
> to drive that off some #define or other. We could scale it up to a GUC
> perhaps. Adjacent to that, I also wonder whether SIGABRT wouldn't be
> more useful than SIGSTOP for the existing SendStop half-a-feature ---
> the idea that people should collect cores manually seems mighty
> last-century.

I suspect that having a GUC would be a good idea. I needed something similar
recently, debugging an occasional hang in the AIO patchset. I first tried
something like your #define approach and it did cause a problematic flood of
core files.
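To make the "switchable signal" idea concrete, here is a minimal standalone sketch, not postmaster code. The flag send_abort_for_kill is a hypothetical stand-in for whatever #define or GUC would be chosen, and kill_stuck_child() is an illustrative helper that forks a child which just sleeps, signals it, and reports which signal terminated it.

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a GUC or #define; not an existing setting. */
static bool send_abort_for_kill = false;

/* Pick the signal used to take down a child that refuses to exit. */
static int
stuck_child_signal(void)
{
    /* SIGABRT terminates with a core dump (if enabled); SIGKILL never does. */
    return send_abort_for_kill ? SIGABRT : SIGKILL;
}

/*
 * Fork a child that only sleeps, signal it, and return the signal that
 * terminated it, or -1 on failure.
 */
static int
kill_stuck_child(void)
{
    int     status;
    pid_t   pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        /* Child: simulate a process that never exits on its own. */
        for (;;)
            pause();
    }

    if (kill(pid, stuck_child_signal()) != 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

With the flag off the child dies from SIGKILL as before; with it on, every stuck child gets SIGABRT, so whether core files pile up depends on how many children are signalled and on the core size limit in effect.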

I ended up using libbacktrace to generate useful backtraces (vs what
backtrace_symbols() generates) when receiving SIGQUIT. I didn't do the legwork
to make it properly signal safe, but it'd be doable afaiu.

Greetings,
Andres Freund