Hi,
Occasionally I see core dumps for sh, cp, etc. when running the tests. I think
this is mainly due to immediate shutdowns / crashes signalling the entire
process group with SIGQUIT. If a sh/cp/... is running as part of an
archive/restore command when the signal arrives, we'll trigger a coredump,
because those tools won't have a SIGQUIT handler.
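
To illustrate the default dispositions involved, here is a minimal standalone sketch (not PostgreSQL code; signal_idle_child() is a made-up name). It forks a child that, like sh or cp, installs no handler, delivers the given signal, and returns the wait status:

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code: fork a child that keeps `sig`
 * at its default disposition, deliver `sig` to it, and return the wait
 * status (or -1 on error).
 */
int
signal_idle_child(int sig)
{
	int			ready[2];
	int			status;
	char		c;
	pid_t		pid;

	if (pipe(ready) < 0)
		return -1;

	pid = fork();
	if (pid < 0)
	{
		close(ready[0]);
		close(ready[1]);
		return -1;
	}

	if (pid == 0)
	{
		/* Child: make sure the default action applies, then report ready. */
		close(ready[0]);
		signal(sig, SIG_DFL);
		if (write(ready[1], "x", 1) != 1)
			_exit(1);
		for (;;)
			pause();
	}

	/* Parent: wait until the child is set up before signalling it. */
	close(ready[1]);
	if (read(ready[0], &c, 1) != 1)
	{
		close(ready[0]);
		kill(pid, SIGKILL);
		waitpid(pid, &status, 0);
		return -1;
	}
	close(ready[0]);

	if (kill(pid, sig) < 0)
		return -1;
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return status;
}
```

In both the SIGQUIT and the SIGTERM case the child is killed by the signal, but only SIGQUIT's default action asks for a core file. Whether one is actually written depends on the core size limit and the system's core settings. Where the platform provides WCOREDUMP(), it can be applied to the returned status to check.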
ISTM that postmaster's signal_child() shouldn't send SIGQUIT to the process
group in the #ifdef HAVE_SETSID section. We've already signalled the backend
with SIGQUIT, so we could change the signal we send to the whole process group
to one that doesn't trigger core dumps by default. SIGTERM seems like it would
be the right choice.
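
Roughly what I have in mind, as a standalone sketch rather than a patch. The function names are made up, the exact set of signals forwarded to the group is an assumption for illustration, and the #ifdef HAVE_SETSID guard and error logging are left out:

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper: pick the signal to forward to the child's process
 * group. SIGQUIT is downgraded to SIGTERM so that helper programs without
 * a SIGQUIT handler exit without dumping core. Returns 0 if nothing should
 * be sent to the group.
 */
int
group_signal_for(int sig)
{
	switch (sig)
	{
		case SIGQUIT:
			return SIGTERM;
		case SIGINT:
		case SIGTERM:
		case SIGSTOP:
		case SIGKILL:
			return sig;
		default:
			return 0;
	}
}

/*
 * Sketch of the proposed behaviour: the child itself still receives the
 * original signal, the rest of its process group receives the mapped one.
 * Returns 0 on success, -1 if either kill() fails.
 */
int
signal_child_sketch(pid_t pid, int sig)
{
	int			group_sig;

	/* kill(-pid, ...) only addresses one process group if pid > 1 */
	assert(pid > 1);

	if (kill(pid, sig) < 0)
		return -1;

	group_sig = group_signal_for(sig);
	if (group_sig != 0 && kill(-pid, group_sig) < 0)
		return -1;

	return 0;
}
```

Note that kill(-pid, ...) reaches every member of the group, including the process that was just signalled directly.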
The one non-trivial aspect of this is that the group-wide signal will also be
delivered to the group leader, i.e. the child process we just sent SIGQUIT to.
That could lead to some minor test behaviour issues, because the output could
change if e.g. SIGTERM is received / processed before SIGQUIT.
Greetings,
Andres Freund