We shouldn't signal process groups with SIGQUIT - Mailing list pgsql-hackers

From Andres Freund
Subject We shouldn't signal process groups with SIGQUIT
Msg-id 20230214202927.xgb2w6b7gnhq6tvv@awork3.anarazel.de
List pgsql-hackers
Hi,

The default reaction to SIGQUIT is to create core dumps. We use SIGQUIT to
implement immediate shutdowns. We send the signal to the entire process group.

The result of that is that we regularly produce core dumps for binaries like
sh/cp. I regularly see this on my local system, and I've seen it on CI. Recently
Thomas added logic to show core dumps happening in cfbot ([1]). There are plenty
of unrelated core dumps, but also lots in sh/cp ([2]).

We found a bunch of issues as part of [3], but I think the issue I'm
discussing here is separate.


ISTM that signal_child() should downgrade SIGQUIT to SIGTERM when sending to
the process group. That way we'd maintain the current behaviour for postgres
itself, but stop core-dumping archive/restore scripts (as well as other
subprocesses that e.g. trusted PLs might create).
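
To make the proposal concrete, here is a rough standalone sketch of the idea,
not a patch against postmaster.c. The names signal_child_sketch() and
group_signal_for() are made up for illustration, and error reporting uses
fprintf() rather than elog() so the snippet compiles on its own. It assumes the
child is the leader of its own process group, so that -pid addresses that group.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: pick the signal the child's process group should
 * receive. SIGQUIT is downgraded to SIGTERM so that subprocesses such as
 * sh/cp exit without dumping core; other signals pass through unchanged.
 */
int
group_signal_for(int signo)
{
    return (signo == SIGQUIT) ? SIGTERM : signo;
}

/*
 * Hypothetical stand-in for signal_child(): the child itself gets the
 * original signal, its process group gets the possibly downgraded one.
 */
void
signal_child_sketch(pid_t pid, int signo)
{
    /* The postgres child keeps seeing the original signal. */
    if (kill(pid, signo) < 0)
        fprintf(stderr, "kill(%ld,%d) failed: %s\n",
                (long) pid, signo, strerror(errno));

    switch (signo)
    {
        case SIGINT:
        case SIGTERM:
        case SIGQUIT:
            /* A negative pid addresses the whole process group. */
            if (kill(-pid, group_signal_for(signo)) < 0)
                fprintf(stderr, "kill(%ld,%d) failed: %s\n",
                        (long) (-pid), group_signal_for(signo),
                        strerror(errno));
            break;
        default:
            break;
    }
}
```

With something like this, signal_child_sketch(pid, SIGQUIT) would still deliver
SIGQUIT to the postgres child, while anything else in its process group would
only see SIGTERM.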


Makes sense?


Greetings,

Andres Freund


[1] http://cfbot.cputube.org/highlights/core.html

[2] A small sample:
https://api.cirrus-ci.com/v1/task/5939902693507072/logs/cores.log
https://api.cirrus-ci.com/v1/task/5549174150660096/logs/cores.log
https://api.cirrus-ci.com/v1/task/6153817767542784/logs/cores.log
https://api.cirrus-ci.com/v1/task/6567335205535744/logs/cores.log
https://api.cirrus-ci.com/v1/task/4804998119292928/logs/cores.log

[3] https://postgr.es/m/Y9nGDSgIm83FHcad%40paquier.xyz


