Hi,
On 2021-08-27 23:51:27 +0300, Denis Smirnov wrote:
> > 26 авг. 2021 г., в 23:39, Tom Lane <tgl@sss.pgh.pa.us> написал(а):
> >
> > (BTW, I think it's pretty silly to imagine that adding backtrace()
> > calls inside ereport is making things any more dangerous. ereport
> > has pretty much always carried a likelihood of calling malloc(),
> > for example.)
>
> I have taken a look through the signal handlers and found out that many of them use malloc() via ereport() and
elog().Here is the list:
>
> SIGUSR1
> - procsignal_sigusr1_handler(): autoprewarm, autovacuum, bgworker, bgwriter, checkpointer, pgarch, startup,
walwriter,walreciever, walsender
There shouldn't be meaningful uses of elog/ereport() inside
procsignal_sigusr1_handler(). The exception I found was an elog(FATAL) for
unreachable code.
> - sigusr1_handler(): postmaster
> SIGHUP:
> - SIGHUP_handler(): postmaster
> SIGCHLD:
> - reaper(): postmaster
I think these runs in a very controlled set of circumstances because most of
postmaster runs with signals masked.
> SIGFPE:
> - FloatExceptionHandler(): autovacuum, bgworker, postgres, plperl
Yep, although as discussed this might not be a "real" problem because it
should only run during an instruction triggering an FPE.
> SIGQUIT:
> - quickdie(): postgres
Yes, this is an issue. I've previously argued for handling this via write()
and _exit(), instead of the full ereport() machinery. However, we have a
bandaid that deals with possible hangs, by SIGKILLing when processes don't
shut down (at that point things have already gone quite south, so that's not
an issue).
> SIGTERM:
> - bgworker_die(): bgworker
Bad.
> SIGALRM:
> - handle_sig_alarm(): autovacuum, bgworker, postmaster, startup, walsender, postgres
I don't think there reachable elogs in there. I'm not concerned about e.g.
elog(FATAL, "timeout index %d out of range 0..%d", index,
num_active_timeouts - 1);
because that's not something that should ever be reachable in a production
scenario. If it is, there's bigger problems.
Perhaps we ought to have a version of Assert() that's enabled in production
builds as well, and that outputs the error messages via write() and then
_exit()s?
Greetings,
Andres Freund