Thread: Async-unsafe functions in signal handlers
Hello all,

I am going to refactor Greenplum backtraces for error messages and want to make them more compatible with PostgreSQL code. Backtraces in PostgreSQL were introduced by commit 71a8a4f6e36547bb060dbcc961ea9b57420f7190 (original discussion: https://www.postgresql.org/message-id/CAMsr+YGL+yfWE=JvbUbnpWtrRZNey7hJ07+zT4bYJdVp4Szdrg@mail.gmail.com) and rely on the backtrace() and backtrace_symbols() functions. They are used inside errfinish(), which is wrapped by the ereport() macro. ereport() is invoked inside the bgworker_die() and FloatExceptionHandler() signal handlers. I am confused by this fact: both backtrace functions are async-signal-unsafe, backtrace_symbols() always and backtrace() only on its first call, due to dlopen. I wonder why PostgreSQL uses async-unsafe functions in signal handlers?

Best regards,
Denis Smirnov | Developer
sd@arenadata.io
Arenadata | Godovikova 9-17, Moscow 129085 Russia
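For reference, a minimal sketch of the two glibc interfaces in question (assuming Linux and <execinfo.h>; this is illustrative only, not PostgreSQL code). backtrace_symbols() returns a malloc()ed array, which is the async-signal-safety hazard discussed in this thread, while backtrace_symbols_fd() writes the same lines straight to a file descriptor and is documented not to call malloc():

#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void
dump_backtrace(void)
{
    void   *frames[100];
    int     nframes = backtrace(frames, 100);

    /*
     * Roughly the path set_backtrace() in elog.c takes: the symbol strings
     * come back in one malloc()ed block, so this branch is not
     * async-signal-safe.
     */
    char  **symbols = backtrace_symbols(frames, nframes);

    if (symbols != NULL)
    {
        for (int i = 0; i < nframes; i++)
            fprintf(stderr, "%s\n", symbols[i]);   /* print for illustration */
        free(symbols);
    }

    /* Malloc-free alternative: write the symbol lines directly to a fd. */
    backtrace_symbols_fd(frames, nframes, STDERR_FILENO);
}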
> On 25 Aug 2021, at 19:22, Denis Smirnov <sd@arenadata.io> wrote:
>
> I am going to refactor Greenplum backtraces for error messages and want to make them more compatible with PostgreSQL code. Backtraces in PostgreSQL were introduced by commit 71a8a4f6e36547bb060dbcc961ea9b57420f7190 (original discussion: https://www.postgresql.org/message-id/CAMsr+YGL+yfWE=JvbUbnpWtrRZNey7hJ07+zT4bYJdVp4Szdrg@mail.gmail.com) and rely on the backtrace() and backtrace_symbols() functions. They are used inside errfinish(), which is wrapped by the ereport() macro. ereport() is invoked inside the bgworker_die() and FloatExceptionHandler() signal handlers. I am confused by this fact: both backtrace functions are async-signal-unsafe, backtrace_symbols() always and backtrace() only on its first call, due to dlopen. I wonder why PostgreSQL uses async-unsafe functions in signal handlers?

In my view, the GUC backtrace_functions is expected to be used for debugging purposes, not to be enabled on a production server for bgworker_die() or FloatExceptionHandler(). Is there any way to call backtrace_symbols() without touching backtrace_functions?

Best regards, Andrey Borodin.
As far as I understand, the main problem with backtrace_symbols() is the internal malloc() call. A backend can lock up forever if malloc() is interrupted by a signal and then called again inside the signal handler. At the moment Greenplum uses "addr2line -s -e" (on Linux) and "atos -o" (on macOS) for each stack address instead of backtrace_symbols(). Neither of these utilities uses malloc() under the hood, although there is no guarantee that this implementation never changes in the future. It seems to be a safer approach, but it looks like a dirty hack.

> On 26 Aug 2021, at 08:52, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
>> On 25 Aug 2021, at 19:22, Denis Smirnov <sd@arenadata.io> wrote:
>>
>> I am going to refactor Greenplum backtraces for error messages and want to make them more compatible with PostgreSQL code. Backtraces in PostgreSQL were introduced by commit 71a8a4f6e36547bb060dbcc961ea9b57420f7190 (original discussion: https://www.postgresql.org/message-id/CAMsr+YGL+yfWE=JvbUbnpWtrRZNey7hJ07+zT4bYJdVp4Szdrg@mail.gmail.com) and rely on the backtrace() and backtrace_symbols() functions. They are used inside errfinish(), which is wrapped by the ereport() macro. ereport() is invoked inside the bgworker_die() and FloatExceptionHandler() signal handlers. I am confused by this fact: both backtrace functions are async-signal-unsafe, backtrace_symbols() always and backtrace() only on its first call, due to dlopen. I wonder why PostgreSQL uses async-unsafe functions in signal handlers?
>
> In my view, the GUC backtrace_functions is expected to be used for debugging purposes, not to be enabled on a production server for bgworker_die() or FloatExceptionHandler().
> Is there any way to call backtrace_symbols() without touching backtrace_functions?
>
> Best regards, Andrey Borodin.

Best regards,
Denis Smirnov | Developer
sd@arenadata.io
Arenadata | Godovikova 9-17, Moscow 129085 Russia
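A rough sketch of the per-address addr2line approach described above, under assumptions (this is not actual Greenplum code: the addr2line path is hard-coded, PIE/ASLR base-address adjustment is omitted, and snprintf() and fork() in a handler carry their own caveats, which is part of why it feels like a hack):

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Resolve a single return address by exec'ing addr2line instead of calling
 * backtrace_symbols().  The symbolization happens in a child process, so the
 * backend's own malloc() state is never touched.
 */
static void
print_frame_with_addr2line(void *addr)
{
    char    addrbuf[32];
    char    exepath[4096];
    ssize_t len;
    pid_t   pid;

    /* Resolve our own binary path before fork(); readlink() is async-signal-safe. */
    len = readlink("/proc/self/exe", exepath, sizeof(exepath) - 1);
    if (len <= 0)
        return;
    exepath[len] = '\0';

    /* Note: snprintf() is not formally async-signal-safe either. */
    snprintf(addrbuf, sizeof(addrbuf), "%p", addr);

    pid = fork();
    if (pid == 0)
    {
        /* Child inherits stderr, so the resolved line lands in the log. */
        char   *argv[] = {"addr2line", "-s", "-e", exepath, addrbuf, NULL};

        execv("/usr/bin/addr2line", argv);
        _exit(1);               /* exec failed */
    }
    else if (pid > 0)
        (void) waitpid(pid, NULL, 0);
}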
On Wed, Aug 25, 2021 at 10:22 AM Denis Smirnov <sd@arenadata.io> wrote:
> I am going to refactor Greenplum backtraces for error messages and want to make them more compatible with PostgreSQL code. Backtraces in PostgreSQL were introduced by commit 71a8a4f6e36547bb060dbcc961ea9b57420f7190 (original discussion: https://www.postgresql.org/message-id/CAMsr+YGL+yfWE=JvbUbnpWtrRZNey7hJ07+zT4bYJdVp4Szdrg@mail.gmail.com) and rely on the backtrace() and backtrace_symbols() functions. They are used inside errfinish(), which is wrapped by the ereport() macro. ereport() is invoked inside the bgworker_die() and FloatExceptionHandler() signal handlers. I am confused by this fact: both backtrace functions are async-signal-unsafe, backtrace_symbols() always and backtrace() only on its first call, due to dlopen. I wonder why PostgreSQL uses async-unsafe functions in signal handlers?

That is a great question. I think bgworker_die() is extremely dangerous and ought to be removed. I can't see how that can ever be safe.

FloatExceptionHandler() is a bit different because in theory the things that could trigger SIGFPE are relatively limited, and in theory we know that those are safe places to ereport(). But I'm not very convinced that this is really true. Among other things, kill -FPE could be executed at any time, but even aside from that, I doubt we have exhaustive knowledge of everything in the code that could trigger a floating point exception.

--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> That is a great question. I think bgworker_die() is extremely
> dangerous and ought to be removed. I can't see how that can ever be
> safe.

Agreed, it looks pretty dangerous from here. The equivalent (and far better battle-tested) signal handlers in postgres.c are a lot more circumspect --- they will call stuff that's unsafe per POSIX, but not from just any interrupt point.

(BTW, I think it's pretty silly to imagine that adding backtrace() calls inside ereport is making things any more dangerous. ereport has pretty much always carried a likelihood of calling malloc(), for example.)

> FloatExceptionHandler() is a bit different because in theory the
> things that could trigger SIGFPE are relatively limited, and in theory
> we know that those are safe places to ereport(). But I'm not very
> convinced that this is really true. Among other things, kill -FPE
> could be executed at any time, but even aside from that, I doubt we have
> exhaustive knowledge of everything in the code that could trigger a
> floating point exception.

On the one hand, that's theoretically true, and on the other hand, that code's been like that since the last century and I'm unaware of any actual problems. There are not many places in the backend that do arithmetic that's likely to trigger SIGFPE. Besides which, what's the alternative? I suppose we could SIG_IGN SIGFPE, but that is almost certainly going to create real problems (i.e. missed error cases) while removing only hypothetical ones.

(The "DBA randomly issues 'kill -FPE'" scenario can be dismissed, I think --- surely that is in the category of "when you break it, you get to keep both pieces".)

The larger subtext here is that just because it's undefined per POSIX doesn't necessarily mean it's unsafe in our use-pattern.

			regards, tom lane
> On 26 Aug 2021, at 23:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> (BTW, I think it's pretty silly to imagine that adding backtrace()
> calls inside ereport is making things any more dangerous. ereport
> has pretty much always carried a likelihood of calling malloc(),
> for example.)

I have taken a look through the signal handlers and found out that many of them use malloc() via ereport() and elog(). Here is the list:

SIGUSR1:
- procsignal_sigusr1_handler(): autoprewarm, autovacuum, bgworker, bgwriter, checkpointer, pgarch, startup, walwriter, walreceiver, walsender
- sigusr1_handler(): postmaster

SIGFPE:
- FloatExceptionHandler(): autovacuum, bgworker, postgres, plperl

SIGHUP:
- SIGHUP_handler(): postmaster

SIGCHLD:
- reaper(): postmaster

SIGQUIT:
- quickdie(): postgres

SIGTERM:
- bgworker_die(): bgworker

SIGALRM:
- handle_sig_alarm(): autovacuum, bgworker, postmaster, startup, walsender, postgres

I suspect there are lots of potential ways to deadlock on malloc() inside any of these handlers. An interesting question is why there is still no evidence of such deadlocks?

Best regards,
Denis Smirnov | Developer
sd@arenadata.io
Arenadata | Godovikova 9-17, Moscow 129085 Russia
Hi,

On 2021-08-27 23:51:27 +0300, Denis Smirnov wrote:
> > On 26 Aug 2021, at 23:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > (BTW, I think it's pretty silly to imagine that adding backtrace()
> > calls inside ereport is making things any more dangerous. ereport
> > has pretty much always carried a likelihood of calling malloc(),
> > for example.)
>
> I have taken a look through the signal handlers and found out that many of them use malloc() via ereport() and elog(). Here is the list:
>
> SIGUSR1:
> - procsignal_sigusr1_handler(): autoprewarm, autovacuum, bgworker, bgwriter, checkpointer, pgarch, startup, walwriter, walreceiver, walsender

There shouldn't be meaningful uses of elog/ereport() inside procsignal_sigusr1_handler(). The exception I found was an elog(FATAL) for unreachable code.

> - sigusr1_handler(): postmaster
> SIGHUP:
> - SIGHUP_handler(): postmaster
> SIGCHLD:
> - reaper(): postmaster

I think these run in a very controlled set of circumstances, because most of postmaster runs with signals masked.

> SIGFPE:
> - FloatExceptionHandler(): autovacuum, bgworker, postgres, plperl

Yep, although as discussed this might not be a "real" problem, because it should only run during an instruction triggering an FPE.

> SIGQUIT:
> - quickdie(): postgres

Yes, this is an issue. I've previously argued for handling this via write() and _exit(), instead of the full ereport() machinery. However, we have a bandaid that deals with possible hangs, by SIGKILLing when processes don't shut down (at that point things have already gone quite south, so that's not an issue).

> SIGTERM:
> - bgworker_die(): bgworker

Bad.

> SIGALRM:
> - handle_sig_alarm(): autovacuum, bgworker, postmaster, startup, walsender, postgres

I don't think there are reachable elogs in there. I'm not concerned about e.g.

    elog(FATAL, "timeout index %d out of range 0..%d", index, num_active_timeouts - 1);

because that's not something that should ever be reachable in a production scenario. If it is, there are bigger problems.

Perhaps we ought to have a version of Assert() that's enabled in production builds as well, and that outputs the error message via write() and then _exit()s?

Greetings,

Andres Freund
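A minimal sketch of the kind of always-on assertion floated above, using only write() and _exit(), both async-signal-safe. The macro name and message format are assumptions for illustration, not an existing PostgreSQL API:

#include <string.h>
#include <unistd.h>

/*
 * Fires even in production builds; unlike Assert(), it never goes through
 * elog()/ereport(), so it is safe to hit inside a signal handler.
 */
#define PRODUCTION_ASSERT(cond, msg) \
    do { \
        if (!(cond)) \
        { \
            const char *assert_msg_ = "assertion failed: " msg "\n"; \
            (void) write(STDERR_FILENO, assert_msg_, strlen(assert_msg_)); \
            _exit(1); \
        } \
    } while (0)

/*
 * Example use inside handle_sig_alarm()-like code:
 *     PRODUCTION_ASSERT(index < num_active_timeouts, "timeout index out of range");
 */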
> On 28 Aug 2021, at 07:05, Andres Freund <andres@anarazel.de> wrote:
>
> However, we have a
> bandaid that deals with possible hangs, by SIGKILLing when processes don't
> shut down (at that point things have already gone quite south, so that's not
> an issue).

Thanks for the explanation. I can see that the child-process SIGKILL machinery was introduced by commit 82233ce7ea42d6ba519aaec63008aff49da6c7af to fix a malloc() deadlock in the quickdie() signal handler. As a result, all child processes that take too long to die are killed with SIGKILL in ServerLoop(). But bgworker_die() is a problem, as we initialize bgworkers right before ServerLoop(). So we can face a malloc() deadlock on postmaster startup (before ServerLoop() has started). Maybe we should simply use write() and exit() instead of ereport() for bgworker_die()?

Best regards,
Denis Smirnov | Developer
sd@arenadata.io
Arenadata | Godovikova 9-17, Moscow 129085 Russia
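A hypothetical signal-safe variant of bgworker_die() along the lines suggested above, using write() and _exit() only. This is not actual PostgreSQL code; the message text and exit status are assumptions, and as the next message discusses, exiting this way skips the normal proc_exit() cleanup:

#include <signal.h>
#include <string.h>
#include <unistd.h>

static void
bgworker_die_safe(int postgres_signal_arg)
{
    const char *msg =
        "FATAL:  terminating background worker due to administrator command\n";

    /* write() and _exit() are async-signal-safe; ereport() and exit() are not. */
    (void) write(STDERR_FILENO, msg, strlen(msg));
    _exit(1);
}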
Honestly, I don't know what to do with bgworker_die(). At the moment it produces an ereport(FATAL), with the async-unsafe proc_exit_prepare() and exit() under the hood. I can see three solutions:

1. Leave the code as is. Then SIGTERM can produce deadlocks in the bgworker's signal handler. The locked process can only be terminated with an immediate shutdown of the cluster. Maybe that is OK, as we don't expect SIGTERM to be sent to a bgworker too often.

2. Use the async-safe _exit() in the signal handler instead of proc_exit_prepare() and exit(). In this case we'll have to go through cluster recovery, as the bgworker doesn't properly clean up its shared memory. This solution is even worse than an immediate shutdown, as we recover for every SIGTERM that has been sent to a bgworker.

3. Set a signal flag inside the handler (something like the miscadmin.h XXX_INTERRUPTS() macros; a rough sketch follows below). It then becomes the extension developer's responsibility to handle this flag properly in the bgworker's code. This approach breaks backward compatibility.
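The sketch of option 3, assuming a hypothetical flag and check macro modeled on CHECK_FOR_INTERRUPTS(); the names here are illustrative, not an existing API:

#include "postgres.h"
#include "storage/ipc.h"

/* sig_atomic_t writes are async-signal-safe */
static volatile sig_atomic_t bgworker_terminate_pending = 0;

static void
bgworker_term_handler(SIGNAL_ARGS)
{
    /* do nothing but set a flag inside the handler */
    bgworker_terminate_pending = 1;
}

/*
 * The worker's main loop has to call this at safe points, which is the
 * backward-compatibility break: existing extensions never do.
 */
#define CHECK_FOR_BGWORKER_TERMINATION() \
    do { \
        if (bgworker_terminate_pending) \
            proc_exit(1); \
    } while (0)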
Best regards,
Denis Smirnov | Developer
sd@arenadata.io
Arenadata | Godovikova 9-17, Moscow 129085 Russia