Re: Printing backtrace of postgres processes - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Printing backtrace of postgres processes
Msg-id CALj2ACUNZVB0cQovvKBd53-upsMur8j-5_K=-fg86uAa+WYEWg@mail.gmail.com
In response to Re: Printing backtrace of postgres processes  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Printing backtrace of postgres processes
List pgsql-hackers
On Fri, Nov 11, 2022 at 7:59 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Thu, 10 Nov 2022 15:56:35 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
> > On Mon, Apr 18, 2022 at 9:10 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > The attached v21 patch has the changes for the same.
> >
> > Thanks for the patch. Here are some comments:
> >
> > 1. I think there's a fundamental problem with this patch, that is, it
> > prints the backtrace when the interrupt is processed but not when
> > interrupt is received. This way, the backends/aux processes will
>
> Yeah, but the obstacle was backtrace(3) itself. Andres pointed [1]
> that that may be doable with some care (and I agree to the opinion).
> AFAIS no discussions followed and things have been to the current
> shape since then.
>
>
> [1] https://www.postgresql.org/message-id/20201201032649.aekv5b5dicvmovf4%40alap3.anarazel.de
> | > Surely this is *utterly* unsafe.  You can't do that sort of stuff in
> | > a signal handler.
> |
> | That's of course true for the current implementation - but I don't think
> | it's a fundamental constraint. With a bit of care backtrace() and
> | backtrace_symbols() itself can be signal safe:
>
> man 3 backtrace
> >  *  backtrace()  and  backtrace_symbols_fd() don't call malloc() explic‐
> >     itly, but they are part of libgcc,  which  gets  loaded  dynamically
> >     when  first  used.   Dynamic loading usually triggers a call to mal‐
> >     loc(3).  If you need certain calls to these  two  functions  to  not
> >     allocate  memory (in signal handlers, for example), you need to make
> >     sure libgcc is loaded beforehand.

I missed that part. Thanks for pointing it out. backtrace_symbols()
returns a malloc'ed array [1], so it can't be used in signal handlers
(calling malloc() there can cause deadlocks, as per [2]), and the
existing set_backtrace() uses it. Therefore, we need to either change
set_backtrace() to use backtrace_symbols_fd() instead of
backtrace_symbols(), or introduce another function for the purpose of
this feature. Once that is done, we can think about preloading libgcc,
which should make backtrace() and backtrace_symbols_fd() safe to use
in signal handlers.

It looks like we don't currently load libgcc explicitly into any of
the postgres processes; please correct me if I'm wrong. If that's the
case, is it acceptable to load libgcc into every postgres process for
the sake of this feature?

[1] https://linux.die.net/man/3/backtrace_symbols
[2] https://stackoverflow.com/questions/40049751/malloc-inside-linux-signal-handler-cause-deadlock

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


