Thread: Optionally using a better backtrace library?

Optionally using a better backtrace library?

From
Andres Freund
Date:
Hi,

I like that we now have a builtin backtrace ability. Unfortunately I think the
backtraces are often not very useful, because only externally visible
functions are symbolized.

E.g.:

2023-07-02 10:54:01.756 PDT [1398494][client backend][:0][[unknown]] LOG:  will crash
2023-07-02 10:54:01.756 PDT [1398494][client backend][:0][[unknown]] BACKTRACE:
    postgres: dev assert: andres postgres [local] initializing(errbacktrace+0xbb) [0x562a44c97ca9]
    postgres: dev assert: andres postgres [local] initializing(PostgresMain+0xb6) [0x562a44ac56d4]
    postgres: dev assert: andres postgres [local] initializing(+0x806add) [0x562a449f0add]
    postgres: dev assert: andres postgres [local] initializing(+0x806369) [0x562a449f0369]
    postgres: dev assert: andres postgres [local] initializing(+0x802406) [0x562a449ec406]
    postgres: dev assert: andres postgres [local] initializing(PostmasterMain+0x1676) [0x562a449ebd17]
    postgres: dev assert: andres postgres [local] initializing(+0x6ec2e2) [0x562a448d62e2]
    /lib/x86_64-linux-gnu/libc.so.6(+0x276ca) [0x7f1e820456ca]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f1e82045785]
    postgres: dev assert: andres postgres [local] initializing(_start+0x21) [0x562a445ede21]

which is far from as useful as it could be.


A lot of platforms have "libbacktrace" available, e.g. as part of gcc. I think
we should consider using it, when available, to produce more useful
backtraces.

I hacked it up for ereport() to debug something, and the backtraces are
considerably better:

2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] LOG:  will crash
2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] BACKTRACE:
    [0x55fcd03e6143] PostgresMain: ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4126
    [0x55fcd031154c] BackendRun: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
    [0x55fcd0310dd8] BackendStartup: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
    [0x55fcd030ce75] ServerLoop: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779
    [0x55fcd030c786] PostmasterMain: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1463
    [0x55fcd01f6d51] main: ../../../../home/andres/src/postgresql/src/backend/main/main.c:198
    [0x7fdd914456c9] __libc_start_call_main: ../sysdeps/nptl/libc_start_call_main.h:58
    [0x7fdd91445784] __libc_start_main_impl: ../csu/libc-start.c:360
    [0x55fccff0e890] [unknown]: [unknown]:0

The way each frame looks is my fault, not libbacktrace's...
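
For readers who have not used libbacktrace, here is a minimal sketch of how frames in roughly this format could be produced. It is not the hack described above. The helper names (format_frame, frame_callback, error_callback, print_backtrace) and the USE_LIBBACKTRACE switch are made up for illustration; backtrace_create_state() and backtrace_full() are the library's own entry points. The libbacktrace part is only compiled when built with -DUSE_LIBBACKTRACE and linked with -lbacktrace.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: render one frame as "[0xADDR] function: file:line",
 * substituting "[unknown]" where no debug information was found.
 * Returns the length snprintf() reports.
 */
static int
format_frame(char *buf, size_t buflen, uintptr_t pc,
             const char *filename, int lineno, const char *function)
{
    return snprintf(buf, buflen, "[0x%" PRIxPTR "] %s: %s:%d",
                    pc,
                    function ? function : "[unknown]",
                    filename ? filename : "[unknown]",
                    lineno);
}

#ifdef USE_LIBBACKTRACE         /* assumed opt-in switch, not a real option */
#include <backtrace.h>

/* Called by libbacktrace once per frame; returning 0 continues the walk. */
static int
frame_callback(void *data, uintptr_t pc,
               const char *filename, int lineno, const char *function)
{
    char        line[1024];

    format_frame(line, sizeof(line), pc, filename, lineno, function);
    fprintf((FILE *) data, "    %s\n", line);
    return 0;
}

/* Called by libbacktrace when it fails, e.g. cannot read debug info. */
static void
error_callback(void *data, const char *msg, int errnum)
{
    fprintf((FILE *) data, "    backtrace error: %s (%d)\n", msg, errnum);
}

static void
print_backtrace(FILE *out)
{
    /* Created once and reused; libbacktrace offers no way to free it. */
    static struct backtrace_state *state = NULL;

    if (state == NULL)
        state = backtrace_create_state(NULL, 0, error_callback, out);
    if (state != NULL)
        backtrace_full(state, 1, frame_callback, error_callback, out);
}
#endif
```

Without debug information the callback receives NULL for the file and function names, which is where the "[unknown]: [unknown]:0" frame in the output above would come from in a sketch like this.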

Nice things about libbacktrace are that the generation of stack traces is
documented to be async signal safe on most platforms (with a #define to figure
that out, and a more minimal safe version always available) and that it
supports a wide range of platforms:

https://github.com/ianlancetaylor/libbacktrace
  As of October 2020, libbacktrace supports ELF, PE/COFF, Mach-O, and XCOFF
  executables with DWARF debugging information. In other words, it supports
  GNU/Linux, *BSD, macOS, Windows, and AIX. The library is written to make it
  straightforward to add support for other object file and debugging formats.
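
As a rough illustration of the "#define to figure that out" part: libbacktrace ships a backtrace-supported.h header with BACKTRACE_SUPPORTED and BACKTRACE_USES_MALLOC, and as far as the header comments go, backtrace_simple() never calls malloc. The sketch below only inspects those macros; the function name full_backtrace_signal_safe is hypothetical, and the __has_include guard is an assumption so that the snippet also compiles where the header is absent.

```c
#if defined(__has_include)
#  if __has_include(<backtrace-supported.h>)
#    include <backtrace-supported.h>
#  endif
#endif

/*
 * Hypothetical helper.  Returns
 *   1  if the symbolizing calls are expected to avoid malloc,
 *   0  if only the minimal backtrace_simple() should be used in a
 *      signal handler,
 *  -1  if libbacktrace is missing or reports itself as unsupported.
 */
static int
full_backtrace_signal_safe(void)
{
#if !defined(BACKTRACE_SUPPORTED) || !BACKTRACE_SUPPORTED
    return -1;
#elif BACKTRACE_USES_MALLOC
    return 0;
#else
    return 1;
#endif
}
```

Since the answer is fixed at compile time, a caller could equally select the code path with #if instead of a runtime check.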


The state I currently have is very hacky, but if there's interest in
upstreaming something like this, I could clean it up.

Greetings,

Andres Freund



Re: Optionally using a better backtrace library?

From
Pavel Stehule
Date:


On Sun, 2 Jul 2023 at 20:32, Andres Freund <andres@anarazel.de> wrote:
> The state I currently have is very hacky, but if there's interest in
> upstreaming something like this, I could clean it up.

Looks nice

+1

Pavel




Re: Optionally using a better backtrace library?

From
Joe Conway
Date:
On 7/2/23 14:31, Andres Freund wrote:
> Nice things about libbacktrace are that the generation of stack traces is
> documented to be async signal safe on most platforms (with a #define to figure
> that out, and a more minimal safe version always available) and that it
> supports a wide range of platforms:
> 
> https://github.com/ianlancetaylor/libbacktrace
>    As of October 2020, libbacktrace supports ELF, PE/COFF, Mach-O, and XCOFF
>    executables with DWARF debugging information. In other words, it supports
>    GNU/Linux, *BSD, macOS, Windows, and AIX. The library is written to make it
>    straightforward to add support for other object file and debugging formats.
> 
> 
> The state I currently have is very hacky, but if there's interest in
> upstreaming something like this, I could clean it up.

+1
Seems useful!

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Optionally using a better backtrace library?

From
Kyotaro Horiguchi
Date:
At Sun, 2 Jul 2023 11:31:56 -0700, Andres Freund <andres@anarazel.de> wrote in 
> The state I currently have is very hacky, but if there's interest in
> upstreaming something like this, I could clean it up.

I can't help voting +1.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Optionally using a better backtrace library?

From
Alvaro Herrera
Date:
Hello,

On 2023-Jul-02, Andres Freund wrote:

> I like that we now have a builtin backtrace ability. Unfortunately I think the
> backtraces are often not very useful, because only externally visible
> functions are symbolized.

Agreed, these backtraces are pretty close to useless.  Not completely,
but I haven't found a practical way to use them for actual debugging
of production problems.

> I hacked it up for ereport() to debug something, and the backtraces are
> considerably better:
> 
> 2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] LOG:  will crash
> 2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] BACKTRACE:
>     [0x55fcd03e6143] PostgresMain: ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4126
>     [0x55fcd031154c] BackendRun: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
>     [0x55fcd0310dd8] BackendStartup: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
>     [0x55fcd030ce75] ServerLoop: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779

Yeah, this looks much more usable.

> Nice things about libbacktrace are that the generation of stack traces is
> documented to be async signal safe on most platforms (with a #define to figure
> that out, and a more minimal safe version always available) and that it
> supports a wide range of platforms:

Sadly, it looks like the library is seldom distributed.  For example,
Debian seems to only have a package called android-libbacktrace which I
imagine is not what we want.  On my system I see a static library only
-- is that enough?  That file is part of package libgcc-10-dev, which
tells me that we can't depend on that for packaging purposes.

I think it's pretty much the same on the RPM side of the world.

So the only way to get this into customer systems would be to include
the library in our packages.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)



Re: Optionally using a better backtrace library?

From
David Steele
Date:
On 7/3/23 11:58, Alvaro Herrera wrote:
> 
>> Nice things about libbacktrace are that the generation of stack traces is
>> documented to be async signal safe on most platforms (with a #define to figure
>> that out, and a more minimal safe version always available) and that it
>> supports a wide range of platforms:
> 
> Sadly, it looks like the library is seldom distributed.  For example,
> Debian seems to only have a package called android-libbacktrace which I
> imagine is not what we want.  On my system I see a static library only
> -- is that enough?  That file is part of package libgcc-10-dev, which
> tells me that we can't depend on that for packaging purposes.

It would be a pretty big win even if the improved backtrace is only 
available in dev environments -- this is what pgBackRest currently does.

We are also considering adding this library to production builds but 
have not pulled the trigger on that yet since we are a bit worried about 
possible performance impact and have not had time to benchmark.

Regards,
-David



Re: Optionally using a better backtrace library?

From
Peter Eisentraut
Date:
On 02.07.23 20:31, Andres Freund wrote:
> A lot of platforms have "libbacktrace" available, e.g. as part of gcc. I think
> we should consider using it, when available, to produce more useful
> backtraces.
> 
> I hacked it up for ereport() to debug something, and the backtraces are
> considerably better:

Makes sense.  When we first added backtrace support, we considered 
libunwind, which didn't really give better backtraces than the built-in 
stuff, so it wasn't worth dealing with an additional dependency.




Re: Optionally using a better backtrace library?

From
Andres Freund
Date:
Hi,

On 2023-07-03 11:58:25 +0200, Alvaro Herrera wrote:
> On 2023-Jul-02, Andres Freund wrote:
> > I like that we now have a builtin backtrace ability. Unfortunately I think the
> > backtraces are often not very useful, because only externally visible
> > functions are symbolized.
> 
> Agreed, these backtraces are pretty close to useless.  Not completely,
> but I haven't found a practical way to use them for actual debugging
> of production problems.

Yeah. And I've grown pretty tired of asking people to break out gdb in
production scenarios :/


> > Nice things about libbacktrace are that the generation of stack traces is
> > documented to be async signal safe on most platforms (with a #define to figure
> > that out, and a more minimal safe version always available) and that it
> > supports a wide range of platforms:
> 
> Sadly, it looks like the library is seldom distributed.

It's often distributed as part of gcc.


> For example, Debian seems to only have a package called android-libbacktrace
> which I imagine is not what we want.

Indeed not.


> On my system I see a static library only -- is that enough?  That file is
> part of package libgcc-10-dev, which tells me that we can't depend on that
> for packaging purposes.

We should be able to rely on gcc-NN depending on libgcc-NN-dev; it contains
all the compiler-version-specific stuff. It's where the intrinsics headers,
C runtime initialization, and sanitizer libraries all live.  clang will
typically also depend on libgcc-NN-dev on unixoid systems.

And since it's statically linked (and apparently needs to be), you don't need
libgcc-NN-dev installed at runtime.


> I think it's pretty much the same in the RPM side of the world.

I don't know much about that side of the world...

Greetings,

Andres Freund



Re: Optionally using a better backtrace library?

From
"Tristan Partin"
Date:
On Mon Jul 3, 2023 at 12:43 PM CDT, Andres Freund wrote:
> On 2023-07-03 11:58:25 +0200, Alvaro Herrera wrote:
> > On 2023-Jul-02, Andres Freund wrote:
> > > Nice things about libbacktrace are that the generation of stack traces is
> > > documented to be async signal safe on most platforms (with a #define to figure
> > > that out, and a more minimal safe version always available) and that it
> > > supports a wide range of platforms:
> >
> > Sadly, it looks like the library is seldom distributed.
>
> It's often distributed as part of gcc.
>
>
> > For example, Debian seems to only have a package called android-libbacktrace
> > which I imagine is not what we want.
>
> Indeed not.
>
>
> > On my system I see a static library only -- is that enough?  That file is
> > part of package libgcc-10-dev, which tells me that we can't depend on that
> > for packaging purposes.
>
> We should be able to depend on that gcc-NN depends on libgcc-NN-dev, it
> contains all the compiler version specific stuff. It's where the intrinsics
> headers, C runtime initialization, sanitizer libraries all live.  clang will
> typically also depend on libgcc-NN-dev on unixoid systems.
>
> And since it's statically linked (and needs to be apparently), you don't need
> libgcc-NN-dev installed at runtime.
>
>
> > I think it's pretty much the same in the RPM side of the world.
>
> I don't know much about that side of the world...

I could not find this packaged in Fedora. I did find it in FreeBSD
however. We could add libbacktrace as a Meson subproject.

--
Tristan Partin
Neon (https://neon.tech)



Re: Optionally using a better backtrace library?

From
Noah Misch
Date:
On Mon, Jul 03, 2023 at 11:58:25AM +0200, Alvaro Herrera wrote:
> On 2023-Jul-02, Andres Freund wrote:
> > I like that we now have a builtin backtrace ability. Unfortunately I think the
> > backtraces are often not very useful, because only externally visible
> > functions are symbolized.
> 
> Agreed, these backtraces are pretty close to useless.  Not completely,
> but I haven't found a practical way to use them for actual debugging
> of production problems.

For what it's worth, I use the attached script to convert the current
errbacktrace output to a fully-symbolized backtrace.  Nonetheless, ...

> > I hacked it up for ereport() to debug something, and the backtraces are
> > considerably better:
> > 
> > 2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] LOG:  will crash
> > 2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] BACKTRACE:
> >     [0x55fcd03e6143] PostgresMain: ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4126
> >     [0x55fcd031154c] BackendRun: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
> >     [0x55fcd0310dd8] BackendStartup: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
> >     [0x55fcd030ce75] ServerLoop: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779
> 
> Yeah, this looks much more usable.

... +1 for offering this.
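
The attached script is not reproduced in this archive. Purely as an illustration of the first step such a conversion needs, the sketch below pulls the symbol, offset and absolute address out of one line of the current errbacktrace output. ParsedFrame and parse_frame are hypothetical names, and this is not the attached script. For an unsymbolized frame such as "(+0x806add)", the offset could then presumably be passed to a tool like addr2line together with the path of the binary, which has to be known separately because the line shows the process title rather than a file name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the pieces of one errbacktrace line. */
typedef struct
{
    char        symbol[128];        /* empty if the frame was not symbolized */
    unsigned long long offset;      /* hex value following '+' */
    unsigned long long address;     /* absolute address inside [...] */
} ParsedFrame;

/*
 * Parse a line of the form "object(symbol+0xOFF) [0xADDR]".
 * Returns 1 on success, 0 if the line does not look like a frame.
 */
static int
parse_frame(const char *line, ParsedFrame *out)
{
    const char *lbracket = strrchr(line, '[');
    const char *lparen = strrchr(line, '(');
    const char *plus;
    size_t      symlen;

    /* The "(symbol+offset)" group must come before the final "[addr]". */
    if (lbracket == NULL || lparen == NULL || lparen > lbracket)
        return 0;

    plus = strchr(lparen, '+');
    if (plus == NULL || plus > lbracket)
        return 0;

    /* Everything between '(' and '+' is the symbol name, possibly empty. */
    symlen = (size_t) (plus - lparen - 1);
    if (symlen >= sizeof(out->symbol))
        return 0;
    memcpy(out->symbol, lparen + 1, symlen);
    out->symbol[symlen] = '\0';

    /* Base 16 accepts the "0x" prefix; parsing stops at ')' and ']'. */
    out->offset = strtoull(plus + 1, NULL, 16);
    out->address = strtoull(lbracket + 1, NULL, 16);
    return 1;
}
```

Searching from the end of the line keeps the "[local]" in the process title from being mistaken for the address.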


Re: Optionally using a better backtrace library?

From
Alvaro Herrera
Date:
On 2023-Sep-04, Noah Misch wrote:

> On Mon, Jul 03, 2023 at 11:58:25AM +0200, Alvaro Herrera wrote:

> > Agreed, these backtraces are pretty close to useless.  Not completely,
> > but I haven't found a practical way to use them for actual debugging
> > of production problems.
> 
> For what it's worth, I use the attached script to convert the current
> errbacktrace output to a fully-symbolized backtrace.

Much appreciated!  I can put this to good use.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Optionally using a better backtrace library?

From
Peter Geoghegan
Date:
On Tue, Sep 5, 2023 at 2:59 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Much appreciated!  I can put this to good use.

I was just reminded of how our existing backtrace support is lacklustre.

Are you planning on submitting a patch for this?

--
Peter Geoghegan