Thread: Optionally using a better backtrace library?
Hi,

I like that we now have a builtin backtrace ability. Unfortunately I think the
backtraces are often not very useful, because only externally visible
functions are symbolized.

E.g.:

2023-07-02 10:54:01.756 PDT [1398494][client backend][:0][[unknown]] LOG: will crash
2023-07-02 10:54:01.756 PDT [1398494][client backend][:0][[unknown]] BACKTRACE:
postgres: dev assert: andres postgres [local] initializing(errbacktrace+0xbb) [0x562a44c97ca9]
postgres: dev assert: andres postgres [local] initializing(PostgresMain+0xb6) [0x562a44ac56d4]
postgres: dev assert: andres postgres [local] initializing(+0x806add) [0x562a449f0add]
postgres: dev assert: andres postgres [local] initializing(+0x806369) [0x562a449f0369]
postgres: dev assert: andres postgres [local] initializing(+0x802406) [0x562a449ec406]
postgres: dev assert: andres postgres [local] initializing(PostmasterMain+0x1676) [0x562a449ebd17]
postgres: dev assert: andres postgres [local] initializing(+0x6ec2e2) [0x562a448d62e2]
/lib/x86_64-linux-gnu/libc.so.6(+0x276ca) [0x7f1e820456ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f1e82045785]
postgres: dev assert: andres postgres [local] initializing(_start+0x21) [0x562a445ede21]

which is far from as useful as it could be.

A lot of platforms have "libbacktrace" available, e.g. as part of gcc. I think
we should consider using it, when available, to produce more useful
backtraces.

I hacked it up for ereport() to debug something, and the backtraces are
considerably better:

2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] LOG: will crash
2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] BACKTRACE:
[0x55fcd03e6143] PostgresMain: ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4126
[0x55fcd031154c] BackendRun: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
[0x55fcd0310dd8] BackendStartup: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
[0x55fcd030ce75] ServerLoop: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779
[0x55fcd030c786] PostmasterMain: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1463
[0x55fcd01f6d51] main: ../../../../home/andres/src/postgresql/src/backend/main/main.c:198
[0x7fdd914456c9] __libc_start_call_main: ../sysdeps/nptl/libc_start_call_main.h:58
[0x7fdd91445784] __libc_start_main_impl: ../csu/libc-start.c:360
[0x55fccff0e890] [unknown]: [unknown]:0

The way each frame looks is my fault, not libbacktrace's...

Nice things about libbacktrace are that the generation of stack traces is
documented to be async signal safe on most platforms (with a #define to figure
that out, and a more minimal safe version always available) and that it
supports a wide range of platforms:

https://github.com/ianlancetaylor/libbacktrace

  As of October 2020, libbacktrace supports ELF, PE/COFF, Mach-O, and XCOFF
  executables with DWARF debugging information. In other words, it supports
  GNU/Linux, *BSD, macOS, Windows, and AIX. The library is written to make it
  straightforward to add support for other object file and debugging formats.

The state I currently have is very hacky, but if there's interest in
upstreaming something like this, I could clean it up.

Greetings,

Andres Freund
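For illustration only (this is not the patch described above): a minimal C sketch of how libbacktrace can emit frames in the "[pc] function: file:line" shape of the second log. backtrace_create_state() and backtrace_full() are the library's entry points declared in backtrace.h; the USE_LIBBACKTRACE macro, print_backtrace() and struct trace_ctx are hypothetical names chosen for this sketch. Without the macro it falls back to glibc's backtrace_symbols(), which produces the less useful first format. Because it calls fprintf(), the sketch makes no attempt to be async-signal-safe.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#ifdef USE_LIBBACKTRACE			/* hypothetical build-time switch */
#include <backtrace.h>

/* Hypothetical per-call context handed to the callbacks. */
struct trace_ctx
{
	FILE	   *out;
	int			frames;
};

/* Called by libbacktrace when it cannot read debug info, etc. */
static void
bt_error_cb(void *data, const char *msg, int errnum)
{
	struct trace_ctx *ctx = data;

	fprintf(ctx->out, "libbacktrace error: %s (%d)\n", msg, errnum);
}

/* Called once per frame; function/filename may be NULL if unknown. */
static int
bt_frame_cb(void *data, uintptr_t pc, const char *filename, int lineno,
			const char *function)
{
	struct trace_ctx *ctx = data;

	fprintf(ctx->out, "[0x%" PRIxPTR "] %s: %s:%d\n", pc,
			function ? function : "[unknown]",
			filename ? filename : "[unknown]", lineno);
	ctx->frames++;
	return 0;					/* 0 means: keep walking the stack */
}

/* Print the current stack to "out"; returns the number of frames. */
int
print_backtrace(FILE *out)
{
	static struct backtrace_state *state = NULL;
	struct trace_ctx ctx = {out, 0};

	/* NULL filename: let the library locate the executable itself. */
	if (state == NULL)
		state = backtrace_create_state(NULL, 0, bt_error_cb, &ctx);
	if (state == NULL)
		return 0;

	backtrace_full(state, 0, bt_frame_cb, bt_error_cb, &ctx);
	return ctx.frames;
}

#else							/* fallback: glibc's <execinfo.h> */
#include <execinfo.h>
#include <stdlib.h>

/* Print the current stack to "out"; returns the number of frames. */
int
print_backtrace(FILE *out)
{
	void	   *pcs[64];
	int			n = backtrace(pcs, 64);
	char	  **syms = backtrace_symbols(pcs, n);

	if (syms == NULL)
		return 0;
	for (int i = 0; i < n; i++)
		fprintf(out, "%s\n", syms[i]);
	free(syms);					/* one allocation holds all strings */
	return n;
}
#endif
```

A caller would invoke print_backtrace(stderr) at the point of interest. The libbacktrace branch would typically need -DUSE_LIBBACKTRACE and -lbacktrace on the command line, plus a binary built with DWARF debug information (-g) for file and line numbers to appear.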
On Sun, Jul 2, 2023 at 20:32, Andres Freund <andres@anarazel.de> wrote:
> The state I currently have is very hacky, but if there's interest in
> upstreaming something like this, I could clean it up.

Looks nice

+1

Pavel

On 7/2/23 14:31, Andres Freund wrote:
> Nice things about libbacktrace are that the generation of stack traces is
> documented to be async signal safe on most platforms (with a #define to figure
> that out, and a more minimal safe version always available) and that it
> supports a wide range of platforms:
>
> https://github.com/ianlancetaylor/libbacktrace
> As of October 2020, libbacktrace supports ELF, PE/COFF, Mach-O, and XCOFF
> executables with DWARF debugging information. In other words, it supports
> GNU/Linux, *BSD, macOS, Windows, and AIX. The library is written to make it
> straightforward to add support for other object file and debugging formats.
>
>
> The state I currently have is very hacky, but if there's interest in
> upstreaming something like this, I could clean it up.

+1
Seems useful!

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

At Sun, 2 Jul 2023 11:31:56 -0700, Andres Freund <andres@anarazel.de> wrote in
> The state I currently have is very hacky, but if there's interest in
> upstreaming something like this, I could clean it up.

I can't help voting +1.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Hello,

On 2023-Jul-02, Andres Freund wrote:
> I like that we now have a builtin backtrace ability. Unfortunately I think the
> backtraces are often not very useful, because only externally visible
> functions are symbolized.

Agreed, these backtraces are pretty close to useless. Not completely,
but I haven't found a practical way to use them for actual debugging
of production problems.

> I hacked it up for ereport() to debug something, and the backtraces are
> considerably better:
>
> 2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] LOG: will crash
> 2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] BACKTRACE:
> [0x55fcd03e6143] PostgresMain: ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4126
> [0x55fcd031154c] BackendRun: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
> [0x55fcd0310dd8] BackendStartup: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
> [0x55fcd030ce75] ServerLoop: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779

Yeah, this looks much more usable.

> Nice things about libbacktrace are that the generation of stack traces is
> documented to be async signal safe on most platforms (with a #define to figure
> that out, and a more minimal safe version always available) and that it
> supports a wide range of platforms:

Sadly, it looks like the library is seldom distributed. For example,
Debian seems to only have a package called android-libbacktrace which I
imagine is not what we want. On my system I see a static library only
-- is that enough? That file is part of package libgcc-10-dev, which
tells me that we can't depend on that for packaging purposes. I think
it's pretty much the same in the RPM side of the world.

So the only way to get this into customer systems would be to include
the library in our packages.

--
Álvaro Herrera PostgreSQL Developer - https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem." (Tom Lane)

On 7/3/23 11:58, Alvaro Herrera wrote:
>
>> Nice things about libbacktrace are that the generation of stack traces is
>> documented to be async signal safe on most platforms (with a #define to figure
>> that out, and a more minimal safe version always available) and that it
>> supports a wide range of platforms:
>
> Sadly, it looks like the library is seldom distributed. For example,
> Debian seems to only have a package called android-libbacktrace which I
> imagine is not what we want. On my system I see a static library only
> -- is that enough? That file is part of package libgcc-10-dev, which
> tells me that we can't depend on that for packaging purposes.

It would be a pretty big win even if the improved backtrace is only
available in dev environments -- this is what pgBackRest currently does.

We are also considering adding this library to production builds but
have not pulled the trigger on that yet since we are a bit worried about
possible performance impact and have not had time to benchmark.

Regards,
-David

On 02.07.23 20:31, Andres Freund wrote:
> A lot of platforms have "libbacktrace" available, e.g. as part of gcc. I think
> we should consider using it, when available, to produce more useful
> backtraces.
>
> I hacked it up for ereport() to debug something, and the backtraces are
> considerably better:

Makes sense. When we first added backtrace support, we considered
libunwind, which didn't really give better backtraces than the built-in
stuff, so it wasn't worth dealing with an additional dependency.

Hi,

On 2023-07-03 11:58:25 +0200, Alvaro Herrera wrote:
> On 2023-Jul-02, Andres Freund wrote:
> > I like that we now have a builtin backtrace ability. Unfortunately I think the
> > backtraces are often not very useful, because only externally visible
> > functions are symbolized.
>
> Agreed, these backtraces are pretty close to useless. Not completely,
> but I haven't found a practical way to use them for actual debugging
> of production problems.

Yea. And I've grown pretty tired asking people to break out gdb in production
scenarios :/

> > Nice things about libbacktrace are that the generation of stack traces is
> > documented to be async signal safe on most platforms (with a #define to figure
> > that out, and a more minimal safe version always available) and that it
> > supports a wide range of platforms:
>
> Sadly, it looks like the library is seldom distributed.

It's often distributed as part of gcc.

> For example, Debian seems to only have a package called android-libbacktrace
> which I imagine is not what we want.

Indeed not.

> On my system I see a static library only -- is that enough? That file is
> part of package libgcc-10-dev, which tells me that we can't depend on that
> for packaging purposes.

We should be able to depend on that gcc-NN depends on libgcc-NN-dev, it
contains all the compiler version specific stuff. It's where the intrinsics
headers, C runtime initialization, sanitizer libraries all live. clang will
typically also depend on libgcc-NN-dev on unixoid systems.

And since it's statically linked (and needs to be apparently), you don't need
libgcc-NN-dev installed at runtime.

> I think it's pretty much the same in the RPM side of the world.

I don't know much about that side of the world...

Greetings,

Andres Freund

On Mon Jul 3, 2023 at 12:43 PM CDT, Andres Freund wrote:
> On 2023-07-03 11:58:25 +0200, Alvaro Herrera wrote:
> > On 2023-Jul-02, Andres Freund wrote:
> > > Nice things about libbacktrace are that the generation of stack traces is
> > > documented to be async signal safe on most platforms (with a #define to figure
> > > that out, and a more minimal safe version always available) and that it
> > > supports a wide range of platforms:
> >
> > Sadly, it looks like the library is seldom distributed.
>
> It's often distributed as part of gcc.
>
>
> > For example, Debian seems to only have a package called android-libbacktrace
> > which I imagine is not what we want.
>
> Indeed not.
>
>
> > On my system I see a static library only -- is that enough? That file is
> > part of package libgcc-10-dev, which tells me that we can't depend on that
> > for packaging purposes.
>
> We should be able to depend on that gcc-NN depends on libgcc-NN-dev, it
> contains all the compiler version specific stuff. It's where the intrinsics
> headers, C runtime initialization, sanitizer libraries all live. clang will
> typically also depend on libgcc-NN-dev on unixoid systems.
>
> And since it's statically linked (and needs to be apparently), you don't need
> libgcc-NN-dev installed at runtime.
>
>
> > I think it's pretty much the same in the RPM side of the world.
>
> I don't know much about that side of the world...

I could not find this packaged in Fedora. I did find it in FreeBSD
however. We could add libbacktrace as a Meson subproject.

--
Tristan Partin
Neon (https://neon.tech)

On Mon, Jul 03, 2023 at 11:58:25AM +0200, Alvaro Herrera wrote:
> On 2023-Jul-02, Andres Freund wrote:
> > I like that we now have a builtin backtrace ability. Unfortunately I think the
> > backtraces are often not very useful, because only externally visible
> > functions are symbolized.
>
> Agreed, these backtraces are pretty close to useless. Not completely,
> but I haven't found a practical way to use them for actual debugging
> of production problems.

For what it's worth, I use the attached script to convert the current
errbacktrace output to a fully-symbolized backtrace. Nonetheless, ...

> > I hacked it up for ereport() to debug something, and the backtraces are
> > considerably better:
> >
> > 2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] LOG: will crash
> > 2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] BACKTRACE:
> > [0x55fcd03e6143] PostgresMain: ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4126
> > [0x55fcd031154c] BackendRun: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
> > [0x55fcd0310dd8] BackendStartup: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
> > [0x55fcd030ce75] ServerLoop: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779
>
> Yeah, this looks much more usable.

... +1 for offering this.
Attachment
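
The attached script is not reproduced in this archive. As a rough illustration of the first step any such conversion needs, here is a C sketch that splits one line of the current backtrace output into its parts. FrameInfo and parse_frame() are hypothetical names, and the "module(symbol+0xoffset) [0xaddress]" layout is assumed from the log lines earlier in the thread; nothing here is taken from the actual script.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the pieces of one backtrace line. */
typedef struct
{
	char		module[256];	/* text before '(' */
	char		symbol[128];	/* empty if the frame was not symbolized */
	unsigned long long offset;	/* value after '+' */
	unsigned long long address; /* value inside the trailing [...] */
} FrameInfo;

/*
 * Parse a line shaped like "module(symbol+0xoff) [0xaddr]".
 * Returns false if the line does not match that shape.
 */
bool
parse_frame(const char *line, FrameInfo *out)
{
	/* The address is in the last [...] on the line. */
	const char *lbracket = strrchr(line, '[');

	if (lbracket == NULL)
		return false;

	/* Walk back to the last '(' before the address; the module text
	 * itself may contain spaces and brackets, e.g. "[local]". */
	const char *lparen = lbracket;

	while (lparen > line && *lparen != '(')
		lparen--;
	if (*lparen != '(')
		return false;

	const char *rparen = memchr(lparen, ')', (size_t) (lbracket - lparen));

	if (rparen == NULL)
		return false;

	/* The offset follows the last '+' inside the parentheses. */
	const char *plus = rparen;

	while (plus > lparen && *plus != '+')
		plus--;
	if (*plus != '+')
		return false;

	size_t		modlen = (size_t) (lparen - line);
	size_t		symlen = (size_t) (plus - lparen - 1);

	if (modlen >= sizeof(out->module) || symlen >= sizeof(out->symbol))
		return false;

	memcpy(out->module, line, modlen);
	out->module[modlen] = '\0';
	memcpy(out->symbol, lparen + 1, symlen);
	out->symbol[symlen] = '\0';

	/* Base 16 accepts the leading "0x"; parsing stops at ')' or ']'. */
	out->offset = strtoull(plus + 1, NULL, 16);
	out->address = strtoull(lbracket + 1, NULL, 16);
	return true;
}
```

For frames with an empty symbol, such as "(+0x806add)", the offset appears to be relative to the start of the module, so it could be handed to a tool like addr2line together with the path of the binary. Frames of the form "(PostgresMain+0xb6)" carry an offset relative to the named symbol instead, so the symbol's own address would have to be looked up first. Note also that in the log above the module field holds the process title rather than a file path, so the path of the postgres binary would have to be supplied separately.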

On 2023-Sep-04, Noah Misch wrote:
> On Mon, Jul 03, 2023 at 11:58:25AM +0200, Alvaro Herrera wrote:
> > Agreed, these backtraces are pretty close to useless. Not completely,
> > but I haven't found a practical way to use them for actual debugging
> > of production problems.
>
> For what it's worth, I use the attached script to convert the current
> errbacktrace output to a fully-symbolized backtrace.

Much appreciated! I can put this to good use.

--
Álvaro Herrera PostgreSQL Developer - https://www.EnterpriseDB.com/

On Tue, Sep 5, 2023 at 2:59 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Much appreciated! I can put this to good use.

I was just reminded of how our existing backtrace support is lacklustre.
Are you planning on submitting a patch for this?

--
Peter Geoghegan