Optionally using a better backtrace library?

Started by Andres Freund over 2 years ago, 12 messages
#1 Andres Freund
andres@anarazel.de

Hi,

I like that we now have a builtin backtrace ability. Unfortunately I think the
backtraces are often not very useful, because only externally visible
functions are symbolized.

E.g.:

2023-07-02 10:54:01.756 PDT [1398494][client backend][:0][[unknown]] LOG: will crash
2023-07-02 10:54:01.756 PDT [1398494][client backend][:0][[unknown]] BACKTRACE:
postgres: dev assert: andres postgres [local] initializing(errbacktrace+0xbb) [0x562a44c97ca9]
postgres: dev assert: andres postgres [local] initializing(PostgresMain+0xb6) [0x562a44ac56d4]
postgres: dev assert: andres postgres [local] initializing(+0x806add) [0x562a449f0add]
postgres: dev assert: andres postgres [local] initializing(+0x806369) [0x562a449f0369]
postgres: dev assert: andres postgres [local] initializing(+0x802406) [0x562a449ec406]
postgres: dev assert: andres postgres [local] initializing(PostmasterMain+0x1676) [0x562a449ebd17]
postgres: dev assert: andres postgres [local] initializing(+0x6ec2e2) [0x562a448d62e2]
/lib/x86_64-linux-gnu/libc.so.6(+0x276ca) [0x7f1e820456ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f1e82045785]
postgres: dev assert: andres postgres [local] initializing(_start+0x21) [0x562a445ede21]

which is far from as useful as it could be.
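
For context, frames in this shape are what glibc's backtrace() and
backtrace_symbols() produce: names are resolved through the dynamic symbol
table, so a function that is not exported shows up only as module(+0xoffset).
A minimal sketch, assuming a glibc-style <execinfo.h>; capture_and_print() and
MAX_FRAMES are hypothetical names used for illustration, not PostgreSQL code:

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 64			/* hypothetical limit, for illustration only */

/*
 * Capture the current call stack and print one line per frame.
 *
 * This function is static, i.e. not externally visible, so
 * backtrace_symbols() will typically not be able to name it and prints
 * something like "module(+0x1234) [0x...]" for its frame instead.
 *
 * Returns the number of frames printed, or -1 on allocation failure.
 */
static int
capture_and_print(FILE *out)
{
	void	   *frames[MAX_FRAMES];
	int			nframes = backtrace(frames, MAX_FRAMES);
	char	  **symbols = backtrace_symbols(frames, nframes);

	if (symbols == NULL)
		return -1;

	for (int i = 0; i < nframes; i++)
		fprintf(out, "%s\n", symbols[i]);

	free(symbols);				/* one allocation holds all the strings */
	return nframes;
}
```

Calling capture_and_print(stderr) from a small test program should show the
effect: exported functions get names, static ones only offsets.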

A lot of platforms have "libbacktrace" available, e.g. as part of gcc. I think
we should consider using it, when available, to produce more useful
backtraces.

I hacked it up for ereport() to debug something, and the backtraces are
considerably better:

2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] LOG: will crash
2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] BACKTRACE:
[0x55fcd03e6143] PostgresMain: ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4126
[0x55fcd031154c] BackendRun: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
[0x55fcd0310dd8] BackendStartup: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
[0x55fcd030ce75] ServerLoop: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779
[0x55fcd030c786] PostmasterMain: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1463
[0x55fcd01f6d51] main: ../../../../home/andres/src/postgresql/src/backend/main/main.c:198
[0x7fdd914456c9] __libc_start_call_main: ../sysdeps/nptl/libc_start_call_main.h:58
[0x7fdd91445784] __libc_start_main_impl: ../csu/libc-start.c:360
[0x55fccff0e890] [unknown]: [unknown]:0

The way each frame looks is my fault, not libbacktrace's...
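
To make the idea concrete, here is a rough sketch of how frames in the format
above could be produced with libbacktrace's backtrace_create_state() and
backtrace_full(). It is not the hack described in this mail. format_frame(),
frame_cb(), error_cb(), print_backtrace() and the USE_LIBBACKTRACE switch are
all hypothetical names; with the switch defined, the program would need to be
linked against libbacktrace (typically -lbacktrace).

```c
#include <stdint.h>
#include <stdio.h>

#ifdef USE_LIBBACKTRACE			/* hypothetical build-time switch */
#include <backtrace.h>
#endif

/*
 * Format one frame as "[0xPC] function: file:line". Missing debug
 * information is rendered as "[unknown]".
 */
static int
format_frame(char *buf, size_t buflen, uintptr_t pc,
			 const char *filename, int lineno, const char *function)
{
	return snprintf(buf, buflen, "[0x%jx] %s: %s:%d",
					(uintmax_t) pc,
					function ? function : "[unknown]",
					filename ? filename : "[unknown]",
					lineno);
}

#ifdef USE_LIBBACKTRACE
/* Called by backtrace_full() once per frame; returning 0 keeps walking. */
static int
frame_cb(void *data, uintptr_t pc, const char *filename, int lineno,
		 const char *function)
{
	char		line[1024];

	format_frame(line, sizeof(line), pc, filename, lineno, function);
	fprintf((FILE *) data, "%s\n", line);
	return 0;
}

/* Called by libbacktrace when it cannot read debug info, etc. */
static void
error_cb(void *data, const char *msg, int errnum)
{
	fprintf((FILE *) data, "libbacktrace error: %s (%d)\n", msg, errnum);
}

static void
print_backtrace(FILE *out)
{
	static struct backtrace_state *state = NULL;

	/* NULL filename: let the library locate the executable itself */
	if (state == NULL)
		state = backtrace_create_state(NULL, 0, error_cb, out);
	if (state != NULL)
		backtrace_full(state, 0, frame_cb, error_cb, out);
}
#endif
```

Keeping the formatting separate from the stack walk means the per-frame layout
can be changed without touching the libbacktrace calls.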

Nice things about libbacktrace are that the generation of stack traces is
documented to be async signal safe on most platforms (with a #define to figure
that out, and a more minimal safe version always available) and that it
supports a wide range of platforms:

https://github.com/ianlancetaylor/libbacktrace
As of October 2020, libbacktrace supports ELF, PE/COFF, Mach-O, and XCOFF
executables with DWARF debugging information. In other words, it supports
GNU/Linux, *BSD, macOS, Windows, and AIX. The library is written to make it
straightforward to add support for other object file and debugging formats.

The state I currently have is very hacky, but if there's interest in
upstreaming something like this, I could clean it up.

Greetings,

Andres Freund

#2 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andres Freund (#1)
Re: Optionally using a better backtrace library?

On Sun, Jul 2, 2023 at 20:32, Andres Freund <andres@anarazel.de> wrote:

Hi,

I like that we now have a builtin backtrace ability. Unfortunately I think the
backtraces are often not very useful, because only externally visible
functions are symbolized.

E.g.:

2023-07-02 10:54:01.756 PDT [1398494][client backend][:0][[unknown]] LOG: will crash
2023-07-02 10:54:01.756 PDT [1398494][client backend][:0][[unknown]] BACKTRACE:
postgres: dev assert: andres postgres [local] initializing(errbacktrace+0xbb) [0x562a44c97ca9]
postgres: dev assert: andres postgres [local] initializing(PostgresMain+0xb6) [0x562a44ac56d4]
postgres: dev assert: andres postgres [local] initializing(+0x806add) [0x562a449f0add]
postgres: dev assert: andres postgres [local] initializing(+0x806369) [0x562a449f0369]
postgres: dev assert: andres postgres [local] initializing(+0x802406) [0x562a449ec406]
postgres: dev assert: andres postgres [local] initializing(PostmasterMain+0x1676) [0x562a449ebd17]
postgres: dev assert: andres postgres [local] initializing(+0x6ec2e2) [0x562a448d62e2]
/lib/x86_64-linux-gnu/libc.so.6(+0x276ca) [0x7f1e820456ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f1e82045785]
postgres: dev assert: andres postgres [local] initializing(_start+0x21) [0x562a445ede21]

which is far from as useful as it could be.

A lot of platforms have "libbacktrace" available, e.g. as part of gcc. I think
we should consider using it, when available, to produce more useful
backtraces.

I hacked it up for ereport() to debug something, and the backtraces are
considerably better:

2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] LOG: will crash
2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] BACKTRACE:
[0x55fcd03e6143] PostgresMain: ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4126
[0x55fcd031154c] BackendRun: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
[0x55fcd0310dd8] BackendStartup: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
[0x55fcd030ce75] ServerLoop: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779
[0x55fcd030c786] PostmasterMain: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1463
[0x55fcd01f6d51] main: ../../../../home/andres/src/postgresql/src/backend/main/main.c:198
[0x7fdd914456c9] __libc_start_call_main: ../sysdeps/nptl/libc_start_call_main.h:58
[0x7fdd91445784] __libc_start_main_impl: ../csu/libc-start.c:360
[0x55fccff0e890] [unknown]: [unknown]:0

The way each frame looks is my fault, not libbacktrace's...

Nice things about libbacktrace are that the generation of stack traces is
documented to be async signal safe on most platforms (with a #define to figure
that out, and a more minimal safe version always available) and that it
supports a wide range of platforms:

https://github.com/ianlancetaylor/libbacktrace
As of October 2020, libbacktrace supports ELF, PE/COFF, Mach-O, and XCOFF
executables with DWARF debugging information. In other words, it supports
GNU/Linux, *BSD, macOS, Windows, and AIX. The library is written to make it
straightforward to add support for other object file and debugging formats.

The state I currently have is very hacky, but if there's interest in
upstreaming something like this, I could clean it up.

Looks nice

+1

Pavel

Greetings,

Andres Freund

#3 Joe Conway
mail@joeconway.com
In reply to: Andres Freund (#1)
Re: Optionally using a better backtrace library?

On 7/2/23 14:31, Andres Freund wrote:

Nice things about libbacktrace are that the generation of stack traces is
documented to be async signal safe on most platforms (with a #define to figure
that out, and a more minimal safe version always available) and that it
supports a wide range of platforms:

https://github.com/ianlancetaylor/libbacktrace
As of October 2020, libbacktrace supports ELF, PE/COFF, Mach-O, and XCOFF
executables with DWARF debugging information. In other words, it supports
GNU/Linux, *BSD, macOS, Windows, and AIX. The library is written to make it
straightforward to add support for other object file and debugging formats.

The state I currently have is very hacky, but if there's interest in
upstreaming something like this, I could clean it up.

+1
Seems useful!

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#4 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Andres Freund (#1)
Re: Optionally using a better backtrace library?

At Sun, 2 Jul 2023 11:31:56 -0700, Andres Freund <andres@anarazel.de> wrote in

The state I currently have is very hacky, but if there's interest in
upstreaming something like this, I could clean it up.

I can't help voting +1.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#5 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Andres Freund (#1)
Re: Optionally using a better backtrace library?

Hello,

On 2023-Jul-02, Andres Freund wrote:

I like that we now have a builtin backtrace ability. Unfortunately I think the
backtraces are often not very useful, because only externally visible
functions are symbolized.

Agreed, these backtraces are pretty close to useless. Not completely,
but I haven't found a practical way to use them for actual debugging
of production problems.

I hacked it up for ereport() to debug something, and the backtraces are
considerably better:

2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] LOG: will crash
2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] BACKTRACE:
[0x55fcd03e6143] PostgresMain: ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4126
[0x55fcd031154c] BackendRun: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
[0x55fcd0310dd8] BackendStartup: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
[0x55fcd030ce75] ServerLoop: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779

Yeah, this looks much more usable.

Nice things about libbacktrace are that the generation of stack traces is
documented to be async signal safe on most platforms (with a #define to figure
that out, and a more minimal safe version always available) and that it
supports a wide range of platforms:

Sadly, it looks like the library is seldom distributed. For example,
Debian seems to only have a package called android-libbacktrace which I
imagine is not what we want. On my system I see a static library only
-- is that enough? That file is part of package libgcc-10-dev, which
tells me that we can't depend on that for packaging purposes.

I think it's pretty much the same in the RPM side of the world.

So the only way to get this into customer systems would be to include
the library in our packages.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem." (Tom Lane)

#6 David Steele
david@pgmasters.net
In reply to: Alvaro Herrera (#5)
Re: Optionally using a better backtrace library?

On 7/3/23 11:58, Alvaro Herrera wrote:

Nice things about libbacktrace are that the generation of stack traces is
documented to be async signal safe on most platforms (with a #define to figure
that out, and a more minimal safe version always available) and that it
supports a wide range of platforms:

Sadly, it looks like the library is seldom distributed. For example,
Debian seems to only have a package called android-libbacktrace which I
imagine is not what we want. On my system I see a static library only
-- is that enough? That file is part of package libgcc-10-dev, which
tells me that we can't depend on that for packaging purposes.

It would be a pretty big win even if the improved backtrace is only
available in dev environments -- this is what pgBackRest currently does.

We are also considering adding this library to production builds but
have not pulled the trigger on that yet since we are a bit worried about
possible performance impact and have not had time to benchmark.

Regards,
-David

#7 Peter Eisentraut
peter@eisentraut.org
In reply to: Andres Freund (#1)
Re: Optionally using a better backtrace library?

On 02.07.23 20:31, Andres Freund wrote:

A lot of platforms have "libbacktrace" available, e.g. as part of gcc. I think
we should consider using it, when available, to produce more useful
backtraces.

I hacked it up for ereport() to debug something, and the backtraces are
considerably better:

Makes sense. When we first added backtrace support, we considered
libunwind, which didn't really give better backtraces than the built-in
stuff, so it wasn't worth dealing with an additional dependency.

#8 Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#5)
Re: Optionally using a better backtrace library?

Hi,

On 2023-07-03 11:58:25 +0200, Alvaro Herrera wrote:

On 2023-Jul-02, Andres Freund wrote:

I like that we now have a builtin backtrace ability. Unfortunately I think the
backtraces are often not very useful, because only externally visible
functions are symbolized.

Agreed, these backtraces are pretty close to useless. Not completely,
but I haven't found a practical way to use them for actual debugging
of production problems.

Yea. And I've grown pretty tired of asking people to break out gdb in
production scenarios :/

Nice things about libbacktrace are that the generation of stack traces is
documented to be async signal safe on most platforms (with a #define to figure
that out, and a more minimal safe version always available) and that it
supports a wide range of platforms:

Sadly, it looks like the library is seldom distributed.

It's often distributed as part of gcc.

For example, Debian seems to only have a package called android-libbacktrace
which I imagine is not what we want.

Indeed not.

On my system I see a static library only -- is that enough? That file is
part of package libgcc-10-dev, which tells me that we can't depend on that
for packaging purposes.

We should be able to rely on gcc-NN depending on libgcc-NN-dev, since that
package contains all the compiler-version-specific stuff: it's where the
intrinsics headers, C runtime initialization, and sanitizer libraries all
live. clang will typically also depend on libgcc-NN-dev on unixoid systems.

And since it's statically linked (and needs to be apparently), you don't need
libgcc-NN-dev installed at runtime.

I think it's pretty much the same in the RPM side of the world.

I don't know much about that side of the world...

Greetings,

Andres Freund

#9 Tristan Partin
tristan@neon.tech
In reply to: Andres Freund (#8)
Re: Optionally using a better backtrace library?

On Mon Jul 3, 2023 at 12:43 PM CDT, Andres Freund wrote:

On 2023-07-03 11:58:25 +0200, Alvaro Herrera wrote:

On 2023-Jul-02, Andres Freund wrote:

Nice things about libbacktrace are that the generation of stack traces is
documented to be async signal safe on most platforms (with a #define to figure
that out, and a more minimal safe version always available) and that it
supports a wide range of platforms:

Sadly, it looks like the library is seldom distributed.

It's often distributed as part of gcc.

For example, Debian seems to only have a package called android-libbacktrace
which I imagine is not what we want.

Indeed not.

On my system I see a static library only -- is that enough? That file is
part of package libgcc-10-dev, which tells me that we can't depend on that
for packaging purposes.

We should be able to rely on gcc-NN depending on libgcc-NN-dev, since that
package contains all the compiler-version-specific stuff: it's where the
intrinsics headers, C runtime initialization, and sanitizer libraries all
live. clang will typically also depend on libgcc-NN-dev on unixoid systems.

And since it's statically linked (and needs to be apparently), you don't need
libgcc-NN-dev installed at runtime.

I think it's pretty much the same in the RPM side of the world.

I don't know much about that side of the world...

I could not find this packaged in Fedora. I did find it in FreeBSD
however. We could add libbacktrace as a Meson subproject.

--
Tristan Partin
Neon (https://neon.tech)

#10 Noah Misch
noah@leadboat.com
In reply to: Alvaro Herrera (#5)
1 attachment(s)
Re: Optionally using a better backtrace library?

On Mon, Jul 03, 2023 at 11:58:25AM +0200, Alvaro Herrera wrote:

On 2023-Jul-02, Andres Freund wrote:

I like that we now have a builtin backtrace ability. Unfortunately I think the
backtraces are often not very useful, because only externally visible
functions are symbolized.

Agreed, these backtraces are pretty close to useless. Not completely,
but I haven't found a practical way to use them for actual debugging
of production problems.

For what it's worth, I use the attached script to convert the current
errbacktrace output to a fully-symbolized backtrace. Nonetheless, ...

I hacked it up for ereport() to debug something, and the backtraces are
considerably better:

2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] LOG: will crash
2023-07-02 10:52:54.863 PDT [1398207][client backend][:0][[unknown]] BACKTRACE:
[0x55fcd03e6143] PostgresMain: ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4126
[0x55fcd031154c] BackendRun: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
[0x55fcd0310dd8] BackendStartup: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
[0x55fcd030ce75] ServerLoop: ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779

Yeah, this looks much more usable.

... +1 for offering this.
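
The attached script itself is not reproduced here. As a loosely related
illustration only, the first step of any such conversion is pulling the
module, symbol, offset and address out of a line of the current errbacktrace
output. A minimal sketch, assuming the "module(symbol+0xoff) [0xaddr]" layout
seen earlier in this thread; struct bt_frame and parse_bt_line() are
hypothetical names:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct bt_frame
{
	char		module[256];	/* text before the last '(' */
	char		symbol[128];	/* empty when the function is not exported */
	uint64_t	offset;			/* hex value after '+' */
	uint64_t	address;		/* absolute address inside "[...]" */
};

/*
 * Parse one line shaped like "module(symbol+0xoff) [0xaddr]".
 * Works from the end of the line, because the module part may itself
 * contain brackets (e.g. a process title with "[local]").
 *
 * Returns 0 on success, -1 if the line does not match.
 */
static int
parse_bt_line(const char *line, struct bt_frame *f)
{
	const char *lbracket = strrchr(line, '[');
	const char *lparen;
	const char *rparen;
	const char *plus;
	size_t		len;

	if (lbracket == NULL)
		return -1;

	/* find the last '(' that precedes the address bracket */
	lparen = lbracket;
	while (lparen > line && *lparen != '(')
		lparen--;
	if (*lparen != '(')
		return -1;

	rparen = strchr(lparen, ')');
	if (rparen == NULL || rparen > lbracket)
		return -1;

	plus = memchr(lparen, '+', (size_t) (rparen - lparen));
	if (plus == NULL)
		return -1;

	len = (size_t) (lparen - line);
	if (len >= sizeof(f->module))
		return -1;
	memcpy(f->module, line, len);
	f->module[len] = '\0';

	len = (size_t) (plus - (lparen + 1));
	if (len >= sizeof(f->symbol))
		return -1;
	memcpy(f->symbol, lparen + 1, len);
	f->symbol[len] = '\0';

	/* base 16 accepts the leading "0x" */
	f->offset = strtoull(plus + 1, NULL, 16);
	f->address = strtoull(lbracket + 1, NULL, 16);
	return 0;
}
```

For frames with an empty symbol, the offset is relative to the module, so it
could plausibly be handed to a tool such as addr2line together with the path
of the binary. For the postgres frames above that path has to come from
elsewhere, since the module field holds the process title rather than a file
name.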

Attachments:

errbacktrace2line (text/plain; charset=us-ascii)

#11 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Noah Misch (#10)
Re: Optionally using a better backtrace library?

On 2023-Sep-04, Noah Misch wrote:

On Mon, Jul 03, 2023 at 11:58:25AM +0200, Alvaro Herrera wrote:

Agreed, these backtraces are pretty close to useless. Not completely,
but I haven't found a practical way to use them for actual debugging
of production problems.

For what it's worth, I use the attached script to convert the current
errbacktrace output to a fully-symbolized backtrace.

Much appreciated! I can put this to good use.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

#12 Peter Geoghegan
In reply to: Alvaro Herrera (#11)
Re: Optionally using a better backtrace library?

On Tue, Sep 5, 2023 at 2:59 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Much appreciated! I can put this to good use.

I was just reminded of how our existing backtrace support is lacklustre.

Are you planning on submitting a patch for this?

--
Peter Geoghegan