elog/ereport VS misleading backtrace_function function address

Started by Jakub Wartakalmost 2 years ago9 messages

jakub.wartak@enterprisedb.com

almost 2 years ago

Hi -hackers,

While chasing some other bug I've learned that backtrace_functions
might be misleading with top elog/ereport() address.

Reproducer:

# using Tom's reproducer on master:
wget /messages/by-id/attachment/112394/ri-collation-bug-example.sql
echo '' >> ri-collation-bug-example.sql
echo '\errverbose' >> ri-collation-bug-example.sql
-- based on grepping the source code locations many others could go in here:
psql -p 5437 -c "alter system set backtrace_functions =
'RI_FKey_cascade_del,get_collation_isdeterministic';"
psql -p 5437 -c "select pg_reload_conf();"
dropdb -p 5437 test1 ; createdb -p 5437 test1 ; psql -p 5437 -d test1
-f ri-collation-bug-example.sql

gives (note "get_collation_isdeterministic"):
psql:ri-collation-bug-example.sql:42: ERROR: cache lookup failed
for collation 0
psql:ri-collation-bug-example.sql:44: error: ERROR: XX000: cache
lookup failed for collation 0
LOCATION: get_collation_isdeterministic, lsyscache.c:1088

and in log note the 0x14c6bb:
2024-03-27 14:39:13.065 CET [12899] postgres@test1 ERROR: cache
lookup failed for collation 0
2024-03-27 14:39:13.065 CET [12899] postgres@test1 BACKTRACE:
postgres: 16/main: postgres test1 [local] DELETE(+0x14c6bb)
[0x5565c5a9c6bb]
postgres: 16/main: postgres test1 [local]
DELETE(RI_FKey_cascade_del+0x323) [0x5565c5ec0973]
postgres: 16/main: postgres test1 [local] DELETE(+0x2d401f)
[0x5565c5c2401f]
postgres: 16/main: postgres test1 [local] DELETE(+0x2d5a04)
[0x5565c5c25a04]
postgres: 16/main: postgres test1 [local]
DELETE(AfterTriggerEndQuery+0x78) [0x5565c5c2a918]
[..]
2024-03-27 14:39:13.065 CET [12899] postgres@test1 STATEMENT: delete
from revisions where id='5c617ce7-688d-4bea-9d66-c0f0ebc635da';

based on \errverbose it is OK (error matches the code, thanks to
Alvaro for this hint):

9 bool
8 get_collation_isdeterministic(Oid colloid)
7 {
6 │ HeapTuple> tp;
5 │ Form_pg_collation colltup;
4 │ bool> > result;
3 │
2 │ tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(colloid));
1 │ if (!HeapTupleIsValid(tp))
1088 │ │ elog(ERROR, "cache lookup failed for collation %u", colloid);
[..]

but based on backtrace address 0x14c6bb (!) and it resolves
differently which seems to be highly misleading (note the
"get_language_name.cold"):

$ addr2line -e /usr/lib/postgresql/16/bin/postgres -a -f 0x14c6bb
0x000000000014c6bb
get_language_name.cold
./build/src/backend/utils/cache/./build/../src/backend/utils/cache/lsyscache.c:1181

When i disassemble the get_collation_isdeterministic() (and I know the
name from \errverbose):

Dump of assembler code for function get_collation_isdeterministic:
Address range 0x5c7680 to 0x5c76c1:
0x00000000005c7680 <+0>: push %rbp
[..]
0x00000000005c7694 <+20>: call 0x5d63b0 <SearchSysCache1>
0x00000000005c7699 <+25>: test %rax,%rax
0x00000000005c769c <+28>: je 0x14c686
<get_collation_isdeterministic.cold>
0x00000000005c76a2 <+34>: mov %rax,%rdi
[..]
0x00000000005c76bf <+63>: leave
0x00000000005c76c0 <+64>: ret
Address range 0x14c686 to 0x14c6bb:
0x000000000014c686 <-4698106>: xor %esi,%esi
0x000000000014c688 <-4698104>: mov $0x15,%edi
0x000000000014c68d <-4698099>: call 0x14ec86 <errstart_cold>
0x000000000014c692 <-4698094>: mov %r12d,%esi
0x000000000014c695 <-4698091>: lea 0x5028dc(%rip),%rdi
# 0x64ef78
0x000000000014c69c <-4698084>: xor %eax,%eax
0x000000000014c69e <-4698082>: call 0x5df320 <errmsg_internal>
0x000000000014c6a3 <-4698077>: lea 0x6311a6(%rip),%rdx
# 0x77d850 <__func__.34>
0x000000000014c6aa <-4698070>: mov $0x440,%esi
0x000000000014c6af <-4698065>: lea 0x630c8a(%rip),%rdi
# 0x77d340
0x000000000014c6b6 <-4698058>: call 0x5df100 <errfinish>
<< NOTE the exact 0x14c6bb is even missing here(!)

notice the interesting thing here: according to GDB
get_collation_isdeterministic() is @ 0x5c7680 + jump to 0x14c686
<get_collation_isdeterministic.cold> till 0x14c6bb (but without it)
without any stack setup for that .cold. But backtrace() just captured
the elog/ereport (cold) at the end of 0x14c6bb instead, so if I take
that exact address from backtrace_functions output as it is
("DELETE(+0x14c6bb)"), it also shows WRONG function (just like
addr2line):

(gdb) disassemble 0x14c6bb
Dump of assembler code for function get_language_name:
Address range 0x5c7780 to 0x5c77ee:
[..]
0x00000000005c77ed <+109>: ret
Address range 0x14c6bb to 0x14c6f0:
0x000000000014c6bb <-4698309>: xor %esi,%esi
[..]
0x000000000014c6eb <-4698261>: call 0x5df100 <errfinish>
End of assembler dump.

so this is wrong (as are failing in "get_collation_isdeterministic"
not in "get_language_name").

I was very confused at the beginning with the main question being: why
are we compiling elog/ereport() so that it is incompatible with
backtrace? When looking for it I've found two threads [1]/messages/by-id/CAApHDvoWoxtbWiqZxrhO+i9NoG56AWHDzuDDD+1OEf4PxzFhig@mail.gmail.com by David
that were actually helpful in understanding that this was done for
performance (unlikley(), cold attribute and similiar type of
discussions). Now my thought is that for >= ERROR in ereport_domain we
could add something more (?) before pg_unreachable() that would help
the backtrace to resolve the symbol correctly

If I try non-portable PoC with x86 nop instruction:

--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -145,8 +145,10 @@ struct Node;
                        errstart_cold(elevel, domain) : \
                        errstart(elevel, domain)) \
                        __VA_ARGS__, errfinish(__FILE__, __LINE__, __func__); \
-               if (__builtin_constant_p(elevel) && (elevel) >= ERROR) \
+               if (__builtin_constant_p(elevel) && (elevel) >= ERROR) { \
+                       __asm__ __volatile__("nop"); \
                        pg_unreachable(); \
+               } \

it partially works and the address can be now properly resolved!

Somewhat backtrace() still is unable to lookup the to do so in log itself:
postgres: postgres test1 [local] DELETE(+0x15f84c) [0x55703bf5284c]
postgres: postgres test1 [local]
DELETE(RI_FKey_cascade_del+0x2b0) [0x55703c34c850]

but at least:

postgres@hive:~$ addr2line -e /usr/pgsql17/bin/postgres -a -f 0x15f84c
0x000000000015f84c
get_collation_isdeterministic
/git/postgres/src/backend/utils/cache/lsyscache.c:1062 (discriminator 4)
postgres@hive:~$

in gdb it will point right to our new nop:
0x000000000015f840 <-4490944>: lea 0x608306(%rip),%rdi
# 0x767b4d
0x000000000015f847 <-4490937>: call 0x5be1d0 <errfinish>
0x000000000015f84c <-4490932>: nop

Thoughts? Does it make sense to post a patch (for pg18?)? How to do it
in $arch-independent way? I've tried also to try to find a way with
e.g. -rdynamic to show that real function name, but it looks like
without some more serious unwind library it seems unrealistic (?) to
resolve that get_collation_isdeterministic.cold

-J.

[1]: /messages/by-id/CAApHDvoWoxtbWiqZxrhO+i9NoG56AWHDzuDDD+1OEf4PxzFhig@mail.gmail.com
[2]: /messages/by-id/CAKJS1f8yqRW3qx2CO9r4bqqvA2Vx68=3awbh8CJWTP9zXeoHMw@mail.gmail.com

Tom Lane

tgl@sss.pgh.pa.us

almost 2 years ago

In reply to: Jakub Wartak (#1)

Re: elog/ereport VS misleading backtrace_function function address

Jakub Wartak <jakub.wartak@enterprisedb.com> writes:

While chasing some other bug I've learned that backtrace_functions
might be misleading with top elog/ereport() address.

That was understood from the beginning: this type of backtrace is
inherently pretty imprecise, and I doubt there is much that can
be done to make it better. IIRC the fundamental problem is it only
looks at global symbols, so static functions inherently defeat it.
It was argued that this is better than nothing, which is true, but
you have to take the info with a mountain of salt.

I recall speculating about whether we could somehow invoke gdb
to get a more comprehensive and accurate backtrace, but I don't
really have a concrete idea how that could be made to work.

regards, tom lane

Jakub Wartak

jakub.wartak@enterprisedb.com

over 1 year ago

In reply to: Tom Lane (#2)

1 attachment(s)

Re: elog/ereport VS misleading backtrace_function function address

Hi Tom and -hackers!

On Thu, Mar 28, 2024 at 7:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jakub Wartak <jakub.wartak@enterprisedb.com> writes:

While chasing some other bug I've learned that backtrace_functions
might be misleading with top elog/ereport() address.

That was understood from the beginning: this type of backtrace is
inherently pretty imprecise, and I doubt there is much that can
be done to make it better. IIRC the fundamental problem is it only
looks at global symbols, so static functions inherently defeat it.
It was argued that this is better than nothing, which is true, but
you have to take the info with a mountain of salt.

OK, point taken, thanks for describing the limitations, still I find
backtrace_functions often the best thing we have primarily due its
simplicity and ease of use (for customers). I found out simplest
portable way to generate NOP (or any other instruction that makes the
problem go away):

with the reproducer, psql returns:

psql:ri-collation-bug-example.sql:48: error: ERROR: XX000: cache
lookup failed for collation 0
LOCATION: get_collation_isdeterministic, lsyscache.c:1062

logfile without patch:

2024-05-07 09:05:43.043 CEST [44720] ERROR: cache lookup failed for collation 0
2024-05-07 09:05:43.043 CEST [44720] BACKTRACE:
postgres: postgres postgres [local] DELETE(+0x155883) [0x55ce5a86a883]
postgres: postgres postgres [local]
DELETE(RI_FKey_cascade_del+0x2b0) [0x55ce5ac6dfd0]
postgres: postgres postgres [local] DELETE(+0x2d488b) [0x55ce5a9e988b]
[..]

$ addr2line -e /usr/pgsql18/bin/postgres -a -f 0x155883
0x0000000000155883
get_constraint_type.cold <<<<<< so it's wrong as the real function
should be get_collation_isdeterministic

logfile with attached patch:

2024-05-07 09:11:06.596 CEST [51185] ERROR: cache lookup failed for collation 0
2024-05-07 09:11:06.596 CEST [51185] BACKTRACE:
postgres: postgres postgres [local] DELETE(+0x168bf0) [0x560e1cdfabf0]
postgres: postgres postgres [local]
DELETE(RI_FKey_cascade_del+0x2b0) [0x560e1d200c00]
postgres: postgres postgres [local] DELETE(+0x2e90fb) [0x560e1cf7b0fb]
[..]

$ addr2line -e /usr/pgsql18/bin/postgres -a -f 0x168bf0
0x0000000000168bf0
get_collation_isdeterministic.cold <<< It's ok with the patch

NOTE: in case one will be testing this: one cannot ./configure with
--enable-debug as it prevents the compiler optimizations that actually
end up with the ".cold" branch optimizations that cause backtrace() to
return the wrong address.

I recall speculating about whether we could somehow invoke gdb
to get a more comprehensive and accurate backtrace, but I don't
really have a concrete idea how that could be made to work.

Oh no, I'm more about micro-fix rather than doing some bigger
revolution. The patch simply adds this one instruction in macro, so
that now backtrace_functions behaves better:

0x0000000000773d28 <+105>: call 0x79243f <errfinish>
0x0000000000773d2d <+110>: movzbl -0x12(%rbp),%eax << this ends
up being added by the patch
0x0000000000773d31 <+114>: call 0xdc1a0 <abort@plt>

I'll put that as for PG18 in CF, but maybe it could be backpatched too
- no hard opinion on that (I don't see how that ERROR/FATAL path could
cause any performance regressions)

-J.

Attachments:

v1-0001-Tweak-ereport-so-that-it-generates-proper-address.patchapplication/octet-stream; name=v1-0001-Tweak-ereport-so-that-it-generates-proper-address.patchDownload

From 69a38692cc3d9066f2750577cfdcbe5f6e02d2a5 Mon Sep 17 00:00:00 2001
From: Jakub Wartak <jakub.wartak@enterprisedb.com>
Date: Tue, 7 May 2024 09:23:41 +0200
Subject: [PATCH v1] Tweak ereport so that it generates proper addresses for
 .cold branches

Without this some ereport() could cause confusion when used with
backtrace_functions as they would point to wrong addresses and thus to wrong
functions being visible in logfile.
---
 src/include/utils/elog.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 054dd2bf62..f4f6e8d42d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -145,8 +145,11 @@ struct Node;
 			errstart_cold(elevel, domain) : \
 			errstart(elevel, domain)) \
 			__VA_ARGS__, errfinish(__FILE__, __LINE__, __func__); \
-		if (__builtin_constant_p(elevel) && (elevel) >= ERROR) \
+		if (__builtin_constant_p(elevel) && (elevel) >= ERROR) { \
+			volatile char fix_backtrace_addr; \
+			(void) fix_backtrace_addr; \
 			pg_unreachable(); \
+		} \
 	} while(0)
 #else							/* !HAVE__BUILTIN_CONSTANT_P */
 #define ereport_domain(elevel, domain, ...)	\
@@ -155,8 +158,11 @@ struct Node;
 		pg_prevent_errno_in_scope(); \
 		if (errstart(elevel_, domain)) \
 			__VA_ARGS__, errfinish(__FILE__, __LINE__, __func__); \
-		if (elevel_ >= ERROR) \
+		if (elevel_ >= ERROR) {\
+			volatile char fix_backtrace_addr; \
+			(void) fix_backtrace_addr; \
 			pg_unreachable(); \
+		} \
 	} while(0)
 #endif							/* HAVE__BUILTIN_CONSTANT_P */
 
-- 
2.30.2

Peter Eisentraut

peter@eisentraut.org

over 1 year ago

In reply to: Jakub Wartak (#3)

Re: elog/ereport VS misleading backtrace_function function address

On 07.05.24 09:43, Jakub Wartak wrote:

NOTE: in case one will be testing this: one cannot ./configure with
--enable-debug as it prevents the compiler optimizations that actually
end up with the ".cold" branch optimizations that cause backtrace() to
return the wrong address.

Is that configuration useful? If you're interested in backtraces,
wouldn't you also want debug symbols? Don't production builds use debug
symbols nowadays as well?

I recall speculating about whether we could somehow invoke gdb
to get a more comprehensive and accurate backtrace, but I don't
really have a concrete idea how that could be made to work.

Oh no, I'm more about micro-fix rather than doing some bigger
revolution. The patch simply adds this one instruction in macro, so
that now backtrace_functions behaves better:

0x0000000000773d28 <+105>: call 0x79243f <errfinish>
0x0000000000773d2d <+110>: movzbl -0x12(%rbp),%eax << this ends
up being added by the patch
0x0000000000773d31 <+114>: call 0xdc1a0 <abort@plt>

I'll put that as for PG18 in CF, but maybe it could be backpatched too
- no hard opinion on that (I don't see how that ERROR/FATAL path could
cause any performance regressions)

I'm missing a principled explanation of what this does. I just see that
it sprinkles some no-op code to make this particular thing work a bit
differently, but this might just behave differently with the next
compiler next year. I'd like to see a bit more of an abstract
explanation of what is happening here.

Jakub Wartak

jakub.wartak@enterprisedb.com

over 1 year ago

In reply to: Peter Eisentraut (#4)

Re: elog/ereport VS misleading backtrace_function function address

Hi Peter!

On Sun, May 12, 2024 at 10:33 PM Peter Eisentraut <peter@eisentraut.org> wrote:

On 07.05.24 09:43, Jakub Wartak wrote:

NOTE: in case one will be testing this: one cannot ./configure with
--enable-debug as it prevents the compiler optimizations that actually
end up with the ".cold" branch optimizations that cause backtrace() to
return the wrong address.

Is that configuration useful? If you're interested in backtraces,
wouldn't you also want debug symbols?

The use case here is that backtrace_functions is unreliable when we
ask customers on production builds to use it (so it's useful not just
for local tests)

Don't production builds use debug
symbols nowadays as well?

Reality is apparently mixed,at least from what I have checked :
- all RHEL 7.x/8.x (PGDG and our forks) do NOT come with
--enable-debug according to pg_config.
- on Debian 11/12 PGDG does come with --enable-debug

I recall speculating about whether we could somehow invoke gdb
to get a more comprehensive and accurate backtrace, but I don't
really have a concrete idea how that could be made to work.

Oh no, I'm more about micro-fix rather than doing some bigger
revolution. The patch simply adds this one instruction in macro, so
that now backtrace_functions behaves better:

0x0000000000773d28 <+105>: call 0x79243f <errfinish>
0x0000000000773d2d <+110>: movzbl -0x12(%rbp),%eax << this ends
up being added by the patch
0x0000000000773d31 <+114>: call 0xdc1a0 <abort@plt>

I'll put that as for PG18 in CF, but maybe it could be backpatched too
- no hard opinion on that (I don't see how that ERROR/FATAL path could
cause any performance regressions)

I'm missing a principled explanation of what this does. I just see that
it sprinkles some no-op code to make this particular thing work a bit
differently, but this might just behave differently with the next
compiler next year. I'd like to see a bit more of an abstract
explanation of what is happening here.

OK I'll try to explain using assembly, but I'm not an expert on this.
Let's go to the 1st post, assume we run with backtrace_function set:

get_collation_isdeterministic() 0x5c7680 to 0x5c76c1:
might jump(!) to 0x14c686 <get_collation_isdeterministic.cold>
note the two completely different address ranges (hot and separate cold)
it's because of 913ec71d682 that added the cold branches
optimization via pg_attribute_cold to errstart_cold

Dump of assembler code for function get_collation_isdeterministic:
Address range 0x5c7680 to 0x5c76c1:
0x00000000005c7680 <+0>: push %rbp
[..]
0x00000000005c7694 <+20>: call 0x5d63b0 <SearchSysCache1>
0x00000000005c7699 <+25>: test %rax,%rax
0x00000000005c769c <+28>: je 0x14c686
<get_collation_isdeterministic.cold>
[..]
0x00000000005c76b3 <+51>: call 0x5d6430 <ReleaseSysCache>
0x00000000005c76b8 <+56>: mov %r12d,%eax
0x00000000005c76bb <+59>: mov -0x8(%rbp),%r12
0x00000000005c76bf <+63>: leave
0x00000000005c76c0 <+64>: ret
Address range 0x14c686 to 0x14c6bb:
// note here that the last
instruction is 0x14c6b6 not 0x14c6bb(!)
// note this cold path also has no
frame pointer setup
0x000000000014c686 <-4698106>: xor %esi,%esi
0x000000000014c688 <-4698104>: mov $0x15,%edi
0x000000000014c68d <-4698099>: call 0x14ec86 <errstart_cold>
0x000000000014c692 <-4698094>: mov %r12d,%esi
0x000000000014c695 <-4698091>: lea 0x5028dc(%rip),%rdi
# 0x64ef78
0x000000000014c69c <-4698084>: xor %eax,%eax
0x000000000014c69e <-4698082>: call 0x5df320 <errmsg_internal>
0x000000000014c6a3 <-4698077>: lea 0x6311a6(%rip),%rdx
# 0x77d850 <__func__.34>
0x000000000014c6aa <-4698070>: mov $0x440,%esi
0x000000000014c6af <-4698065>: lea 0x630c8a(%rip),%rdi
# 0x77d340
0x000000000014c6b6 <-4698058>: call 0x5df100 <errfinish>
End of assembler dump.

so it's
get_collation_isdeterministic() ->
get_collation_isdeterministic.cold() [but not real function ->
no proper fp/stack?] ->
.. ->
errfinish() ->
set_backtrace() // just builds and appends string
using backtrace()/backtrace_functions()/backtrace_symbol_list().
prints/logs, finishes

It seems that the thing is that the limited GNU libc
backtrace_symbol_list() won't resolve the 0x14c6b6..0x14c6bb to the
"get_collation_isdeterministic[.cold]" symbol name and it will just
simply put 0x14c6bb in that the text asm. It's wrong and it is
confusing:

2024-03-27 14:39:13.065 CET [12899] postgres(at)test1 ERROR: cache
lookup failed for collation 0
2024-03-27 14:39:13.065 CET [12899] postgres(at)test1 BACKTRACE:
postgres: 16/main: postgres test1 [local] DELETE(+0x14c6bb)
[0x5565c5a9c6bb]
postgres: 16/main: postgres test1 [local]
DELETE(RI_FKey_cascade_del+0x323) [0x5565c5ec0973]
[..]

if you now take addr2line for 0x14c6bb it will be resolved to the next
assembly AFTER where it had happened: we did blow up in
<0x14c686..0x14c6bb), but it put 0x14c6bb as the IP:

$ addr2line -e /usr/lib/postgresql/16/bin/postgres -a -f 0x14c6bb
0x000000000014c6bb
get_language_name.cold

^ it's wrong,! but GDB also gets the same (GIGO):

(gdb) disassemble 0x14c6bb
Dump of assembler code for function get_language_name:
Address range 0x5c7780 to 0x5c77ee:
[..]
0x00000000005c77da <+90>: je 0x14c6bb <get_language_name.cold>
[..]
0x00000000005c77ed <+109>: ret
Address range 0x14c6bb to 0x14c6f0: // <<< HERE
0x000000000014c6bb <-4698309>: xor %esi,%esi
[..]

So something was saved in the wrong way (fp?), however adding that 1
trivial NOP instruction between errfinish() and pg_unreachable() where
afterwards somehow it returns, does make the returned address (using
addr2line) resolve to the proper function. My limited understanding is
that we simply pushed the compiler into generating that cold path
(that has no %ebp/%rbp frame pointer assembly) followed by no-exit in
asm code that prevents backtrace() from working. So it looks like we
are playing hard at micro-optimizations (cold branch optimizations,
again 913ec71d682), we provide backtrace_functions GUC and we provide
various packaging ways, and that somehow bite me. The backtrace(3)
says `Omission of the frame pointers (as implied by any of gcc(1)'s
nonzero optimization levels) may cause these assumptions to be
violated.`

-J.

Alvaro Herrera

alvherre@alvh.no-ip.org

over 1 year ago

In reply to: Jakub Wartak (#5)

Re: elog/ereport VS misleading backtrace_function function address

On 2024-May-14, Jakub Wartak wrote:

On Sun, May 12, 2024 at 10:33 PM Peter Eisentraut <peter@eisentraut.org> wrote:

Don't production builds use debug
symbols nowadays as well?

Reality is apparently mixed,at least from what I have checked :
- all RHEL 7.x/8.x (PGDG and our forks) do NOT come with
--enable-debug according to pg_config.

Ooh, yeah, that's true according to
https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16.spec;h=ab2f6edc903f083e04b8f7a1d3bad8e1b7018791;hb=1a8b9fa7019d3f73322ca873b62dc0b33e73ed1d

507 %if %beta
508 --enable-debug \
509 --enable-cassert \
510 %endif

Maybe a better approach for this whole thing is to change the specs so
that --enable-debug is always given, not just for %beta.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Alvaro Herrera (#6)

Re: elog/ereport VS misleading backtrace_function function address

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

On 2024-May-14, Jakub Wartak wrote:

Reality is apparently mixed,at least from what I have checked :
- all RHEL 7.x/8.x (PGDG and our forks) do NOT come with
--enable-debug according to pg_config.

Ooh, yeah, that's true according to
https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16.spec;h=ab2f6edc903f083e04b8f7a1d3bad8e1b7018791;hb=1a8b9fa7019d3f73322ca873b62dc0b33e73ed1d

507 %if %beta
508 --enable-debug \
509 --enable-cassert \
510 %endif

Maybe a better approach for this whole thing is to change the specs so
that --enable-debug is always given, not just for %beta.

My recollection from my time at Red Hat is that their standard policy
is to build everything with debug symbols all the time; so this is
violating that policy, and we should change it just on those grounds.
However, I'm not sure how much the change will help Joe Average User
with respect to the thread topic. RH actually has infrastructure that
splits the debug symbol tables out into separate "debuginfo" RPMs,
which you have to install manually if you want to debug a particular
package. This is good for disk space consumption, but it means that
most users are still only going to see the same backtrace they see
currently.

I don't know how much of that applies to, eg, Debian.

regards, tom lane

Robert Haas

robertmhaas@gmail.com

over 1 year ago

In reply to: Jakub Wartak (#5)

Re: elog/ereport VS misleading backtrace_function function address

On Tue, May 14, 2024 at 6:13 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:

OK I'll try to explain using assembly, but I'm not an expert on this.
Let's go to the 1st post, assume we run with backtrace_function set:

I feel like this explanation doesn't really explain very much. I mean,
the question is not "how is it that adding a nop instruction fixes
anything?" but "is adding a nop instruction a principled way of fixing
things, and if so, for what reason?". And as far as I can see, you
haven't answered that question anywhere. Unless we really understand
why the results are better with that nop instruction added, we can't
possibly have any confidence that this is anything other than random
good fortune, which isn't a sufficiently good reason to make a change.
And, while I'm no expert on this, I suspect that it is mostly just
random good fortune -- the fact that inserting a volatile variable
declaration here solved the problem seems like something that could
easily fail to work out on another platform or compiler or compiler
version. I also think it's the wrong idea on principle to insert junk
declarations into our code to try to get good backtraces.

I think the right answer here is probably what Alvaro said, namely,
that people have to have the debug symbols installed if they want to
get backtraces. Tom is probably correct when he says that there's
nothing we can do to ensure that users end up with debug symbols in
all cases, but that doesn't mean it's the wrong solution. It's at
least understandable why it works.

--
Robert Haas
EDB: http://www.enterprisedb.com

Jakub Wartak

jakub.wartak@enterprisedb.com

over 1 year ago

In reply to: Robert Haas (#8)

1 attachment(s)

Re: elog/ereport VS misleading backtrace_function function address

Hi Robert,

On Tue, May 14, 2024 at 5:36 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, May 14, 2024 at 6:13 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:

OK I'll try to explain using assembly, but I'm not an expert on this.
Let's go to the 1st post, assume we run with backtrace_function set:

I feel like this explanation doesn't really explain very much. I mean,
the question is not "how is it that adding a nop instruction fixes
anything?" but "is adding a nop instruction a principled way of fixing
things, and if so, for what reason?".

Short: IMHO the backtrace() is reliable and cannot be blamed here,
it's just that the __attribute__((cold)) with __builtin_unreachable
prevents backtrace() from retrieving proper data from the stack.

Long: using compiler cold branches with __builtin_unreachable() causes
the compiler to generate specific code that is misleading GNU libc's
backtrace() and tools like addr2line/gdb/objdump. The exact
instruction that causes it is that CALL assembly stores the saved RIP
onto the stack frame. So, if you have something like this (that's from
attached trimmed C semi-reproducer that tries to mimic what PG is
doing with ereport() macro and cold path -- NOTE: not exact match for
instructions, but good enough to see problems related with it):

$ cat bt.c
[..]
int main() {
int r;
printf("starting f(), NOT yet in somemisleading_another_f() !\n");
r = f();
printf("%d\n", r*2);
printf("starting somemisleading_another_f()\n");
r = somemisleading_another_f();
[..]

$ gcc-12 -Wall -rdynamic -g -O2 bt.c -o bt
$ ./bt
starting f(), NOT yet in somemisleading_another_f() !
err_cold(): 3
errstatic(): 3
./bt(dump_current_backtrace+0x35) [0x564fbb25d2d5]
./bt(err_cold+0x2f) [0x564fbb25d11f]
./bt(+0x1133) [0x564fbb25d133]
./bt(+0x1144) [0x564fbb25d144]
./bt(main+0x23) [0x564fbb25d173]
/lib/x86_64-linux-gnu/libc.so.6(+0x2724a) [0x7fba4028624a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7fba40286305]
./bt(_start+0x21) [0x564fbb25d1d1]

finishing demo() macro

$ addr2line -e ./bt -f 0x1133
f
/root/bt.c:84

$ addr2line -e ./bt -f 0x1144
somemisleading_another_f
/root/bt.c:84

NOTE here that the code has NOT called somemisleading_another_f() at
all, but backtrace() is going to fetch 0x1144, because such RIP value
was stored by the CALL. In asm, it looks like this, in the brackets
I've put sequence of instructions:

0000000000001125 <f.part.0>:
1125: 55 push %rbp [4]
1126: bf 03 00 00 00 mov $0x3,%edi [5]
112b: 48 89 e5 mov %rsp,%rbp [6]
112e: e8 bd ff ff ff call 10f0 <err_cold> [7 -->
.. backtrace()]
1133: bf 03 00 00 00 mov $0x3,%edi
1138: e8 73 02 00 00 call 13b0 <err_finish>

000000000000113d <f.cold>:
113d: 31 c0 xor %eax,%eax [2]
113f: e8 e1 ff ff ff call 1125 <f.part.0> [3 -->
this is going to store 0x113f+4b = 0x1144 as saved RIP onto stack
frame]

0000000000001144 <somemisleading_another_f.cold>:
1144: 31 c0 xor %eax,%eax
1146: e8 da ff ff ff call 1125 <f.part.0>
114b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)

[..]
00000000000013e0 <f>:
[..]
13ed: e8 7e fc ff ff call 1070 <getpid@plt>
13f2: a8 01 test $0x1,%al
13f4: 0f 85 43 fd ff ff jne 113d <f.cold> [1]

So here, just like in PostgreSQL, in the f.cold the "CALL"@0x113f is
going to save RIP of the following instruction (0x1144), but that's
from a different (next) function in address layout. So when
backtrace() is going to retrieve that stack frame, it's going to see
something belonging to a function that could not even be called
physically. The thing is that path is never going to return (and
that's also caused by usage of __builtin_unreachable()), so for the
compiler there's no need to generate any other instruction after such
CALL. There's no such need, of course unless you want to have valid
backtrace() output. Storing literally any instruction after such CALL,
causes the savedRIP in the stack frame to point back to the proper
origin function.

So again, with on fresh PG18 today (compiled with: ./configure
--prefix=/usr/pgsql18 ), reproducer gives shows location error as
get_collation_isdeterministic() in psql:

psql:ri-collation-bug-example.sql:42: ERROR: cache lookup failed for
collation 0
psql:ri-collation-bug-example.sql:44: error: ERROR: XX000: cache
lookup failed for collation 0
LOCATION: get_collation_isdeterministic, lsyscache.c:1062 // \errverbose

it's evident that backtrace_functions produces garbage:

2024-08-26 10:00:22.317 CEST [39311] STATEMENT: delete from revisions
where id='5c617ce7-688d-4bea-9d66-c0f0ebc635da';
2024-08-26 10:00:45.216 CEST [39325] ERROR: cache lookup failed for collation 0
2024-08-26 10:00:45.216 CEST [39325] BACKTRACE:
postgres: postgres test1 [local] DELETE(+0x155898) [0x55e9161be898]
postgres: postgres test1 [local]
DELETE(RI_FKey_cascade_del+0x2a9) [0x55e9165c2479]
postgres: postgres test1 [local] DELETE(+0x2d7e1f) [0x55e916340e1f]
[..]

so what's the 0x155898 ? We get:
$ addr2line -e /usr/pgsql18/bin/postgres -a -f 0x155898
0x0000000000155898
get_constraint_type.cold
lsyscache.c:?

That's wrong, we should get "get_collation_isdeterministic". Now with
the patch backtrace_functions produces:

2024-08-26 10:10:26.151 CEST [49283] ERROR: cache lookup failed for collation 0
2024-08-26 10:10:26.151 CEST [49283] BACKTRACE:
postgres: postgres test2 [local] DELETE(+0x16964f) [0x55dc64f4064f]
postgres: postgres test2 [local]
DELETE(RI_FKey_cascade_del+0x2a9) [0x55dc65343559]
[..]
2024-08-26 10:10:26.151 CEST [49283] STATEMENT: delete from revisions
where id='5c617ce7-688d-4bea-9d66-c0f0ebc635da';
$ addr2line -e /usr/pgsql18/bin/postgres -a -f 0x16964f
0x000000000016964f
get_collation_isdeterministic.cold
lsyscache.c:?

I think the right answer here is probably what Alvaro said, namely, that people have to have the debug symbols installed if they want to get backtraces.

Yes, the binary cannot be strip(1)-ed (ELF symbols table '.symtab'
cannot be emptied) for it to work OR the binary can be striped, but
then the debug symbols need to be installed as add-on in order to
decode the addresses, but that's standard thing today and not such
installation is not a problem in general.

Fun fact: one thing worth mentioning here is that I was plain wrong:
it is NOT --enable-debug that is causing - or not - generating .cold
functions. Those seem to be almost always mostly generated on modern
gcc compilers. E.g. Debian official PGDG packages come without ".cold"
function optimizations VISIBLE within the binary by default when using
normal objdump(1). objdump(1) was unable to resolve anything to have
".cold", but following assembly, it looked like those stubs were
present. Now using e.g. objdump --dwarf
--disassemble=get_collation_oid.cold (the key here is to use --dwarf),
I was able to get those function names with .cold . The --dwarf switch
actually starts reading the external ("/usr/lib/debug/.build-id/")
symbols and confirms they are there.

At first I thought that clang does not seem to be affected by this
when dealing with my toy bt.c (actually clang fixes it!), it looks
like it is not generating those ".cold" stubs at all. However the
issue is still the wrong output in PG even with clang ! (I've got the
wrong function output in log too, and again there was CALL as last
instruction, so it saved the wrong RIP of the follow-up instruction
too!). My org. patch does not fix it there, as sadly clang optimized
out my volatile char. To my disgust, what helped there was the below
thing:

--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -145,8 +145,11 @@ struct Node;
-               if (__builtin_constant_p(elevel) && (elevel) >= ERROR) \
+               if (__builtin_constant_p(elevel) && (elevel) >= ERROR) { \
+                       volatile char fix_backtrace_addr = 0x42; \
+                       fix_backtrace_addr = fix_backtrace_addr + 1; \
                        pg_unreachable(); \

It's kind of ugly hack, maybe some other hackers have better ideas.
Also I have not checked different archs that the x86_64 (and part of
the problem is that it needs a CPU-agnostic operand).

-J.