Should we add a compiler warning for large stack frames?
Hi,
d8f5acbdb9b made me wonder if we should add a compiler option to warn when
stack frames are large. gcc compatible compilers have -Wstack-usage=limit, so
that's not hard.
Huge stack frames are somewhat dangerous, as they can defeat our stack-depth
checking logic. There are also some cases where large stack frames defeat
stack-protector logic by compilers/libc/os.
It's not always obvious how large the stack will be. Even if you look at all
the sizes of the variables defined in a function, inlining can increase that
substantially.
Here are all the cases a limit of 64k finds.
Unoptimized build:
[138/1940 42 7%] Compiling C object src/common/libpgcommon_shlib.a.p/blkreftable.c.o
../../../../../home/andres/src/postgresql/src/common/blkreftable.c: In function 'WriteBlockRefTable':
../../../../../home/andres/src/postgresql/src/common/blkreftable.c:474:1: warning: stack usage is 65696 bytes [-Wstack-usage=]
474 | WriteBlockRefTable(BlockRefTable *brtab,
| ^~~~~~~~~~~~~~~~~~
[173/1940 42 8%] Compiling C object src/common/libpgcommon.a.p/blkreftable.c.o
../../../../../home/andres/src/postgresql/src/common/blkreftable.c: In function 'WriteBlockRefTable':
../../../../../home/andres/src/postgresql/src/common/blkreftable.c:474:1: warning: stack usage is 65696 bytes [-Wstack-usage=]
474 | WriteBlockRefTable(BlockRefTable *brtab,
| ^~~~~~~~~~~~~~~~~~
[281/1940 42 14%] Compiling C object src/common/libpgcommon_srv.a.p/blkreftable.c.o
../../../../../home/andres/src/postgresql/src/common/blkreftable.c: In function 'WriteBlockRefTable':
../../../../../home/andres/src/postgresql/src/common/blkreftable.c:474:1: warning: stack usage is 65696 bytes [-Wstack-usage=]
474 | WriteBlockRefTable(BlockRefTable *brtab,
| ^~~~~~~~~~~~~~~~~~
[1311/1940 42 67%] Compiling C object src/bin/pg_basebackup/pg_basebackup.p/pg_basebackup.c.o
../../../../../home/andres/src/postgresql/src/bin/pg_basebackup/pg_basebackup.c: In function 'BaseBackup':
../../../../../home/andres/src/postgresql/src/bin/pg_basebackup/pg_basebackup.c:1753:1: warning: stack usage might be 66976 bytes [-Wstack-usage=]
1753 | BaseBackup(char *compression_algorithm, char *compression_detail,
| ^~~~~~~~~~
[1345/1940 42 69%] Compiling C object src/bin/pg_verifybackup/pg_verifybackup.p/pg_verifybackup.c.o
../../../../../home/andres/src/postgresql/src/bin/pg_verifybackup/pg_verifybackup.c: In function 'verify_file_checksum':
../../../../../home/andres/src/postgresql/src/bin/pg_verifybackup/pg_verifybackup.c:842:1: warning: stack usage is 131232 bytes [-Wstack-usage=]
842 | verify_file_checksum(verifier_context *context, manifest_file *m,
| ^~~~~~~~~~~~~~~~~~~~
[1349/1940 42 69%] Compiling C object src/bin/pg_waldump/pg_waldump.p/pg_waldump.c.o
../../../../../home/andres/src/postgresql/src/bin/pg_waldump/pg_waldump.c: In function 'main':
../../../../../home/andres/src/postgresql/src/bin/pg_waldump/pg_waldump.c:792:1: warning: stack usage might be 105104 bytes [-Wstack-usage=]
792 | main(int argc, char **argv)
| ^~~~
[1563/1940 42 80%] Compiling C object contrib/pg_walinspect/pg_walinspect.so.p/pg_walinspect.c.o
../../../../../home/andres/src/postgresql/contrib/pg_walinspect/pg_walinspect.c: In function 'GetWalStats':
../../../../../home/andres/src/postgresql/contrib/pg_walinspect/pg_walinspect.c:762:1: warning: stack usage is 104624 bytes [-Wstack-usage=]
762 | GetWalStats(FunctionCallInfo fcinfo, XLogRecPtr start_lsn, XLogRecPtr end_lsn,
| ^~~~~~~~~~~
[1581/1940 42 81%] Compiling C object src/test/modules/test_dsa/test_dsa.so.p/test_dsa.c.o
../../../../../home/andres/src/postgresql/src/test/modules/test_dsa/test_dsa.c: In function 'test_dsa_resowners':
../../../../../home/andres/src/postgresql/src/test/modules/test_dsa/test_dsa.c:64:1: warning: stack usage is 80080 bytes [-Wstack-usage=]
64 | test_dsa_resowners(PG_FUNCTION_ARGS)
| ^~~~~~~~~~~~~~~~~~
There is one warning that is just visible in an optimized build,
otherwise the precise amount of stack usage just differs some:
[1165/2017 42 57%] Compiling C object src/backend/postgres_lib.a.p/tsearch_spell.c.o
../../../../../home/andres/src/postgresql/src/backend/tsearch/spell.c: In function 'NIImportAffixes':
../../../../../home/andres/src/postgresql/src/backend/tsearch/spell.c:1425:1: warning: stack usage might be 74080 bytes [-Wstack-usage=]
1425 | NIImportAffixes(IspellDict *Conf, const char *filename)
| ^~~~~~~~~~~~~~~
Warnings in src/bin aren't as interesting as warnings in backend code, as the
latter is much more likely to be "exposed" to deep stacks and could be
vulnerable due to stack overflows.
I did verify this would have warned about d8f5acbdb9b^:
[1/2 1 50%] Compiling C object src/backend/postgres_lib.a.p/backup_basebackup_incremental.c.o
../../../../../home/andres/src/postgresql/src/backend/backup/basebackup_incremental.c: In function 'GetFileBackupMethod':
../../../../../home/andres/src/postgresql/src/backend/backup/basebackup_incremental.c:742:1: warning: stack usage is 524400 bytes [-Wstack-usage=]
742 | GetFileBackupMethod(IncrementalBackupInfo *ib, const char *path,
| ^~~~~~~~~~~~~~~~~~~
I don't really have an opinion about the concrete warning limit to use.
Greetings,
Andres Freund
On 2024-04-11 Th 15:01, Andres Freund wrote:
Hi,
d8f5acbdb9b made me wonder if we should add a compiler option to warn when
stack frames are large. gcc compatible compilers have -Wstack-usage=limit, so
that's not hard.Huge stack frames are somewhat dangerous, as they can defeat our stack-depth
checking logic. There are also some cases where large stack frames defeat
stack-protector logic by compilers/libc/os.It's not always obvious how large the stack will be. Even if you look at all
the sizes of the variables defined in a function, inlining can increase that
substantially.Here are all the cases a limit of 64k finds.
[1345/1940 42 69%] Compiling C object src/bin/pg_verifybackup/pg_verifybackup.p/pg_verifybackup.c.o
../../../../../home/andres/src/postgresql/src/bin/pg_verifybackup/pg_verifybackup.c: In function 'verify_file_checksum':
../../../../../home/andres/src/postgresql/src/bin/pg_verifybackup/pg_verifybackup.c:842:1: warning: stack usage is 131232 bytes [-Wstack-usage=]
842 | verify_file_checksum(verifier_context *context, manifest_file *m,
| ^~~~~~~~~~~~~~~~~~~~
This one's down to me. I asked Robert some time back why we were using a
very conservative buffer size, and he agreed we could probably make it
larger, but the number chosen is mine, not his. It was a completely
arbitrary choice.
I'm happy to reduce it, but it's not clear to me why we care that much
for a client binary. There's no stack depth checking going on here.
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com
Hi,
On 2024-04-11 15:16:57 -0400, Andrew Dunstan wrote:
On 2024-04-11 Th 15:01, Andres Freund wrote:
[1345/1940 42 69%] Compiling C object src/bin/pg_verifybackup/pg_verifybackup.p/pg_verifybackup.c.o
../../../../../home/andres/src/postgresql/src/bin/pg_verifybackup/pg_verifybackup.c: In function 'verify_file_checksum':
../../../../../home/andres/src/postgresql/src/bin/pg_verifybackup/pg_verifybackup.c:842:1: warning: stack usage is 131232 bytes [-Wstack-usage=]
842 | verify_file_checksum(verifier_context *context, manifest_file *m,
| ^~~~~~~~~~~~~~~~~~~~This one's down to me. I asked Robert some time back why we were using a
very conservative buffer size, and he agreed we could probably make it
larger, but the number chosen is mine, not his. It was a completely
arbitrary choice.I'm happy to reduce it, but it's not clear to me why we care that much for a
client binary. There's no stack depth checking going on here.
There isn't - but it that doesn't necessarily mean it's great to just use a
lot of stack. It can even mean that you ought to be more careful, because
you'll just crash, rather than get an error about having exceeded the stack
size. When the stack size is increased by more than a few pages, the OS /
compiler defenses for detecting that don't work as well, if you're unlucky.
128k is probably not going to be an issue in practice. However, it also seems
not great from a performance POV to use this much stack in a function that's
called fairly often. I'd allocate the buffer in verify_backup_checksums() and
reuse it across all to-be-checked files.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
d8f5acbdb9b made me wonder if we should add a compiler option to warn when
stack frames are large. gcc compatible compilers have -Wstack-usage=limit, so
that's not hard.
Huge stack frames are somewhat dangerous, as they can defeat our stack-depth
checking logic. There are also some cases where large stack frames defeat
stack-protector logic by compilers/libc/os.
Indeed. I recall reading, not long ago, some Linux kernel docs to the
effect that automatic stack growth is triggered by a reference into
the page just below what is currently mapped as your stack, and
therefore allocating a stack frame greater than one page has the
potential to cause SIGSEGV rather than the desired stack extension.
(If you feel like digging in the archives, I think this was brought
up in the last round of lets-add-some-more-check_stack_depth-calls.)
Warnings in src/bin aren't as interesting as warnings in backend code, as the
latter is much more likely to be "exposed" to deep stacks and could be
vulnerable due to stack overflows.
Probably, but it's still the case that such code is relying on the
original stack allocation being large enough already.
I don't really have an opinion about the concrete warning limit to use.
Given the above, I'm tempted to say we should make it 8K. But I know
we have a bunch of places that allocate page images as stack space,
so maybe that'd be too painful.
regards, tom lane
Hi,
On 2024-04-11 16:35:58 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
d8f5acbdb9b made me wonder if we should add a compiler option to warn when
stack frames are large. gcc compatible compilers have -Wstack-usage=limit, so
that's not hard.Huge stack frames are somewhat dangerous, as they can defeat our stack-depth
checking logic. There are also some cases where large stack frames defeat
stack-protector logic by compilers/libc/os.Indeed. I recall reading, not long ago, some Linux kernel docs to the
effect that automatic stack growth is triggered by a reference into
the page just below what is currently mapped as your stack, and
therefore allocating a stack frame greater than one page has the
potential to cause SIGSEGV rather than the desired stack extension.
(If you feel like digging in the archives, I think this was brought
up in the last round of lets-add-some-more-check_stack_depth-calls.)
I think it's more than a single page, but I'm not entirely sure either. I
think some compilers inject artificial stack accesses when extending the stack
by a lot, but I don't remember the details.
There certainly was the issue that strict memory overcommit does not reliably
work with larger stack extensions.
Could be worth writing a test program for...
I don't really have an opinion about the concrete warning limit to use.
Given the above, I'm tempted to say we should make it 8K.
Hm, why 8k? That's 2x the typical page size, isn't it?
But I know we have a bunch of places that allocate page images as stack
space, so maybe that'd be too painful.
8k does generate a fair number of warnings, 111 here. I think it might also
be hard to ensure that inlining doesn't end up creating bigger stack frames.
frame size warnings
4096 155
8192 111
16384 36
32768 14
65536 8
Suggests that starting somewhere around 16-32k might be reasonable?
One issue is of course that configuring a larger blocksize or wal_blocksize
will trigger more warnings. I guess we'd need to set the limit based on
Max(blocksize, wal_blocksize) * 2 or such.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2024-04-11 16:35:58 -0400, Tom Lane wrote:
Indeed. I recall reading, not long ago, some Linux kernel docs to the
effect that automatic stack growth is triggered by a reference into
the page just below what is currently mapped as your stack, and
therefore allocating a stack frame greater than one page has the
potential to cause SIGSEGV rather than the desired stack extension.
I think it's more than a single page, but I'm not entirely sure either. I
think some compilers inject artificial stack accesses when extending the stack
by a lot, but I don't remember the details.
Hmm. You're right that I was misremembering the typical RAM page
size. The kernel must be allowing stack frames bigger than 4K,
or we'd see problems everywhere. I wonder how much bigger ...
frame size warnings
4096 155
8192 111
16384 36
32768 14
65536 8
Suggests that starting somewhere around 16-32k might be reasonable?
I'm hesitant to touch more than a handful of places on the strength
of the info we've got; either it's wasted work or it's not enough
work, and we don't know which. Might be time for some
experimentation.
regards, tom lane
Hi,
On 2024-04-11 15:07:11 -0700, Andres Freund wrote:
On 2024-04-11 16:35:58 -0400, Tom Lane wrote:
Indeed. I recall reading, not long ago, some Linux kernel docs to the
effect that automatic stack growth is triggered by a reference into
the page just below what is currently mapped as your stack, and
therefore allocating a stack frame greater than one page has the
potential to cause SIGSEGV rather than the desired stack extension.
(If you feel like digging in the archives, I think this was brought
up in the last round of lets-add-some-more-check_stack_depth-calls.)I think it's more than a single page, but I'm not entirely sure either. I
think some compilers inject artificial stack accesses when extending the stack
by a lot, but I don't remember the details.There certainly was the issue that strict memory overcommit does not reliably
work with larger stack extensions.Could be worth writing a test program for...
It looks like it's a mess.
In the good cases the kernel doesn't map anything within ulimit -R of the
stack, and the stack is extended whenever memory in that range is accessed.
Nothing is mapped into that region unless MAP_FIXED is used.
However, in some cases linux maps the heap and the stack fairly close to each
other at program startup. I've observed this with an executable compiled with
-static-pie and executed with randomization disabled (via setarch -R). In
that case the the layout at program start is
...
7ffff7fff000-7ffff8021000 rw-p 00000000 00:00 0 [heap]
7ffffffdd000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
Here the start of the heap and the end of the stack are only 128MB appart. The
heap grows upwards, the stack downwards.
Which means that if glibc allocates a bunch of memory via sbrk() and the stack
grows, they clash into each other.
I think this may be a glibc bug. If I compile with musl instead, this doesn't
happen, because musl stops using sbrk() style allocations before stack and
program break get too close to each other.
Greetings,
Andres Freund
On 2024-04-11 Th 16:17, Andres Freund wrote:
128k is probably not going to be an issue in practice. However, it also seems
not great from a performance POV to use this much stack in a function that's
called fairly often. I'd allocate the buffer in verify_backup_checksums() and
reuse it across all to-be-checked files.
Yes, I agree. I'll make that happen in the next day or two.
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com