Detecting use-after-free bugs using gcc's malloc() attribute

Started by Andres Freundalmost 3 years ago4 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

I played around with adding
__attribute__((malloc(free_func), malloc(another_free_func)))
annotations to a few functions in pg. See
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html

Adding them to pg_list.h seems to have found two valid issues when compiling
without optimization:

[1001/2331 22 42%] Compiling C object src/backend/postgres_lib.a.p/commands_tablecmds.c.o
../../../../home/andres/src/postgresql/src/backend/commands/tablecmds.c: In function ‘ATExecAttachPartition’:
../../../../home/andres/src/postgresql/src/backend/commands/tablecmds.c:17966:25: warning: pointer ‘partBoundConstraint’ may be used after ‘list_concat’ [-Wuse-after-free]
17966 | get_proposed_default_constraint(partBoundConstraint);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../../home/andres/src/postgresql/src/backend/commands/tablecmds.c:17919:26: note: call to ‘list_concat’ here
17919 | partConstraint = list_concat(partBoundConstraint,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17920 | RelationGetPartitionQual(rel));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[1233/2331 22 52%] Compiling C object src/backend/postgres_lib.a.p/rewrite_rewriteHandler.c.o
../../../../home/andres/src/postgresql/src/backend/rewrite/rewriteHandler.c: In function ‘rewriteRuleAction’:
../../../../home/andres/src/postgresql/src/backend/rewrite/rewriteHandler.c:550:41: warning: pointer ‘newjointree’ may be used after ‘list_concat’ [-Wuse-after-free]
550 | checkExprHasSubLink((Node *) newjointree);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../../home/andres/src/postgresql/src/backend/rewrite/rewriteHandler.c:542:33: note: call to ‘list_concat’ here
542 | list_concat(newjointree, sub_action->jointree->fromlist);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Both of these bugs seem somewhat older, the latter going back to 19ff959bff0,
in 2005. I'm a bit surprised that we haven't hit them before, via
DEBUG_LIST_MEMORY_USAGE?

When compiling with optimization, another issue is reported:

[508/1814 22 28%] Compiling C object src/backend/postgres_lib.a.p/commands_explain.c.o
../../../../home/andres/src/postgresql/src/backend/commands/explain.c: In function 'ExplainNode':
../../../../home/andres/src/postgresql/src/backend/commands/explain.c:2037:25: warning: pointer 'ancestors' may be used after 'lcons' [-Wuse-after-free]
2037 | show_upper_qual(plan->qual, "Filter", planstate, ancestors, es);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'show_group_keys',
inlined from 'ExplainNode' at ../../../../home/andres/src/postgresql/src/backend/commands/explain.c:2036:4:
../../../../home/andres/src/postgresql/src/backend/commands/explain.c:2564:21: note: call to 'lcons' here
2564 | ancestors = lcons(plan, ancestors);
| ^~~~~~~~~~~~~~~~~~~~~~

which looks like it might be valid - the caller's "ancestors" variable could
now be freed? There do appear to be further instances of the issue, e.g. in
show_agg_keys(), that aren't flagged for some reason.

For something like pg_list.h the malloc(free) attribute is a bit awkward to
use, because one a) needs to list ~30 functions that can free a list and b)
the referenced functions need to be declared.

In my quick hack I just duplicated the relevant part of pg_list.h and added
the appropriate attributes to the copy of the original declaration.

I also added such attributes to bitmapset.h and palloc() et al, but that
didn't find existing bugs. It does find use-after-free instances if I add
some, similarly it does find cases of mismatching palloc with free etc.

The attached prototype is quite rough and will likely fail on anything but a
recent gcc (likely >= 12).

Do others think this would be useful enough to be worth polishing? And do you
agree the warnings above are bugs?

Greetings,

Andres Freund

Attachments:

v1-0001-Add-allocator-attributes.patchtext/x-diff; charset=us-asciiDownload+167-35
#2Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#1)
Re: Detecting use-after-free bugs using gcc's malloc() attribute

On 26.06.23 21:54, Andres Freund wrote:

For something like pg_list.h the malloc(free) attribute is a bit awkward to
use, because one a) needs to list ~30 functions that can free a list and b)
the referenced functions need to be declared.

Hmm. Saying list_concat() "deallocates" a list is mighty confusing
because 1) it doesn't, and 2) it might actually allocate a new list. So
while you get the useful behavior of "you probably didn't mean to use
this variable again after passing it into list_concat()", if some other
tool actually took these allocate/deallocate decorations at face value
and did a memory leak analysis with them, they'd get completely bogus
results.

I also added such attributes to bitmapset.h and palloc() et al, but that
didn't find existing bugs. It does find use-after-free instances if I add
some, similarly it does find cases of mismatching palloc with free etc.

This seems more straightforward. Even if it didn't find any bugs, I'd
imagine it would be useful during development.

Do others think this would be useful enough to be worth polishing? And do you
agree the warnings above are bugs?

I actually just played with this the other day, because I can never
remember termPQExpBuffer() vs. destroyPQExpBuffer(). I couldn't quite
make it work for that, but I found the feature overall useful, so I'd
welcome support for it.

#3Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#2)
Re: Detecting use-after-free bugs using gcc's malloc() attribute

Hi,

On 2023-06-28 10:40:22 +0200, Peter Eisentraut wrote:

On 26.06.23 21:54, Andres Freund wrote:

For something like pg_list.h the malloc(free) attribute is a bit awkward to
use, because one a) needs to list ~30 functions that can free a list and b)
the referenced functions need to be declared.

Hmm. Saying list_concat() "deallocates" a list is mighty confusing because
1) it doesn't, and 2) it might actually allocate a new list.

list_concat() basically behaves like realloc(), except that the "pointer is
still valid" case is much more common. And the way that's modelled in the
annotations is to say a function frees and allocates.

Note that the free attribute references the first element for list_concat(),
not the second.

So while you get the useful behavior of "you probably didn't mean to use
this variable again after passing it into list_concat()", if some other tool
actually took these allocate/deallocate decorations at face value and did a
memory leak analysis with them, they'd get completely bogus results.

How would the annotations possibly lead to a bogus result? I see neither how
it could lead to false negatives nor false positives?

The gcc attributes are explicitly intended to track not just plain memory
allocations, the example in the docs
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
is to add them for fopen() etc. So I don't think it's likely that external
tools will interpret this is a much more stringent way.

I also added such attributes to bitmapset.h and palloc() et al, but that
didn't find existing bugs. It does find use-after-free instances if I add
some, similarly it does find cases of mismatching palloc with free etc.

This seems more straightforward. Even if it didn't find any bugs, I'd
imagine it would be useful during development.

Agreed. Given our testing regimen (valgrind etc), I'd expect to find many such
bugs before long in the tree anyway. But it's much nicer to get that far. And
to find paths that aren't covered by tests.

Do others think this would be useful enough to be worth polishing? And do you
agree the warnings above are bugs?

I actually just played with this the other day, because I can never remember
termPQExpBuffer() vs. destroyPQExpBuffer().

That's a pretty nasty one :(

I couldn't quite make it work for that, but I found the feature overall
useful, so I'd welcome support for it.

Yea, I don't think the attributes can comfortable handle initPQExpBuffer()
style allocation. It's somewhat posible by moving the allocation to an inline
function, and then making the thing that's allocated ->data. But it ends up
pretty messy, particularly because we need ABI stability for pqexpbuffer.h.

But createPQExpBuffer() can be dealt with reasonably.

Doing so points out:

[51/354 42 14%] Compiling C object src/bin/initdb/initdb.p/initdb.c.o
../../../../home/andres/src/postgresql/src/bin/initdb/initdb.c: In function ‘replace_guc_value’:
../../../../home/andres/src/postgresql/src/bin/initdb/initdb.c:566:9: warning: ‘free’ called on pointer returned from a mismatched allocation function [-Wmismatched-dealloc]
566 | free(newline); /* but don't free newline->data */
| ^~~~~~~~~~~~~
../../../../home/andres/src/postgresql/src/bin/initdb/initdb.c:470:31: note: returned from ‘createPQExpBuffer’
470 | PQExpBuffer newline = createPQExpBuffer();
| ^~~~~~~~~~~~~~~~~~~

which is intentional, but ... not pretty, and could very well be a bug in
other cases. If we want to do stuff like that, we'd probably better off
having a dedicated version of destroyPQExpBuffer(). Although here it looks
like the code should just use an on-stack PQExpBuffer.

Greetings,

Andres Freund

#4Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#3)
Re: Detecting use-after-free bugs using gcc's malloc() attribute

On 28.06.23 20:15, Andres Freund wrote:

On 2023-06-28 10:40:22 +0200, Peter Eisentraut wrote:

On 26.06.23 21:54, Andres Freund wrote:

For something like pg_list.h the malloc(free) attribute is a bit awkward to
use, because one a) needs to list ~30 functions that can free a list and b)
the referenced functions need to be declared.

Hmm. Saying list_concat() "deallocates" a list is mighty confusing because
1) it doesn't, and 2) it might actually allocate a new list.

list_concat() basically behaves like realloc(), except that the "pointer is
still valid" case is much more common. And the way that's modelled in the
annotations is to say a function frees and allocates.

Note that the free attribute references the first element for list_concat(),
not the second.

Yeah, I think that would be ok. I was worried about the cases where it
doesn't actually free the first argument, but in all those cases it
passes it as a result, so as far as a caller is concerned, it would
appear as freed and allocated, even if it's really the same.