Missing checks when malloc returns NULL...
Hi all,
While auditing the code, I got surprised that there are a couple of
code paths that do nothing for this error handling:
- pg_regress and isolationtester use malloc extensively, in case of
failure those would just crash crash. I think that it matters for
buildfarm members that are under memory pressure to not do so, so
those should use pg_malloc instead.
- refint.c makes use of malloc to store plans in top memory context.
That's a buggy concept clearly... This code would need to be reworked
more largely than in the patch I attach.
- pg_dlsym for darwin uses malloc, but would crash on failure
- ps_status.c does nothing when it uses malloc().
- sprompt.c uses malloc once, and would crash on failure
- mcxt.c uses that, which is surprising:
@@ -704,7 +704,8 @@ MemoryContextCreate(NodeTag tag, Size size,
{
/* Special case for startup: use good ol' malloc */
node = (MemoryContext) malloc(needed);
- Assert(node != NULL);
+ if (node == NULL)
+ elog(PANIC, "out of memory");
}
I think that a PANIC is cleaner here instead of a simple crash.
So attached is a patch aimed at improving things. Thoughts?
--
Michael
Attachments:
malloc-nulls.patchinvalid/octet-stream; name=malloc-nulls.patchDownload+33-11
Michael Paquier <michael.paquier@gmail.com> writes:
- mcxt.c uses that, which is surprising: @@ -704,7 +704,8 @@ MemoryContextCreate(NodeTag tag, Size size, { /* Special case for startup: use good ol' malloc */ node = (MemoryContext) malloc(needed); - Assert(node != NULL); + if (node == NULL) + elog(PANIC, "out of memory"); } I think that a PANIC is cleaner here instead of a simple crash.
But the elog mechanism assumes that memory contexts are working.
If this ever actually did fire, no good would come of it.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jun 21, 2016 at 10:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
- mcxt.c uses that, which is surprising: @@ -704,7 +704,8 @@ MemoryContextCreate(NodeTag tag, Size size, { /* Special case for startup: use good ol' malloc */ node = (MemoryContext) malloc(needed); - Assert(node != NULL); + if (node == NULL) + elog(PANIC, "out of memory"); } I think that a PANIC is cleaner here instead of a simple crash.But the elog mechanism assumes that memory contexts are working.
If this ever actually did fire, no good would come of it.
OK, there is not much that we can do here then. What about the rest?
Those seem like legit concerns to me.
--
Michael
Attachments:
malloc-nulls-v2.patchinvalid/octet-stream; name=malloc-nulls-v2.patchDownload+31-10
On Wed, Jun 22, 2016 at 10:41 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
OK, there is not much that we can do here then. What about the rest?
Those seem like legit concerns to me.
Registered this one to the next CF as a bug fix:
https://commitfest.postgresql.org/10/653/
It would be better not to crash if we can avoid it.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 06/22/2016 04:41 AM, Michael Paquier wrote:
On Tue, Jun 21, 2016 at 10:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
- mcxt.c uses that, which is surprising: @@ -704,7 +704,8 @@ MemoryContextCreate(NodeTag tag, Size size, { /* Special case for startup: use good ol' malloc */ node = (MemoryContext) malloc(needed); - Assert(node != NULL); + if (node == NULL) + elog(PANIC, "out of memory"); } I think that a PANIC is cleaner here instead of a simple crash.But the elog mechanism assumes that memory contexts are working.
If this ever actually did fire, no good would come of it.make s
OK, there is not much that we can do here then. What about the rest?
Those seem like legit concerns to me.
There's also a realloc() and an strdup() call in refint.c. But looking
at refint.c, I wonder why it's using malloc()/free() in the first place,
rather than e.g. TopMemoryContext. The string construction code with
sprintf() and strlen() looks quite antiqued, too, StringInfo would make
it more readable.
Does refint.c actually have any value anymore? What if we just removed
it altogether? It's good to have some C triggers in contrib, to serve as
examples, and to have some regression test coverage for all that. But
ISTM that the 'autoinc' module covers the trigger API just as well.
The code in ps_status() runs before the elog machinery has been
initialized, so you get a rather unhelpful error:
error occurred at ps_status.c:167 before error message processing is available
I guess that's still better than outright crashing, though. There are
also a few strdup() calls there that can run out of memory..
Not many of the callers of simple_prompt() check for a NULL result, so
in all probability, returning NULL from there will just crash in the
caller. There's one existing "return NULL" in there already, so this
isn't a new problem, but can we find a better way?
There are more malloc(), realloc() and strdup() calls in isolationtester
and pg_regress, that we ought to change too while we're at it.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Aug 18, 2016 at 6:12 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 06/22/2016 04:41 AM, Michael Paquier wrote:
make s
OK, there is not much that we can do here then. What about the rest?
Those seem like legit concerns to me.There's also a realloc() and an strdup() call in refint.c. But looking at
refint.c, I wonder why it's using malloc()/free() in the first place, rather
than e.g. TopMemoryContext. The string construction code with sprintf() and
strlen() looks quite antiqued, too, StringInfo would make it more readable.Does refint.c actually have any value anymore? What if we just removed it
altogether? It's good to have some C triggers in contrib, to serve as
examples, and to have some regression test coverage for all that. But ISTM
that the 'autoinc' module covers the trigger API just as well.
I'd be in favor for nuking it instead of keeping this code as it
overlaps with autoinc, I did not notice that. Even if that's an
example, it is actually showing some bad code patters, which kills its
own purpose.
The code in ps_status() runs before the elog machinery has been initialized,
so you get a rather unhelpful error:error occurred at ps_status.c:167 before error message processing is
availableI guess that's still better than outright crashing, though. There are also a
few strdup() calls there that can run out of memory..
Right. Another possibility is to directly call write_stderr to be sure
that the user gets the right message, then do exit(1). Thinking more
about it, as save_ps_display_args is called really at the beginning
this is perhaps what makes the most sense, so I changed the patch this
way.
Not many of the callers of simple_prompt() check for a NULL result, so in
all probability, returning NULL from there will just crash in the caller.
There's one existing "return NULL" in there already, so this isn't a new
problem, but can we find a better way?
I got inspired by the return value in the case of WIN32. Letting
sprompt.c in charge of printing an error does not seem like a good
idea to me, and there are already callers of simple_prompt() that
check for NULL and report an error appropriately, like pg_backup_db.c.
So I think that we had better adapt all the existing calls of
simple_prompt checking for NULL and make things more consistent by
having the callers report errors. Hence I updated the patch with those
changes.
Another possibility would be to keep a static buffer that has a fixed
size, basically 4096 as this is the maximum expected by psql, but
that's a waste of bytes in all other cases: one caller uses 4096, two
128 and the rest use 100 or less.
By the way, while looking at that, I also noticed that in exec_command
of psql's command.c we don't check for gets_fromFile that could return
NULL.
There are more malloc(), realloc() and strdup() calls in isolationtester and
pg_regress, that we ought to change too while we're at it.
Right. I added some handling for those callers as well with the pg_ equivalents.
--
Michael
Attachments:
malloc-nulls-v3.patchtext/x-patch; charset=US-ASCII; name=malloc-nulls-v3.patchDownload+199-50
On Sat, Aug 27, 2016 at 10:24 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
./src/backend/postmaster/postmaster.c: userDoption =
strdup(optarg);
[...]
./src/backend/bootstrap/bootstrap.c: userDoption =
strdup(optarg);
[...]
./src/backend/tcop/postgres.c: userDoption = strdup(optarg);
We cannot use pstrdup here because memory contexts are not set up
here, still it would be better than crashing, but I am not sure if
that's worth complicating this code.. Other opinions are welcome.
Those are not changed. We could have some NULL-checks but I am not
sure that's worth the complication here.
./contrib/vacuumlo/vacuumlo.c: param.pg_user = strdup(optarg);
[...]
./contrib/pg_standby/pg_standby.c: triggerPath = strdup(optarg);
[...]
./src/bin/pg_archivecleanup/pg_archivecleanup.c:
additional_ext = strdup(optarg); /* Extension to remove
Right we can do something here with pstrdup().
Those are updated with pg_strdup().
./contrib/spi/refint.c: plan->splan = (SPIPlanPtr *)
malloc(sizeof(SPIPlanPtr));
Regarding refint.c, you can see upthread. Instead of reworking the
code it would be better to just drop it from the tree.
I'd rather see this nuked out of the surface of the code tree.
./src/backend/utils/adt/pg_locale.c: grouping = strdup(extlconv->grouping);
Here that would be a failure with an elog() as this is getting used by
things like NUM_TOCHAR_finish and PGLC_localeconv.
Done.
./src/pl/tcl/pltcl.c: prodesc->user_proname =
strdup(NameStr(procStruct->proname));
./src/pl/tcl/pltcl.c: prodesc->internal_proname =
strdup(internal_proname);
./src/pl/tcl/pltcl.c- if (prodesc->user_proname == NULL ||
prodesc->internal_proname == NULL)
./src/pl/tcl/pltcl.c- ereport(ERROR,
./src/pl/tcl/pltcl.c- (errcode(ERRCODE_OUT_OF_MEMORY),
./src/pl/tcl/pltcl.c- errmsg("out of memory")));
Ah, yes. Here we need to do a free(prodesc) first.
Done.
./src/common/exec.c: putenv(strdup(env_path));
set_pglocale_pgservice() is used at the beginning of each process run,
meaning that a failure here would be printf(stderr), followed by
exit() for frontend, even ECPG as this compiles with -DFRONTEND.
Backend can use elog(ERROR) btw.
Done.
Regarding the handling of strdup in libpq the code is careful in its
handling after a review, so we had better do nothing.
After that, there are two calls to realloc and one call to malloc that
deserve some attention, though those happen in pgc.l and I am not
exactly sure how to handle them.
As Alexander's email
(/messages/by-id/20160826153358.GA29981@e733)
has broken this thread, I am attaching to this thread the analysis
report that has been generated by Alexander previously. It was
referenced in only an URL.
Attached is an updated patch addressing those extra points.
--
Michael
Hello, Michael
I don't know how you did it, but this email has visibly broken the
original thread. Did you change the topic name?
I'm very sorry for this. I had no local copy of this thread. So I wrote a
new email with the same subject hoping it will be OK. Apparently right
In-Reply-To header is also required.
if (prodesc->user_proname == NULL || prodesc->internal_proname == NULL) + { + free(prodesc);
I think that prodesc->user_proname and prodesc->internal_proname should
also be freed if they are not NULL's.
By the way maybe someone knows other procedures besides malloc, realloc
and strdup that require special attention?
I recalled that there is also calloc(). I've found four places that use
calloc() and look suspicious to me (see attachment). What do you think -
are these bugs or not?
--
Best regards,
Aleksander Alekseev
Attachments:
calloc.txttext/plain; charset=us-asciiDownload
Hello, Michael
I don't know how you did it, but this email has visibly broken the
original thread. Did you change the topic name?I'm very sorry for this. I had no local copy of this thread. So I wrote a
new email with the same subject hoping it will be OK. Apparently right
In-Reply-To header is also required.if (prodesc->user_proname == NULL || prodesc->internal_proname == NULL) + { + free(prodesc);I think that prodesc->user_proname and prodesc->internal_proname should
also be freed if they are not NULL's.By the way maybe someone knows other procedures besides malloc, realloc
and strdup that require special attention?I recalled that there is also calloc(). I've found four places that use
calloc() and look suspicious to me (see attachment). What do you think -
are these bugs or not?
I've just realized that there is also malloc-compatible ShmemAlloc().
Apparently it's return value sometimes is not properly checked too. See
attachment.
--
Best regards,
Aleksander Alekseev
Attachments:
shmem.txttext/plain; charset=us-asciiDownload
Aleksander Alekseev <a.alekseev@postgrespro.ru> writes:
if (prodesc->user_proname == NULL || prodesc->internal_proname == NULL) + { + free(prodesc);
I think that prodesc->user_proname and prodesc->internal_proname should
also be freed if they are not NULL's.
Hmm, this is kind of putting lipstick on a pig, isn't it? That code
is still prone to leakage further down, because it calls stuff like
SearchSysCache which is entirely capable of throwing elog(ERROR).
If we're going to touch compile_pltcl_function at all, I'd vote for
(1) changing these malloc calls to MemoryContextAlloc(TopMemoryContext,...
(2) putting the cleanup into a PG_CATCH block, and removing all the
retail free() calls that are there now.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Aug 29, 2016 at 11:13 PM, Aleksander Alekseev
<a.alekseev@postgrespro.ru> wrote:
Hello, Michael
I don't know how you did it, but this email has visibly broken the
original thread. Did you change the topic name?I'm very sorry for this. I had no local copy of this thread. So I wrote a
new email with the same subject hoping it will be OK. Apparently right
In-Reply-To header is also required.if (prodesc->user_proname == NULL || prodesc->internal_proname == NULL) + { + free(prodesc);I think that prodesc->user_proname and prodesc->internal_proname should
also be freed if they are not NULL's.By the way maybe someone knows other procedures besides malloc, realloc
and strdup that require special attention?I recalled that there is also calloc(). I've found four places that use
calloc() and look suspicious to me (see attachment). What do you think -
are these bugs or not?
./src/backend/storage/buffer/localbuf.c: LocalBufferBlockPointers =
(Block *) calloc(nbufs, sizeof(Block));
./src/interfaces/libpq/fe-print.c- fprintf(stderr,
libpq_gettext("out of memory\n"));
Here it does not matter the process is taken down with FATAL or
abort() immediately.
./src/backend/bootstrap/bootstrap.c: app = Typ =
ALLOC(struct typmap *, i + 1);
But here it does actually matter.
I've just realized that there is also malloc-compatible ShmemAlloc().
Apparently it's return value sometimes is not properly checked too. See
attachment.
./src/backend/storage/lmgr/proc.c: pgxacts = (PGXACT *)
ShmemAlloc(TotalProcs * sizeof(PGXACT));
./src/backend/storage/lmgr/proc.c: ProcStructLock = (slock_t *)
ShmemAlloc(sizeof(slock_t));
./src/backend/storage/lmgr/lwlock.c: ptr = (char *)
ShmemAlloc(spaceLocks);
./src/backend/storage/ipc/shmem.c: ShmemAlloc(sizeof(*ShmemVariableCache));
./src/backend/access/transam/slru.c: shared->buffer_locks =
(LWLockPadded *) ShmemAlloc(sizeof(LWLockPadded) * nslots);
./src/backend/postmaster/postmaster.c: ShmemBackendArray = (Backend
*) ShmemAlloc(size);
The funny part here is that ProcGlobal->allProcs is actually handled,
but not the two others. Well yes, you are right, we really need to
fail on FATAL for all of them if ShmemAlloc returns NULL as they
involve the shmem initialization at postmaster startup.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Aug 29, 2016 at 11:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Aleksander Alekseev <a.alekseev@postgrespro.ru> writes:
if (prodesc->user_proname == NULL || prodesc->internal_proname == NULL) + { + free(prodesc);I think that prodesc->user_proname and prodesc->internal_proname should
also be freed if they are not NULL's.Hmm, this is kind of putting lipstick on a pig, isn't it? That code
is still prone to leakage further down, because it calls stuff like
SearchSysCache which is entirely capable of throwing elog(ERROR).If we're going to touch compile_pltcl_function at all, I'd vote for
(1) changing these malloc calls to MemoryContextAlloc(TopMemoryContext,...
(2) putting the cleanup into a PG_CATCH block, and removing all the
retail free() calls that are there now.
We've done similar handling lately with for example 8c75ad43 for
plpython, and this has finished by using TopMemoryContext, so that's
the way to do. By the way plperl needs the same cleanup, and by
looking at the archives guess who did exactly that with a set of
patches not long ago:
/messages/by-id/CAB7nPqRxvq+Q66UFzD9wa5UAftYN4WAUADbjXKFrync96kf-VQ@mail.gmail.com
But I did not get much feedback about those patches :)
So for now I have removed this part from the patch of this thread.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Aug 30, 2016 at 2:57 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
The funny part here is that ProcGlobal->allProcs is actually handled,
but not the two others. Well yes, you are right, we really need to
fail on FATAL for all of them if ShmemAlloc returns NULL as they
involve the shmem initialization at postmaster startup.
And with an actual patch things are better.
--
Michael
Attachments:
malloc-nulls-v5.patchapplication/x-patch; name=malloc-nulls-v5.patchDownload+279-59
The funny part here is that ProcGlobal->allProcs is actually handled,
but not the two others. Well yes, you are right, we really need to
fail on FATAL for all of them if ShmemAlloc returns NULL as they
involve the shmem initialization at postmaster startup.And with an actual patch things are better.
Currently I can't think of any further improvements. I even would dare
to say that patch is Ready for Committer.
--
Best regards,
Aleksander Alekseev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Aug 30, 2016 at 5:08 PM, Aleksander Alekseev
<a.alekseev@postgrespro.ru> wrote:
The funny part here is that ProcGlobal->allProcs is actually handled,
but not the two others. Well yes, you are right, we really need to
fail on FATAL for all of them if ShmemAlloc returns NULL as they
involve the shmem initialization at postmaster startup.And with an actual patch things are better.
Currently I can't think of any further improvements. I even would dare
to say that patch is Ready for Committer.
Thanks for the fruitful input by the way! You spotted many things.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
And with an actual patch things are better.
Currently I can't think of any further improvements. I even would dare
to say that patch is Ready for Committer.Thanks for the fruitful input by the way! You spotted many things.
Thank _you_ for paying attention for such issues in PostgreSQL code!
--
Best regards,
Aleksander Alekseev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Michael Paquier <michael.paquier@gmail.com> writes:
I've just realized that there is also malloc-compatible ShmemAlloc().
Apparently it's return value sometimes is not properly checked too. See
attachment.
The funny part here is that ProcGlobal->allProcs is actually handled,
but not the two others. Well yes, you are right, we really need to
fail on FATAL for all of them if ShmemAlloc returns NULL as they
involve the shmem initialization at postmaster startup.
A quick scan says that most callers are failing to check at all,
several more just have duplicate check-and-immediately-elog code,
and only one has got anything useful to do besides throwing error.
I think what we ought to do is make ShmemAlloc act like palloc
(ie throw error not return NULL), and remove the duplicated error
checks. For the one caller that that would be bad for, we could
invent something like ShmemAllocNoError, or ShmemAllocExtended with
a no_error flag, or whatever other new API suits your fancy. But
as-is, it's just inviting more errors-of-omission like the large
number that already exist. I think people are conditioned by palloc
to expect such functions to throw error.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Aug 30, 2016 at 10:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think what we ought to do is make ShmemAlloc act like palloc
(ie throw error not return NULL), and remove the duplicated error
checks. For the one caller that that would be bad for, we could
invent something like ShmemAllocNoError, or ShmemAllocExtended with
a no_error flag, or whatever other new API suits your fancy. But
as-is, it's just inviting more errors-of-omission like the large
number that already exist. I think people are conditioned by palloc
to expect such functions to throw error.
The only reason why I did not propose that for ShmemAlloc is because
of extensions potentially using this routine and having some special
handling when it returns NULL. And changing it to behave like palloc
would break such extensions.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I think what we ought to do is make ShmemAlloc act like palloc
(ie throw error not return NULL), and remove the duplicated error
checks. For the one caller that that would be bad for, we could
invent something like ShmemAllocNoError, or ShmemAllocExtended with
a no_error flag, or whatever other new API suits your fancy. But
as-is, it's just inviting more errors-of-omission like the large
number that already exist. I think people are conditioned by palloc
to expect such functions to throw error.The only reason why I did not propose that for ShmemAlloc is because
of extensions potentially using this routine and having some special
handling when it returns NULL. And changing it to behave like palloc
would break such extensions.
I suggest to keep ShmemAlloc as is for backward compatibility and
introduce a new procedure ShmemAllocSafe. Also we could add a scary
comment (and also a warning log message?) that ShmemAlloc is deprecated
and possibly will be removed in later releases.
--
Best regards,
Aleksander Alekseev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Michael Paquier <michael.paquier@gmail.com> writes:
On Tue, Aug 30, 2016 at 10:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think what we ought to do is make ShmemAlloc act like palloc
(ie throw error not return NULL), and remove the duplicated error
checks.
The only reason why I did not propose that for ShmemAlloc is because
of extensions potentially using this routine and having some special
handling when it returns NULL. And changing it to behave like palloc
would break such extensions.
The evidence from the callers in core suggests that this change
would be much more likely to fix extensions than break them,
ie it's more likely that they are missing error checks than that
they have something useful to do if the alloc fails.
An extension that actually does need that could do something like
#if CATALOG_VERSION_NO < whatever-v10-is
#define ShmemAllocNoError(x) ShmemAlloc(x)
#endif
...
ptr = ShmemAllocNoError(size);
if (ptr == NULL) // same as before from here on
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers