Oops in fe-auth.c
I've been debugging some really weird crashes in libpq on win32, and I
think I've finally found the reason for the heap corruption that shows up
in msvc debug mode.
When run in debug mode, the runtime for msvc will *zero-pad the entire
buffer* in a strncpy() call. This in itself is not bad (just slow), but it
shows a rather bad bug in libpq.
In a bunch of places in fe-auth.c, we do:
strncpy(PQerrormsg, libpq_gettext("out of memory\n"), PQERRORMSG_LENGTH);
Except when calling it, the size of the buffer is 256 bytes. But
PQERRORMSG_LENGTH is 1024.
Naturally, this causes a heap corruption. It doesn't happen in production,
because the string length fits as long as there is no padding.
One way to get around this on win32 is to just use snprintf() instead of
strncpy(), since it doesn't pad. But that's just hiding the underlying
problem, so I think that's a really bad fix.
I assume the comment in the header:
* NOTE: the error message strings returned by this module must not
* exceed INITIAL_EXPBUFFER_SIZE (currently 256 bytes).
refers to this, but it's hard to guarantee that from the code since it's
translated strings.
I see a comment in fe-connect.c that has
* XXX fe-auth.c has not been fixed to support PQExpBuffers,
Given this, I'll go ahead and fix fe-connect to support PQExpBuffers,
unless there are any objections. Also, is this something we shuold
backpatch - or just ignore since we've had no actual reports of it in the
field?
//Magnus
On Mon, Jul 23, 2007 at 03:36:21PM +0200, Magnus Hagander wrote:
I've been debugging some really weird crashes in libpq on win32, and I
think I've finally found the reason for the heap corruption that shows up
in msvc debug mode.When run in debug mode, the runtime for msvc will *zero-pad the entire
buffer* in a strncpy() call. This in itself is not bad (just slow), but it
shows a rather bad bug in libpq.In a bunch of places in fe-auth.c, we do:
strncpy(PQerrormsg, libpq_gettext("out of memory\n"), PQERRORMSG_LENGTH);Except when calling it, the size of the buffer is 256 bytes. But
PQERRORMSG_LENGTH is 1024.Naturally, this causes a heap corruption. It doesn't happen in production,
because the string length fits as long as there is no padding.One way to get around this on win32 is to just use snprintf() instead of
strncpy(), since it doesn't pad. But that's just hiding the underlying
problem, so I think that's a really bad fix.I assume the comment in the header:
* NOTE: the error message strings returned by this module must not
* exceed INITIAL_EXPBUFFER_SIZE (currently 256 bytes).refers to this, but it's hard to guarantee that from the code since it's
translated strings.I see a comment in fe-connect.c that has
* XXX fe-auth.c has not been fixed to support PQExpBuffers,Given this, I'll go ahead and fix fe-connect to support PQExpBuffers,
unless there are any objections. Also, is this something we shuold
backpatch - or just ignore since we've had no actual reports of it in the
field?
Actually coding up a patch for that was just a bunch of simple
search/replace ops. Attached is one that appears to work fine for me.
Actually coding up a patch for that was just a bunch of simple
search/replace ops. Attached is one that appears to work fine for me.
Was there any reason why this wasn't done before, or just nobody had the
time? If there was a reason, please let me know what it was :-) Otherwise
I'll apply this patch to fix it.
(Question about backpatch remains)
//Magnus
Attachments:
libpq.difftext/plain; charset=us-asciiDownload+157-161
Magnus Hagander <magnus@hagander.net> writes:
When run in debug mode, the runtime for msvc will *zero-pad the entire
buffer* in a strncpy() call. This in itself is not bad (just slow), but it
shows a rather bad bug in libpq.
[squint] That is the specified behavior of strncpy on every platform,
not only msvc. If there's a bug here why didn't we notice it long ago?
Given this, I'll go ahead and fix fe-connect to support PQExpBuffers,
unless there are any objections.
I'm not against that, but I question what bug you've really found.
regards, tom lane
On Mon, Jul 23, 2007 at 10:28:57AM -0400, Tom Lane wrote:
Magnus Hagander <magnus@hagander.net> writes:
When run in debug mode, the runtime for msvc will *zero-pad the entire
buffer* in a strncpy() call. This in itself is not bad (just slow), but it
shows a rather bad bug in libpq.[squint] That is the specified behavior of strncpy on every platform,
not only msvc. If there's a bug here why didn't we notice it long ago?
Hmm. Interesting - I see that now if I look at
http://www.opengroup.org/onlinepubs/007908799/xsh/strncpy.html.
That's very interesting - but my debugger very much shows me that the
buffer size is 256 bytes (INITIAL_EXPBUFFER_SIZE), and passes
1024 (PQERRORMSG_LENGTH) as the size of the buffer...
Perhaps we've just never hit one of those codepaths before. Previously, it
was only used for out of memory errors - the gssapi code adds a few places
where it's used in other cases, and this is where it crashed for me.
Given this, I'll go ahead and fix fe-connect to support PQExpBuffers,
unless there are any objections.I'm not against that, but I question what bug you've really found.
I never actually tested if it crashes on mingw, but looking some more at it
it really should - once one of these errors happen.
//Magnus
Magnus Hagander wrote:
On Mon, Jul 23, 2007 at 10:28:57AM -0400, Tom Lane wrote:
Given this, I'll go ahead and fix fe-connect to support PQExpBuffers,
unless there are any objections.I'm not against that, but I question what bug you've really found.
I never actually tested if it crashes on mingw, but looking some more at it
it really should - once one of these errors happen.
Hm. Much easier than that - the code is new in HEAD. 8.2 did
fprintf(stderr). And HEAD still does that in at least one case.
Anyway, I'll go ahead with the patch I wrote since it does Seem Nicer to
actually use the PQexpbuffer code there, and the patch was rather
trivial, but it's certainly not something to backpatch then...
I also found at least one other place in libpq where it still does
fprintf(stderr). That should probably be fixed at the same time, right?
//Magnus
Magnus Hagander <magnus@hagander.net> writes:
Actually coding up a patch for that was just a bunch of simple
search/replace ops. Attached is one that appears to work fine for me.
Was there any reason why this wasn't done before, or just nobody had the
time? If there was a reason, please let me know what it was :-)
AFAIR nobody got round to it because it hadn't seemed important.
(Question about backpatch remains)
I'd vote against backpatching. The appropriate fix for back branches
is probably just to reduce the strncpy and snprintf arguments to
INITIAL_EXPBUFFER_SIZE, ie, make the code do what that header comment
says it should do.
Style point: in the places where you've chosen to pass the whole PGconn,
you should remove any separate arguments that are actually just PGconn
fields; eg for pg_krb5_sendauth it looks like sock and servicename are
now redundant. Otherwise there are risks of programmer confusion, and
maybe even wrong code generation, due to aliasing.
It would be more consistent to pass PGconn around to all of these
functions instead of trying to have them have just partial views of it,
but I dunno if you want to engage in purely cosmetic changes.
regards, tom lane
Magnus Hagander <magnus@hagander.net> writes:
I also found at least one other place in libpq where it still does
fprintf(stderr). That should probably be fixed at the same time, right?
Yeah, we should be using the error message buffer if at all possible.
regards, tom lane
Tom Lane wrote:
Magnus Hagander <magnus@hagander.net> writes:
Actually coding up a patch for that was just a bunch of simple
search/replace ops. Attached is one that appears to work fine for me.Was there any reason why this wasn't done before, or just nobody had the
time? If there was a reason, please let me know what it was :-)AFAIR nobody got round to it because it hadn't seemed important.
Ok. I actually managed to provoke a GSSAPI error that got cut off at 256
characters in testing. Which is kind of amazing in itself, but...
(Question about backpatch remains)
I'd vote against backpatching. The appropriate fix for back branches
is probably just to reduce the strncpy and snprintf arguments to
INITIAL_EXPBUFFER_SIZE, ie, make the code do what that header comment
says it should do.
Right. See other mail as well.
Style point: in the places where you've chosen to pass the whole PGconn,
you should remove any separate arguments that are actually just PGconn
fields; eg for pg_krb5_sendauth it looks like sock and servicename are
now redundant. Otherwise there are risks of programmer confusion, and
maybe even wrong code generation, due to aliasing.It would be more consistent to pass PGconn around to all of these
functions instead of trying to have them have just partial views of it,
but I dunno if you want to engage in purely cosmetic changes.
I'll go ahead and do that now, while I'm whacking the code around.
//Magnus
Magnus Hagander <magnus@hagander.net> writes:
I never actually tested if it crashes on mingw, but looking some more at it
it really should - once one of these errors happen.
Hm. Much easier than that - the code is new in HEAD. 8.2 did
fprintf(stderr). And HEAD still does that in at least one case.
Anyway, I'll go ahead with the patch I wrote since it does Seem Nicer to
actually use the PQexpbuffer code there, and the patch was rather
trivial, but it's certainly not something to backpatch then...
It does look like there is a risk in 8.2 and before, though:
the fe-auth.c code has a lot of snprintf's with PQERRORMSG_LENGTH,
which should all be INITIAL_EXPBUFFER_SIZE according to that header
comment. snprintf typically doesn't write more than it has to,
but if there ever were a message exceeding INITIAL_EXPBUFFER_SIZE
we'd be at risk of a memory clobber. So that should be changed
as far back as it does that. Do you want to take care of it?
I can if you don't want to.
regards, tom lane
Tom Lane wrote:
Magnus Hagander <magnus@hagander.net> writes:
I never actually tested if it crashes on mingw, but looking some more at it
it really should - once one of these errors happen.Hm. Much easier than that - the code is new in HEAD. 8.2 did
fprintf(stderr). And HEAD still does that in at least one case.Anyway, I'll go ahead with the patch I wrote since it does Seem Nicer to
actually use the PQexpbuffer code there, and the patch was rather
trivial, but it's certainly not something to backpatch then...It does look like there is a risk in 8.2 and before, though:
the fe-auth.c code has a lot of snprintf's with PQERRORMSG_LENGTH,
which should all be INITIAL_EXPBUFFER_SIZE according to that header
comment. snprintf typically doesn't write more than it has to,
but if there ever were a message exceeding INITIAL_EXPBUFFER_SIZE
we'd be at risk of a memory clobber. So that should be changed
as far back as it does that. Do you want to take care of it?
I can if you don't want to.
Oh, didn't realize that one.
I can take a look at that as well, once I'm done with this one. Seems
easy enough - I'll leave you to focus on the more difficult stuff :-)
//Magnus