strncpy is not a safe version of strcpy
Hi All,
As a bit of a background task, over the past few days I've been analysing
the uses of strncpy in the code just to try and validate if it is the right
function to be using. I've already seen quite a few places where their
usage is wrongly assumed.
As many of you will know, and maybe some of you have forgotten, strncpy
is not a safe version of strcpy. It is also quite an inefficient way to
copy a string to another buffer, as strncpy will zero out any space that
remains in the buffer. If there is no space left after the copy
then the buffer won't end with a 0.
It is likely far better explained here -->
http://www.courtesan.com/todd/papers/strlcpy.html
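A minimal sketch of the two behaviours described above, using only the standard C library. The helper name strncpy_terminates is made up for this illustration and is not part of PostgreSQL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst with strncpy and report whether dst ended up
 * NUL-terminated. dst_size is the full size of the destination buffer.
 * (Hypothetical helper, for illustration only.)
 */
bool
strncpy_terminates(char *dst, const char *src, size_t dst_size)
{
    /* Pre-fill with a non-zero byte so stale contents would be visible. */
    memset(dst, 'X', dst_size);

    strncpy(dst, src, dst_size);

    /* A NUL exists inside the buffer only if strncpy wrote one. */
    return memchr(dst, '\0', dst_size) != NULL;
}
```

With an 8-byte buffer, copying "abc" leaves bytes 3 through 7 all zero, while copying a source of 8 or more characters fills the buffer completely and leaves no terminator at all.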
For example, the following 2 lines in jsonfuncs.c:
memset(name, 0, NAMEDATALEN);
strncpy(name, fname, NAMEDATALEN);
The memset here is redundant, as strncpy will zero out the rest of the buffer.
This example is not dangerous, but it does highlight that there's code
that's made the final cut which made this wrong assumption about strncpy.
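One way to convince yourself of the redundancy is to compare the two patterns byte for byte. This is only a sketch: DEMO_NAMEDATALEN is a stand-in for NAMEDATALEN with an assumed value of 64, and the function name is invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NAMEDATALEN 64     /* assumed stand-in for NAMEDATALEN */

/*
 * Returns true if memset + strncpy produces exactly the same bytes as
 * strncpy alone, starting from buffers holding different garbage.
 * (Hypothetical helper, for illustration only.)
 */
bool
memset_is_redundant(const char *src)
{
    char        with_memset[DEMO_NAMEDATALEN];
    char        without_memset[DEMO_NAMEDATALEN];

    /* Start the two buffers from different non-zero contents. */
    memset(with_memset, 0xAA, sizeof(with_memset));
    memset(without_memset, 0x55, sizeof(without_memset));

    /* Pattern quoted above: clear first, then copy. */
    memset(with_memset, 0, DEMO_NAMEDATALEN);
    strncpy(with_memset, src, DEMO_NAMEDATALEN);

    /* Same copy without the memset. */
    strncpy(without_memset, src, DEMO_NAMEDATALEN);

    return memcmp(with_memset, without_memset, DEMO_NAMEDATALEN) == 0;
}
```

The comparison holds for short sources (strncpy pads the tail) and for overlong ones (strncpy overwrites every byte), so the preceding memset changes nothing in either case.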
I was not going to bring this to light until I had done some more analysis,
but there was just a commit which added a usage of strncpy that really
looks like it should be a strlcpy.
I'll continue with my analysis, but perhaps posting this early will bring
something to light which I've not yet realised.
Regards
David Rowley
On 15 November 2013, 0:07, David Rowley wrote:
Hi All,
As a bit of a background task, over the past few days I've been analysing
the uses of strncpy in the code just to try and validate if it is the right
function to be using. I've already seen quite a few places where their
usage is wrongly assumed.
As many of you will know and maybe some of you have forgotten that strncpy
is not a safe version of strcpy. It is also quite an inefficient way to
copy a string to another buffer as strncpy will 0 out any space that
happens to remain in the buffer. If there is no space left after the copy
then the buffer won't end with a 0.
It is likely far better explained here -->
http://www.courtesan.com/todd/papers/strlcpy.html
For example, the following 2 lines in jsonfuncs.c
memset(name, 0, NAMEDATALEN);
strncpy(name, fname, NAMEDATALEN);
Be careful with 'Name' data type - it's not just a simple string buffer.
AFAIK it needs to work with hashing etc. so the zeroing is actually needed
here to make sure two values produce the same result. At least that's how
I understand the code after a quick check - for example this is from the
same jsonfuncs.c you mentioned:
memset(fname, 0, NAMEDATALEN);
strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);
hashentry = hash_search(json_hash, fname, HASH_FIND, NULL);
So the zeroing is on purpose, although if strncpy does that then the
memset is probably superfluous. Either people do that out of habit /
copy'n'paste, or maybe there are supported platforms where strncpy does not
behave like this for some reason.
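To illustrate why trailing bytes can matter for a fixed-size key, here is a sketch that assumes a hash function consuming every byte of the key. Whether the hash table in jsonfuncs.c actually hashes the whole buffer is not established in this thread; the FNV-1a hash, DEMO_KEYSIZE and both function names are stand-ins chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_KEYSIZE 64         /* assumed fixed key width */

/* FNV-1a over ALL bytes of the key, including anything after the NUL. */
uint32_t
hash_fixed_key(const char *key)
{
    uint32_t    h = 2166136261u;

    for (size_t i = 0; i < DEMO_KEYSIZE; i++)
    {
        h ^= (unsigned char) key[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Fill key with the byte `fill` (simulating stale memory), then copy name.
 * With pad != 0 the copy uses strncpy, which zero-fills the tail;
 * otherwise only the string and its terminator are written.
 */
void
make_key(char *key, const char *name, int fill, int pad)
{
    memset(key, fill, DEMO_KEYSIZE);
    if (pad)
        strncpy(key, name, DEMO_KEYSIZE);
    else
    {
        assert(strlen(name) < DEMO_KEYSIZE);
        memcpy(key, name, strlen(name) + 1);
    }
}
```

Two keys built with pad set are identical in every byte and therefore hash identically. Two keys built without padding compare equal with strcmp yet differ under memcmp, so a whole-buffer hash would in general place them in different buckets.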
I seriously doubt this inefficiency is going to be measurable in the real
world. If the result was a buffer-overflow bug, that'd be a different
story, but maybe we could check the ~120 calls to strncpy in the whole
code base and replace them with strlcpy where appropriate.
That being said, thanks for looking into things like this.
Tomas
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Nov 15, 2013 at 12:33 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
It is likely far better explained here -->
http://www.courtesan.com/todd/papers/strlcpy.html
For example, the following 2 lines in jsonfuncs.c
memset(name, 0, NAMEDATALEN);
strncpy(name, fname, NAMEDATALEN);
Be careful with 'Name' data type - it's not just a simple string buffer.
AFAIK it needs to work with hashing etc. so the zeroing is actually needed
here to make sure two values produce the same result. At least that's how
I understand the code after a quick check - for example this is from the
same jsonfuncs.c you mentioned:
memset(fname, 0, NAMEDATALEN);
strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);
hashentry = hash_search(json_hash, fname, HASH_FIND, NULL);
So the zeroing is on purpose, although if strncpy does that then the
memset is probably superfluous. Either people do that because of habit /
copy'n'paste, or maybe there are supported platforms when strncpy does not
behave like this for some reason.
I had not thought of the possibility that some platforms don't properly
implement strncpy(). A quick check of http://man.he.net/man3/strncpy seems
to indicate that this behaviour is part of the C89 standard. So does this
mean we can assume that all supported platforms always 0 out the remaining
buffer?
I seriously doubt this inefficiency is going to be measurable in real
world. If the result was a buffer-overflow bug, that'd be a different
story, but maybe we could check the ~120 calls to strncpy in the whole
code base and replace it with strlcpy where appropriate.
The example was more a demonstration of a wrong assumption than of
wasted cycles, though the wasted cycles were on my mind a bit too. I was
more focused on trying to draw a bit of attention to commit
061b88c732952c59741374806e1e41c1ec845d50 which uses strncpy and does not
properly set the last byte to 0 afterwards. I think this case could just be
replaced with strlcpy which does all this hard work for us.
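For readers without strlcpy in their C library, the semantics being relied on can be sketched as below. demo_strlcpy is a local illustration written for this note, not the implementation PostgreSQL ships.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most size - 1 bytes of src into dst and always NUL-terminate
 * (when size > 0). Returns strlen(src), so a return value >= size tells
 * the caller that truncation happened. Unlike strncpy, nothing beyond
 * the terminator is written.
 * (Illustrative re-implementation of strlcpy semantics.)
 */
size_t
demo_strlcpy(char *dst, const char *src, size_t size)
{
    size_t      srclen = strlen(src);

    if (size > 0)
    {
        size_t      n = (srclen < size - 1) ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

Copying a 12-character string into an 8-byte buffer yields the first 7 characters plus a terminator and returns 12, which the caller may compare against the buffer size or simply ignore.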
Regards
David Rowley
That being said, thanks for looking into things like this.
Tomas
On 15 November 2013, 1:00, David Rowley wrote:
On Fri, Nov 15, 2013 at 12:33 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
It is likely far better explained here -->
http://www.courtesan.com/todd/papers/strlcpy.html
For example, the following 2 lines in jsonfuncs.c
memset(name, 0, NAMEDATALEN);
strncpy(name, fname, NAMEDATALEN);
Be careful with 'Name' data type - it's not just a simple string buffer.
AFAIK it needs to work with hashing etc. so the zeroing is actually needed
here to make sure two values produce the same result. At least that's how
I understand the code after a quick check - for example this is from the
same jsonfuncs.c you mentioned:
memset(fname, 0, NAMEDATALEN);
strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);
hashentry = hash_search(json_hash, fname, HASH_FIND, NULL);
So the zeroing is on purpose, although if strncpy does that then the
memset is probably superfluous. Either people do that because of habit /
copy'n'paste, or maybe there are supported platforms when strncpy does not
behave like this for some reason.
I had not thought of the fact the some platforms don't properly implement
strncpy(). On quick check http://man.he.net/man3/strncpy seems to indicate
that this behaviour is part of the C89 standard. So does this mean we can
always assume that all supported platforms always 0 out the remaining
buffer?
I don't know of any such platform - I was merely speculating about why
people might write such code.
I seriously doubt this inefficiency is going to be measurable in real
world. If the result was a buffer-overflow bug, that'd be a different
story, but maybe we could check the ~120 calls to strncpy in the whole
code base and replace it with strlcpy where appropriate.
The example was more of a demonstration of wrong assumption rather than
wasted cycles. Though the wasted cycles was on my mind a bit too. I was
Yeah. To be fair, the number of occurrences in the code base is not a
particularly exact measure of the impact - some of those calls might be
in code paths that are quite busy.
more focused on trying to draw a bit of attention to commit
061b88c732952c59741374806e1e41c1ec845d50 which uses strncpy and does not
properly set the last byte to 0 afterwards. I think this case could just be
replaced with strlcpy which does all this hard work for us.
Hmm, you mean this piece of code?
strncpy(saved_argv0, argv[0], MAXPGPATH);
IMHO you're right that's probably broken, unless there's some checking
happening before the call.
Tomas
* Tomas Vondra (tv@fuzzy.cz) wrote:
On 15 November 2013, 1:00, David Rowley wrote:
more focused on trying to draw a bit of attention to commit
061b88c732952c59741374806e1e41c1ec845d50 which uses strncpy and does not
properly set the last byte to 0 afterwards. I think this case could just be
replaced with strlcpy which does all this hard work for us.
Hmm, you mean this piece of code?
strncpy(saved_argv0, argv[0], MAXPGPATH);
IMHO you're right that's probably broken, unless there's some checking
happening before the call.
Agreed, that looks like a place we should be using strlcpy() instead.
Robert, what do you think?
Thanks,
Stephen
Tomas Vondra <tv@fuzzy.cz> wrote:
On 15 November 2013, 1:00, David Rowley wrote:
more focused on trying to draw a bit of attention to commit
061b88c732952c59741374806e1e41c1ec845d50 which uses strncpy and
does not properly set the last byte to 0 afterwards. I think
this case could just be replaced with strlcpy which does all
this hard work for us.
Hmm, you mean this piece of code?
strncpy(saved_argv0, argv[0], MAXPGPATH);
IMHO you're right that's probably broken, unless there's some
checking happening before the call.
I agree, and there is no such checking. Fix pushed.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2013-11-15 09:24:59 -0500, Stephen Frost wrote:
* Tomas Vondra (tv@fuzzy.cz) wrote:
On 15 November 2013, 1:00, David Rowley wrote:
more focused on trying to draw a bit of attention to commit
061b88c732952c59741374806e1e41c1ec845d50 which uses strncpy and does not
properly set the last byte to 0 afterwards. I think this case could just be
replaced with strlcpy which does all this hard work for us.
Hmm, you mean this piece of code?
strncpy(saved_argv0, argv[0], MAXPGPATH);
IMHO you're right that's probably broken, unless there's some checking
happening before the call.
Agreed, that looks like a place we should be using strlcpy() instead.
I don't mind fixing it, but I think anything but s/strncpy/strlcpy/ is
over the top. Translating such strings is just a waste of translators'
time.
If you really worry about paths being longer than MAXPGPATH, there are
lots and lots of things to do that are far, far more critical than
this.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-11-15 04:21:50 +0100, Tomas Vondra wrote:
Hmm, you mean this piece of code?
strncpy(saved_argv0, argv[0], MAXPGPATH);
IMHO you're right that's probably broken, unless there's some checking
happening before the call.
FWIW, argv0 is pretty much guaranteed to be shorter than MAXPGPATH since
MAXPGPATH is the longest a path can be, and argv[0] is either the executable's
name (if executed via PATH) or the path to the executable.
Now, you could probably write a program to execve() a binary with argv[0]
being longer, but in that case you can also just put garbage in there.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
* Andres Freund (andres@2ndquadrant.com) wrote:
FWIW, argv0 is pretty much guaranteed to be shorter than MAXPGPATH since
MAXPGPATH is the longest a path can be, and argv[0] is either the executable's
name (if executed via PATH) or the path to the executable.
Err, it's the longest that *we* think the path can be.. That's not the
same as actually being the longest that a path can be, which depends on
the filesystem and OS... It's not hard to get past our 1024 limit:
sfrost@beorn:/really/long/path> echo $PWD | wc -c
1409
Now, you could probably write a program to execve() a binary with argv[0]
being longer, but in that case you can also just put garbage in there.
We shouldn't blow up in that case either, really.
Thanks,
Stephen
On 2013-11-15 09:53:24 -0500, Stephen Frost wrote:
* Andres Freund (andres@2ndquadrant.com) wrote:
FWIW, argv0 is pretty much guaranteed to be shorter than MAXPGPATH since
MAXPGPATH is the longest a path can be, and argv[0] is either the executable's
name (if executed via PATH) or the path to the executable.
Err, it's the longest that *we* think the path can be.. That's not the
same as actually being the longest that a path can be, which depends on
the filesystem and OS... It's not hard to get past our 1024 limit:
Sure, there can be longer paths, but postgres doesn't support them. In a
*myriad* of places. It's just not worth spending code on it.
Just about any of the places that use MAXPGPATH are "vulnerable" or
produce confusing error messages if it's exceeded. And there are about
zero complaints about it.
Now, you could probably write a program to execve() a binary with argv[0]
being longer, but in that case you can also just put garbage in there.
We shouldn't blow up in that case either, really.
Good luck.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-11-15 10:04:12 -0500, Stephen Frost wrote:
* Andres Freund (andres@2ndquadrant.com) wrote:
Sure, there can be longer paths, but postgres doesn't support them. In a
*myriad* of places. It's just not worth spending code on it.
Just about any of the places that use MAXPGPATH are "vulnerable" or
produce confusing error messages if it's exceeded. And there are about
zero complaints about it.
Confusing error messages are one thing, segfaulting is another.
I didn't argue against s/strncpy/strlcpy/. That's clearly a sensible
fix.
I am arguing against introducing additional code and error messages about
it, that need to be translated. And starting to do so in isolationtester
of all places.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
* Andres Freund (andres@2ndquadrant.com) wrote:
Sure, there can be longer paths, but postgres doesn't support them. In a
*myriad* of places. It's just not worth spending code on it.
Just about any of the places that use MAXPGPATH are "vulnerable" or
produce confusing error messages if it's exceeded. And there are about
zero complaints about it.
Confusing error messages are one thing, segfaulting is another.
Thanks,
Stephen
David Rowley wrote:
On Fri, Nov 15, 2013 at 12:33 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
Be careful with 'Name' data type - it's not just a simple string buffer.
AFAIK it needs to work with hashing etc. so the zeroing is actually needed
here to make sure two values produce the same result. At least that's how
I understand the code after a quick check - for example this is from the
same jsonfuncs.c you mentioned:
memset(fname, 0, NAMEDATALEN);
strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);
hashentry = hash_search(json_hash, fname, HASH_FIND, NULL);
So the zeroing is on purpose, although if strncpy does that then the
memset is probably superfluous.
This code should probably be using namecpy(). Note namecpy() doesn't
memset() after strncpy() and has survived the test of time, which
strongly suggests that the memset is indeed superfluous.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
This code should probably be using namecpy(). Note namecpy()
doesn't memset() after strncpy() and has survived the test of
time, which strongly suggests that the memset is indeed
superfluous.
That argument would be more persuasive if I could find any current
usage of the namecpy() function anywhere in the source code.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Andres Freund <andres@2ndquadrant.com> writes:
I didn't argue against s/strncpy/strlcpy/. That's clearly a sensible
fix.
I am arguing about introducing additional code and error messages about
it, that need to be translated. And starting doing so in isolationtester
of all places.
I agree with Andres on this. Commit
7cb964acb794078ef033cbf2e3a0e7670c8992a9 is the very definition of
overkill, and I don't want to see us starting to plaster the source
code with things like this. Converting strncpy to strlcpy seems
appropriate --- and sufficient.
regards, tom lane
Kevin Grittner wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
This code should probably be using namecpy(). Note namecpy()
doesn't memset() after strncpy() and has survived the test of
time, which strongly suggests that the memset is indeed
superfluous.
That argument would be more persuasive if I could find any current
usage of the namecpy() function anywhere in the source code.
Well, its cousin namestrcpy is used in a lot of places. That one uses a
regular C string as source; namecpy uses a Name as source, so they are
slightly different but the coding is pretty much the same.
There is a difference in using the macro StrNCpy instead of the strncpy
library function directly. ISTM this makes sense because Name is known
to be zero-terminated at NAMEDATALEN, which a random C string is not.
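As an illustration of the difference, a macro in the spirit of StrNCpy might look like the sketch below. This assumes StrNCpy amounts to strncpy followed by forcing the last byte to zero; that is my understanding rather than something quoted in this thread, and the real definition in the PostgreSQL headers may differ. DEMO_STRNCPY and demo_copy_terminated are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/*
 * strncpy, then force a terminator into the last byte of the buffer.
 * The arguments are evaluated once each via local copies.
 * (Sketch only; not copied from the PostgreSQL sources.)
 */
#define DEMO_STRNCPY(dst, src, len) \
    do { \
        char       *dst_ = (dst); \
        size_t      len_ = (len); \
        if (len_ > 0) \
        { \
            strncpy(dst_, (src), len_); \
            dst_[len_ - 1] = '\0'; \
        } \
    } while (0)

/* Function wrapper so the macro can be exercised like a normal call. */
void
demo_copy_terminated(char *dst, const char *src, size_t len)
{
    DEMO_STRNCPY(dst, src, len);
}
```

This keeps the zero padding of strncpy, which a fixed-width Name relies on, while also guaranteeing termination when the source is an arbitrary C string that may be too long.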
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Andres Freund <andres@2ndquadrant.com> writes:
I didn't argue against s/strncpy/strlcpy/. That's clearly a sensible
fix.
I am arguing about introducing additional code and error messages about
it, that need to be translated. And starting doing so in isolationtester
of all places.
I agree with Andres on this. Commit
7cb964acb794078ef033cbf2e3a0e7670c8992a9 is the very definition of
overkill, and I don't want to see us starting to plaster the source
code with things like this. Converting strncpy to strlcpy seems
appropriate --- and sufficient.
Personally, I'd like to see better handling like this, but done in a way
which minimizes impact to code and translators. A function like
namecpy() (which I agree with Kevin about; curious that it's not used...)
which handled the check, errmsg and exit seems reasonable to me, for the
"userland" binaries (and perhaps the postmaster when doing command-line
checking of, eg, -D) that need it.
Still, I'm not offering to go do it, so take my feelings on it with that
in mind. :)
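A rough sketch of what such a helper could look like, so that the check, message and exit live in one place. Both function names and the message wording are invented here; nothing like this is claimed to exist in the tree.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy src into dst only if it fits, terminator included.
 * Returns false and leaves dst untouched when it does not fit.
 * (Hypothetical helper.)
 */
bool
copy_checked(char *dst, const char *src, size_t size)
{
    size_t      srclen = strlen(src);

    if (size == 0 || srclen >= size)
        return false;
    memcpy(dst, src, srclen + 1);
    return true;
}

/*
 * Same check, but report and exit on failure, as a command-line
 * program might do. `what` names the value for the message.
 * (Hypothetical helper; the message text is an assumption.)
 */
void
copy_or_exit(char *dst, const char *src, size_t size, const char *what)
{
    if (!copy_checked(dst, src, size))
    {
        fprintf(stderr, "%s is too long (buffer size %zu)\n", what, size);
        exit(EXIT_FAILURE);
    }
}
```

With a single shared helper there would be one message to translate rather than one per call site, which is the trade-off being argued about above.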
Thanks,
Stephen
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Kevin Grittner wrote:
That argument would be more persuasive if I could find any current
usage of the namecpy() function anywhere in the source code.
Well, its cousin namestrcpy is used in a lot of places. That one uses a
regular C string as source; namecpy uses a Name as source, so they are
slightly different but the coding is pretty much the same.
Fair enough.
There is a difference in using the macro StrNCpy instead of the strncpy
library function directly. ISTM this makes sense because Name is known
to be zero-terminated at NAMEDATALEN, which a random C string is not.
Is the capital T in the second #undef in this pg_locale.c code intended?:
#ifdef WIN32
/*
* This Windows file defines StrNCpy. We don't need it here, so we undefine
* it to keep the compiler quiet, and undefine it again after the file is
* included, so we don't accidentally use theirs.
*/
#undef StrNCpy
#include <shlwapi.h>
#ifdef StrNCpy
#undef STrNCpy
#endif
#endif
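Since preprocessor identifiers are case-sensitive, an #undef with a misspelled name is silently ignored and the original macro stays defined. A small self-contained sketch of that effect, using a made-up macro name instead of the Windows header:

```c
/* DemoMacro stands in for a macro that some included header defines. */
#define DemoMacro(x) (x)

#ifdef DemoMacro
#undef DEmoMacro                /* misspelled on purpose: has no effect */
#endif

/* Returns 1 if DemoMacro is still defined after the block above. */
int
demo_macro_still_defined(void)
{
#ifdef DemoMacro
    return 1;
#else
    return 0;
#endif
}
```

The function returns 1, because undefining a name that was never defined is permitted and does nothing.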
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Nov 16, 2013 at 4:09 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
David Rowley wrote:
On Fri, Nov 15, 2013 at 12:33 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
Be careful with 'Name' data type - it's not just a simple string buffer.
AFAIK it needs to work with hashing etc. so the zeroing is actually needed
here to make sure two values produce the same result. At least that's how
I understand the code after a quick check - for example this is from the
same jsonfuncs.c you mentioned:
memset(fname, 0, NAMEDATALEN);
strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);
hashentry = hash_search(json_hash, fname, HASH_FIND, NULL);
So the zeroing is on purpose, although if strncpy does that then the
memset is probably superfluous.
This code should probably be using namecpy(). Note namecpy() doesn't
memset() after strncpy() and has survived the test of time, which
strongly suggests that the memset is indeed superfluous.
I went on a bit of a strncpy cleanup rampage this morning and ended up
finding quite a few places where strncpy is used wrongly.
I'm not quite sure if I have got them all in this patch, but I think I've
got the obvious ones at least.
For the hash_search in jsonfuncs.c, after thinking about it a bit more...
Can we not just pass the attname without making a copy of it? I see keyPtr
in hash_search is const void * so it shouldn't get modified in there. I
can't quite see the reason for making the copy.
Attached is a patch with various cleanups where I didn't like the look of
the strncpy. I didn't go overboard with this as I know making these sorts of
small changes all over can be a bit scary and I thought maybe it would get
rejected on that basis.
I also cleaned up things like strncpy(dest, src, strlen(src)); which just
seems a bit weird, and I'm failing to get my head around why it was done. I
replaced these with memcpy instead, but they could perhaps be a plain old
strcpy.
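To show why memcpy is a like-for-like replacement in that pattern: when the length passed is strlen(src), strncpy never reaches the source terminator, so it neither pads nor terminates. The sketch below checks this; DEMO_BUFSIZE and the function name are invented for the example, and some compilers warn about exactly this strncpy call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUFSIZE 16         /* arbitrary size for the demonstration */

/*
 * Returns true if strncpy(dst, src, strlen(src)) leaves the destination
 * byte-for-byte identical to memcpy(dst, src, strlen(src)) and writes
 * no terminator. src must be shorter than DEMO_BUFSIZE.
 * (Hypothetical helper, for illustration only.)
 */
bool
strncpy_strlen_equals_memcpy(const char *src)
{
    char        a[DEMO_BUFSIZE];
    char        b[DEMO_BUFSIZE];
    size_t      len = strlen(src);

    assert(len < DEMO_BUFSIZE);

    /* Sentinel bytes make any extra write visible. */
    memset(a, 'X', sizeof(a));
    memset(b, 'X', sizeof(b));

    strncpy(a, src, len);       /* stops before the source NUL */
    memcpy(b, src, len);

    /* Same bytes, and the byte after the copy is still the sentinel. */
    return memcmp(a, b, DEMO_BUFSIZE) == 0 && a[len] == 'X';
}
```

Note that a plain strcpy would additionally write the terminator, so it is only equivalent where the destination byte after the copy is meant to be NUL anyway.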
Regards
David Rowley
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
strncpy_cleanup_v0.1.patch (application/octet-stream, +34 -38)
On Sat, Nov 16, 2013 at 12:53:10PM +1300, David Rowley wrote:
I went on a bit of a strncpy cleanup rampage this morning and ended up
finding quite a few places where strncpy is used wrongly.
I'm not quite sure if I have got them all in this patch, but I think I've
got the obvious ones at least.
For the hash_search in jsonfuncs.c after thinking about it a bit more...
Can we not just pass the attname without making a copy of it? I see keyPtr
in hash_search is const void * so it shouldn't get modified in there. I
can't quite see the reason for making the copy.
+1 for the goal of this patch. Another commit took care of your jsonfuncs.c
concerns, and the patch for CVE-2014-0065 fixed several of the others. Plenty
remain, though.
Attached is a patch with various cleanups where I didn't like the look of
the strncpy. I didn't go overboard with this as I know making this sort of
small changes all over can be a bit scary and I thought maybe it would get
rejected on that basis.
I also cleaned up things like strncpy(dest, src, strlen(src)); which just
seems a bit weird and I'm failing to get my head around why it was done. I
replaced these with memcpy instead, but they could perhaps be a plain old
strcpy.
I suggest preparing one or more patches that focus on the cosmetic-only
changes, such as strncpy() -> memcpy() when strncpy() is guaranteed not to
reach a NUL byte. With that noise out of the way, it will be easier to give
the rest the attention it deserves.
Thanks,
nm
--
Noah Misch
EnterpriseDB http://www.enterprisedb.com