C99 compliance for src/port/snprintf.c
I noticed that src/port/snprintf.c still follows the old (pre-C99)
semantics for what to return from snprintf() after buffer overrun.
This seems bad for a couple of reasons:
* It results in potential performance loss in psnprintf and
appendPQExpBufferVA, since those have to fly blind about how much
to enlarge the buffer before retrying.
* Given that we override the system-supplied *printf if it doesn't
support the 'z' width modifier, it seems highly unlikely that we are
still using any snprintf's with pre-C99 semantics, except when this
code is used. So there's not much point in keeping this behavior
as a test case to ensure compatibility with such libraries.
* When we install snprintf.c, port.h forces it to be used everywhere
that includes c.h, including extensions. It seems quite possible that
there is extension code out there that assumes C99 snprintf behavior.
Such code would work today everywhere except Windows, since that's the
only platform made in this century that requires snprintf.c. Between
the existing Windows porting hazard and our nearby discussion about
using snprintf.c on more platforms, I'd say this is a gotcha waiting
to bite somebody.
Hence, PFA a patch that changes snprintf.c to follow C99 by counting
dropped bytes in the result of snprintf(), including in the case where
the passed count is zero.
(I also dropped the tests for str == NULL when count isn't zero; that's
not permitted by either C99 or SUSv2, so I see no reason for this code
to support it. Also, avoid wasting one byte in the local buffer for
*fprintf.)
I'm almost tempted to think that the reasons above make this a
back-patchable bug fix. Comments?
regards, tom lane
Attachments:
c99-compliant-snprintf-result-1.patchtext/x-diff; charset=us-ascii; name=c99-compliant-snprintf-result-1.patchDownload+92-88
On Tue, Aug 14, 2018 at 7:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm almost tempted to think that the reasons above make this a
back-patchable bug fix. Comments?
No objections to changing the behavior. Have you checked whether
there are any noticeable performance consequences?
Back-patching seems a bit aggressive to me considering that the danger
is hypothetical. I'd want to have some tangible evidence that
back-patching was going help somebody. For all we know somebody's got
an extension which they only use on Windows that happens to be relying
on the current behavior, although more likely still (IMHO) is that it
there is little or no code relying on either behavior.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Aug 14, 2018 at 7:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm almost tempted to think that the reasons above make this a
back-patchable bug fix. Comments?
No objections to changing the behavior. Have you checked whether
there are any noticeable performance consequences?
Hard to believe there could be any performance loss. The normal (non
buffer overflowing) code path gets two additional instructions, storing
zero into .nchars and then adding it to the result. There would be more
work added to buffer-overflow cases, but we'd surely more than win that
back by avoiding repeated repalloc-and-try-again cycles.
Back-patching seems a bit aggressive to me considering that the danger
is hypothetical. I'd want to have some tangible evidence that
back-patching was going help somebody. For all we know somebody's got
an extension which they only use on Windows that happens to be relying
on the current behavior, although more likely still (IMHO) is that it
there is little or no code relying on either behavior.
Yeah, it's theoretically possible that there's somebody out there who's
depending on the pre-C99 behavior and never tested their code anywhere
but Windows. But come on, how likely is that really, compared to the
much more plausible risks I already cited? It's not even that easy
to imagine how someone would construct code that had such a dependency
while not being outright buggy in the face of buffer overrun.
BTW, independently of whether to back-patch, it strikes me that what
we ought to do in HEAD (after applying this) is to just assume we have
C99-compliant behavior, and rip out the baroque logic in psnprintf
and appendPQExpBufferVA that tries to deal with pre-C99 snprintf.
I don't expect that'd save any really meaningful number of cycles,
but at least it'd buy back the two added instructions mentioned above.
I suppose we could put in a configure check that verifies whether
the system snprintf returns the right value for overrun cases, though
it's hard to believe there are any platforms that pass the 'z' check
and would fail this one.
regards, tom lane
Hi,
On 2018-08-15 11:41:46 -0400, Tom Lane wrote:
BTW, independently of whether to back-patch, it strikes me that what
we ought to do in HEAD (after applying this) is to just assume we have
C99-compliant behavior, and rip out the baroque logic in psnprintf
and appendPQExpBufferVA that tries to deal with pre-C99 snprintf.
I don't expect that'd save any really meaningful number of cycles,
but at least it'd buy back the two added instructions mentioned above.
I suppose we could put in a configure check that verifies whether
the system snprintf returns the right value for overrun cases, though
it's hard to believe there are any platforms that pass the 'z' check
and would fail this one.
We could just mandate C99, more generally.
/me goes and hides in a bush.
On 2018-Aug-15, Robert Haas wrote:
On Tue, Aug 14, 2018 at 7:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm almost tempted to think that the reasons above make this a
back-patchable bug fix. Comments?No objections to changing the behavior. Have you checked whether
there are any noticeable performance consequences?Back-patching seems a bit aggressive to me considering that the danger
is hypothetical.
That was my first thought too, and my preferred path would be to make
this master-only and only consider a backpatch later if we find some
practical reason to do so.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Aug 15, 2018 at 11:52 AM, Andres Freund <andres@anarazel.de> wrote:
We could just mandate C99, more generally.
/me goes and hides in a bush.
It's hard to believe that would cost much. Personally, I'd prefer to
continue avoiding // comments and intermingled declarations of
variables and code on grounds of style and readability. But it's kind
of difficult to believe that we really need to worry about people
still running 20-year old compilers very much. My first exposure to
Tom arguing that we ought to continue supporting C89 was about a
decade ago, but the argument that people might still have older
systems in service was a lot more plausible then than it is now.
BTW, I think a bush is probably not a nearly sufficient place to hide.
The wrath of Tom will find you wherever you may go... :-)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Greetings,
* Andres Freund (andres@anarazel.de) wrote:
On 2018-08-15 11:41:46 -0400, Tom Lane wrote:
BTW, independently of whether to back-patch, it strikes me that what
we ought to do in HEAD (after applying this) is to just assume we have
C99-compliant behavior, and rip out the baroque logic in psnprintf
and appendPQExpBufferVA that tries to deal with pre-C99 snprintf.
I don't expect that'd save any really meaningful number of cycles,
but at least it'd buy back the two added instructions mentioned above.
I suppose we could put in a configure check that verifies whether
the system snprintf returns the right value for overrun cases, though
it's hard to believe there are any platforms that pass the 'z' check
and would fail this one.We could just mandate C99, more generally.
*cough* +1 *cough*
/me goes and hides in a bush.
/me runs
Thanks!
Stephen
Hi,
On 2018-08-15 12:01:28 -0400, Robert Haas wrote:
On Wed, Aug 15, 2018 at 11:52 AM, Andres Freund <andres@anarazel.de> wrote:
We could just mandate C99, more generally.
/me goes and hides in a bush.
It's hard to believe that would cost much.
Yea.
Personally, I'd prefer to continue avoiding // comments and
intermingled declarations of variables and code on grounds of style
and readability.
I don't really care much about either. The calculus for intermingled
declarations would personally change the minute that we decided to allow
some C++ - allowing for scoped locks etc via RAII - but not earlier.
The thing I'd really want is designated initializers for structs. Makes
code for statically allocated structs *so* much more readable. And guess
who's working on code that adds statically allocated structs with lots
of members...
BTW, I think a bush is probably not a nearly sufficient place to hide.
The wrath of Tom will find you wherever you may go... :-)
That's why I keep moving. At 200 mph on a train :P. A continent away.
Greetings,
Andres Freund
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Aug 15, 2018 at 11:52 AM, Andres Freund <andres@anarazel.de> wrote:
We could just mandate C99, more generally.
/me goes and hides in a bush.
It's hard to believe that would cost much.
I think we have done that, piece by piece, where it was actually buying us
something. In particular we've gradually moved the goalposts for *printf
compliance, and I'm proposing here to move them a bit further. I'm not
sure what "we're going to insist on C99" even means concretely, given
this position ...
Personally, I'd prefer to
continue avoiding // comments and intermingled declarations of
variables and code on grounds of style and readability.
... which I agree with.
But it's kind
of difficult to believe that we really need to worry about people
still running 20-year old compilers very much.
Sure. It's been a long time since anybody worried about those as
optimization targets, for instance. Still, I'm not in favor of
actively breaking compatibility unless it buys us something.
regards, tom lane
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
On 2018-Aug-15, Robert Haas wrote:
Back-patching seems a bit aggressive to me considering that the danger
is hypothetical.
That was my first thought too, and my preferred path would be to make
this master-only and only consider a backpatch later if we find some
practical reason to do so.
Meh --- the hazards of back-patching seem to me to be more hypothetical
than the benefits. Still, I seem to be in the minority, so I withdraw
the proposal to back-patch.
regards, tom lane
On 8/15/18 12:17 PM, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Personally, I'd prefer to
continue avoiding // comments and intermingled declarations of
variables and code on grounds of style and readability.... which I agree with.
We already have -Wdeclaration-after-statement to prevent mixed
declarations. Not sure what to do about comments except manual enforcement.
But it's kind
of difficult to believe that we really need to worry about people
still running 20-year old compilers very much.Sure. It's been a long time since anybody worried about those as
optimization targets, for instance. Still, I'm not in favor of
actively breaking compatibility unless it buys us something.
We use C99 for the pgBackRest project and we've found designated
initializers, compound declarations, and especially variadic macros to
be very helpful. Only the latter really provides new functionality but
simplifying and clarifying code is always a bonus.
So, +1 from me!
Regards,
--
-David
david@pgmasters.net
I wrote:
Meh --- the hazards of back-patching seem to me to be more hypothetical
than the benefits. Still, I seem to be in the minority, so I withdraw
the proposal to back-patch.
Actually, after digging around a bit, I'm excited about this again.
There are only a couple dozen places in our tree that pay any attention
to the result of (v)snprintf, but with the exception of psnprintf,
appendPQExpBufferVA, and one or two other places, *they're all assuming
C99 semantics*, and will fail to detect buffer overflow with the pre-C99
behavior.
Probably a lot of these are not live bugs because buffer overrun is
not ever going to occur in practice. But at least pg_upgrade and
pg_regress are constructing command strings including externally
supplied paths, so overrun doesn't seem impossible. If it happened,
they'd merrily proceed to execute a truncated command.
If we don't backpatch the snprintf change, we're morally obliged to
back-patch some other fix for these places. At least one of them,
in plperl's pport.h, is not our code and so changing it seems like
a bad idea.
Still want to argue for no backpatch?
regards, tom lane
PS: I also found a couple of places that are just wrong regardless
of semantics: they're checking overflow by "result > bufsize", not
"result >= bufsize". Will fix those in any case.
David Steele <david@pgmasters.net> writes:
On 8/15/18 12:17 PM, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Personally, I'd prefer to
continue avoiding // comments and intermingled declarations of
variables and code on grounds of style and readability.
... which I agree with.
We already have -Wdeclaration-after-statement to prevent mixed
declarations. Not sure what to do about comments except manual enforcement.
pgindent will change // comments to /* style, so at least on the timescale
of a release cycle, we have enforcement for that.
regards, tom lane
On 08/15/2018 12:17 PM, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Personally, I'd prefer to
continue avoiding // comments and intermingled declarations of
variables and code on grounds of style and readability.... which I agree with.
A decade or so ago I would have strongly agreed with you. But the
language trend seems to be in the other direction. And there is
something to be said for declaration near use without having to use an
inner block. I'm not advocating that we change policy, however.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
On 08/15/2018 12:17 PM, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Personally, I'd prefer to
continue avoiding // comments and intermingled declarations of
variables and code on grounds of style and readability.
... which I agree with.
A decade or so ago I would have strongly agreed with you. But the
language trend seems to be in the other direction. And there is
something to be said for declaration near use without having to use an
inner block. I'm not advocating that we change policy, however.
FWIW, the issue I've got with what C99 did is that you can narrow the
*start* of the scope of a local variable easily, but not the *end* of
its scope, which seems to me to be solving at most half of the problem.
To solve the whole problem, you end up needing a nested block anyway.
I do dearly miss the ability to easily limit the scope of a loop's
control variable to just the loop, eg
for (int i = 0; ...) { ... }
But AFAIK that's C++ not C99.
regards, tom lane
On 08/15/2018 03:18 PM, Tom Lane wrote:
FWIW, the issue I've got with what C99 did is that you can narrow the
*start* of the scope of a local variable easily, but not the *end* of
its scope, which seems to me to be solving at most half of the problem.
To solve the whole problem, you end up needing a nested block anyway.I do dearly miss the ability to easily limit the scope of a loop's
control variable to just the loop, egfor (int i = 0; ...) { ... }
But AFAIK that's C++ not C99.
Agree completely.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 8/15/18 3:18 PM, Tom Lane wrote:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
On 08/15/2018 12:17 PM, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Personally, I'd prefer to
continue avoiding // comments and intermingled declarations of
variables and code on grounds of style and readability.... which I agree with.
A decade or so ago I would have strongly agreed with you. But the
language trend seems to be in the other direction. And there is
something to be said for declaration near use without having to use an
inner block. I'm not advocating that we change policy, however.FWIW, the issue I've got with what C99 did is that you can narrow the
*start* of the scope of a local variable easily, but not the *end* of
its scope, which seems to me to be solving at most half of the problem.
To solve the whole problem, you end up needing a nested block anyway.I do dearly miss the ability to easily limit the scope of a loop's
control variable to just the loop, egfor (int i = 0; ...) { ... }
But AFAIK that's C++ not C99.
This works in C99 -- and I'm a really big fan.
--
-David
david@pgmasters.net
David Steele <david@pgmasters.net> writes:
On 8/15/18 3:18 PM, Tom Lane wrote:
I do dearly miss the ability to easily limit the scope of a loop's
control variable to just the loop, eg
for (int i = 0; ...) { ... }
But AFAIK that's C++ not C99.
This works in C99 -- and I'm a really big fan.
It does? [ checks standard... ] Oh wow:
6.8.5 Iteration statements
Syntax
iteration-statement:
while ( expression ) statement
do statement while ( expression ) ;
for ( expr-opt ; expr-opt ; expr-opt ) statement
for ( declaration ; expr-opt ; expr-opt ) statement
I'd always thought this was only in C++. This alone might be a sufficient
reason to drop C89 compiler support ...
regards, tom lane
Hi,
On 2018-08-15 15:57:43 -0400, Tom Lane wrote:
I'd always thought this was only in C++. This alone might be a sufficient
reason to drop C89 compiler support ...
It's also IIRC reasonably widely supported from before C99. So, for the
sake of designated initializers, for loop scoping, snprintf, let's do
this in master?
Greetings,
Andres Freund
On 2018-08-15 14:05:29 -0400, Tom Lane wrote:
I wrote:
Meh --- the hazards of back-patching seem to me to be more hypothetical
than the benefits. Still, I seem to be in the minority, so I withdraw
the proposal to back-patch.Actually, after digging around a bit, I'm excited about this again.
There are only a couple dozen places in our tree that pay any attention
to the result of (v)snprintf, but with the exception of psnprintf,
appendPQExpBufferVA, and one or two other places, *they're all assuming
C99 semantics*, and will fail to detect buffer overflow with the pre-C99
behavior.Probably a lot of these are not live bugs because buffer overrun is
not ever going to occur in practice. But at least pg_upgrade and
pg_regress are constructing command strings including externally
supplied paths, so overrun doesn't seem impossible. If it happened,
they'd merrily proceed to execute a truncated command.If we don't backpatch the snprintf change, we're morally obliged to
back-patch some other fix for these places. At least one of them,
in plperl's pport.h, is not our code and so changing it seems like
a bad idea.Still want to argue for no backpatch?
regards, tom lane
PS: I also found a couple of places that are just wrong regardless
of semantics: they're checking overflow by "result > bufsize", not
"result >= bufsize". Will fix those in any case.
I'm a bit confused. Why did you just backpatch this ~two hours after
people objected to the idea? Even if it were during my current work
hours, I don't even read mail that often if I'm hacking on something
complicated.
Greetings,
Andres Freund