Why the asprintf patch is still breaking the buildfarm
So I returned from vacation only to find that the buildfarm has a bad case
of acne. All the Windows members are red or pink, and have been for
awhile. Sigh.
After some research I believe that I understand the reason for the CHECK
failures, at least:
1. src/port/asprintf.c exhibits a truly touching faith that vsnprintf will
report exactly the number of bytes that would have been required, even if
the buffer is not that large. While this is what is specified in recent
versions of the POSIX standard, older platforms have much sketchier
behavior.
2. In particular, our own src/port/snprintf.c follows the SUS v2 rule that
it should report the number of bytes it *actually wrote*. This means
that asprintf.c will never think that its initial 128-byte allocation was
insufficient. So, on platforms where we use this implementation (notably
including Windows), the result of any asprintf call is effectively
truncated at 128 bytes.
3. I believe the exact cause of the reported failures is that the
add_to_path calls in pg_regress.c result in truncating the value of the
PATH environment value, causing system() to not find the "perl"
executable. (jacana is probably failing because of truncation of
LD_LIBRARY_PATH instead, which is unsurprising since the problem would
move around depending on the directory path lengths in use.)
IMO src/port/asprintf.c is hopelessly naive, as well as ugly and
undocumented. We should throw it away and replace it with an
implementation more like stringinfo.c's appendStringInfo, which is code
that has been through the wars and is known to be pretty bulletproof these
days. Aside from the immediate problem, that would allow us to get rid of
the unportable va_copy calls. (I say they're unportable because no such
functionality is specified in SUS v2. And no, I do not have any faith at
all in commit c2316dcda1cd057d7d4a56e3a51e3f8f0527e906 as a workaround.)
I have a lot of other gripes about this whole patch, but they can
wait till tomorrow.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Oct 22, 2013 at 8:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
So I returned from vacation only to find that the buildfarm has a bad case
of acne. All the Windows members are red or pink, and have been for
awhile. Sigh.After some research I believe that I understand the reason for the CHECK
failures, at least:1. src/port/asprintf.c exhibits a truly touching faith that vsnprintf will
report exactly the number of bytes that would have been required, even if
the buffer is not that large. While this is what is specified in recent
versions of the POSIX standard, older platforms have much sketchier
behavior.2. In particular, our own src/port/snprintf.c follows the SUS v2 rule that
it should report the number of bytes it *actually wrote*. This means
that asprintf.c will never think that its initial 128-byte allocation was
insufficient. So, on platforms where we use this implementation (notably
including Windows), the result of any asprintf call is effectively
truncated at 128 bytes.
Thanks for looking at this. I had a bash and trying to figure out why
vcregress check would not work last night and didn't get very far...
I can confirm that you are right just by changing the 128 into 12800 and
compiling, vcregress check worked after that.
Regards
David Rowley
Show quoted text
I have a lot of other gripes about this whole patch, but they can
wait till tomorrow.regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 22/10/2013 09:58, Tom Lane wrote:
So I returned from vacation only to find that the buildfarm has a bad case
of acne. All the Windows members are red or pink, and have been for
awhile. Sigh.After some research I believe that I understand the reason for the CHECK
failures, at least:1. src/port/asprintf.c exhibits a truly touching faith that vsnprintf will
report exactly the number of bytes that would have been required, even if
the buffer is not that large. While this is what is specified in recent
versions of the POSIX standard, older platforms have much sketchier
behavior.
Just to be pedantic, this is required by C99.
[...]
Regards Manlio Perillo
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Manlio Perillo <manlio.perillo@gmail.com> writes:
On 22/10/2013 09:58, Tom Lane wrote:
1. src/port/asprintf.c exhibits a truly touching faith that vsnprintf will
report exactly the number of bytes that would have been required, even if
the buffer is not that large. While this is what is specified in recent
versions of the POSIX standard, older platforms have much sketchier
behavior.
Just to be pedantic, this is required by C99.
Yeah. As a separate matter, it might be useful to revise stringinfo.c
and the asprintf code so that *if* the returned value is larger than the
given buffer size, we use it as a guide to resizing, avoiding the possible
need to loop multiple times to make the buffer large enough. And we could
also improve our own implementation of snprintf to follow the C99 spec.
The point here is that we still need to cope with pre-C99 implementations
that might return -1 or the given buffer size on overflow. The NetBSD
implementation doesn't do that, which is reasonable in their context, but
not workable for us.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Oct 22, 2013 at 11:00:42AM -0400, Tom Lane wrote:
Yeah. As a separate matter, it might be useful to revise stringinfo.c
and the asprintf code so that *if* the returned value is larger than the
given buffer size, we use it as a guide to resizing, avoiding the possible
need to loop multiple times to make the buffer large enough. And we could
also improve our own implementation of snprintf to follow the C99 spec.The point here is that we still need to cope with pre-C99 implementations
that might return -1 or the given buffer size on overflow. The NetBSD
implementation doesn't do that, which is reasonable in their context, but
not workable for us.
I would vote for choosing the standard we want vsnprintf() to follow (probably
C99) and substituting a conforming implementation wherever "configure" detects
that libc does not conform. We'll be shipping some replacement vsnprintf() in
any case; we may as well use it to insulate the rest of our code from
less-preferred variants.
--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Noah Misch <noah@leadboat.com> writes:
On Tue, Oct 22, 2013 at 11:00:42AM -0400, Tom Lane wrote:
Yeah. As a separate matter, it might be useful to revise stringinfo.c
and the asprintf code so that *if* the returned value is larger than the
given buffer size, we use it as a guide to resizing, avoiding the possible
need to loop multiple times to make the buffer large enough. And we could
also improve our own implementation of snprintf to follow the C99 spec.The point here is that we still need to cope with pre-C99 implementations
that might return -1 or the given buffer size on overflow. The NetBSD
implementation doesn't do that, which is reasonable in their context, but
not workable for us.
I would vote for choosing the standard we want vsnprintf() to follow (probably
C99) and substituting a conforming implementation wherever "configure" detects
that libc does not conform. We'll be shipping some replacement vsnprintf() in
any case; we may as well use it to insulate the rest of our code from
less-preferred variants.
The problem is that we can't tell whether vsnprintf is standard-conforming
without a run-time test. That's bad for cross-compiled builds, and it's
pretty hazardous even for normal cases, since conceivably an executable
built on one machine could be used on another one with different run-time
behavior. I'd be willing to take those risks if we got a significant
benefit from it, but in this case I don't see much advantage to be had.
The code in stringinfo/psprintf wouldn't get very much simpler if we
assumed C99 behavior, and we've pretty well isolated the number of places
that care to those. (I see a couple places in pg_dump that could be
modified to use psprintf instead of direct vsnprintf calls; will go fix.)
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/23/2013 03:05 AM, Noah Misch wrote:
I would vote for choosing the standard we want vsnprintf() to follow (probably
C99) and substituting a conforming implementation wherever "configure" detects
that libc does not conform. We'll be shipping some replacement vsnprintf() in
any case; we may as well use it to insulate the rest of our code from
less-preferred variants.
Do you care about the snprintf behavior on very large buffers (larger
than INT_MAX)? Then there's further complication, and it's an area
where glibc behavior is likely to change in the future (because it is
claimed that C99 and POSIX conflict, and glibc implements neither behavior).
--
Florian Weimer / Red Hat Product Security Team
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Oct 23, 2013 at 4:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah. As a separate matter, it might be useful to revise stringinfo.c
and the asprintf code so that *if* the returned value is larger than the
given buffer size, we use it as a guide to resizing, avoiding the possible
need to loop multiple times to make the buffer large enough. And we could
also improve our own implementation of snprintf to follow the C99 spec.
Attached is a draft patch which implements this.
I didn't bother making the translation macros make use of the extra data as
I
didn't think we would have many translations long enough to take advantage
of it.
I think it's a good idea to take advantage of the buffer size if
vsnprintf() has gone
to the trouble of working out what is needed for us. It seems quite
wasteful to throw this information away.
Comments are welcome.
Regards
David
Attachments:
appendStringInfoVA.patchapplication/octet-stream; name=appendStringInfoVA.patchDownload
diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c
index a5b9f2e..2de6bbf 100644
--- a/src/backend/lib/stringinfo.c
+++ b/src/backend/lib/stringinfo.c
@@ -80,18 +80,28 @@ appendStringInfo(StringInfo str, const char *fmt,...)
for (;;)
{
va_list args;
- bool success;
+ int needed;
/* Try to format the data. */
va_start(args, fmt);
- success = appendStringInfoVA(str, fmt, args);
+ needed = appendStringInfoVA(str, fmt, args);
va_end(args);
- if (success)
+ if (!needed)
break;
- /* Double the buffer size and try again. */
- enlargeStringInfo(str, str->maxlen);
+ /* maintain growth in multiples of 2 */
+ if (needed > str->maxlen)
+ {
+ int newsize = str->maxlen;
+ while (newsize < needed)
+ newsize *= 2;
+ needed = newsize;
+ }
+ else
+ needed = str->maxlen;
+
+ enlargeStringInfo(str, needed);
}
}
@@ -100,16 +110,20 @@ appendStringInfo(StringInfo str, const char *fmt,...)
*
* Attempt to format text data under the control of fmt (an sprintf-style
* format string) and append it to whatever is already in str. If successful
- * return true; if not (because there's not enough space), return false
- * without modifying str. Typically the caller would enlarge str and retry
- * on false return --- see appendStringInfo for standard usage pattern.
- *
+ * return 0; if not (because there's not enough space), return a positive
+ * number. This number is the return value of the call to vsnprintf.
+ * Implementations of vsnprintf vary and this value may NOT be the length
+ * required for the whole string, so callers should take this value with a
+ * pinch of salt and treat the value returned as a hint that the space
+ * required might be bigger than they would think.
+ * --- see appendStringInfo for standard usage pattern.
+ *
* XXX This API is ugly, but there seems no alternative given the C spec's
* restrictions on what can portably be done with va_list arguments: you have
* to redo va_start before you can rescan the argument list, and we can't do
* that from here.
*/
-bool
+int
appendStringInfoVA(StringInfo str, const char *fmt, va_list args)
{
int avail,
@@ -123,7 +137,7 @@ appendStringInfoVA(StringInfo str, const char *fmt, va_list args)
*/
avail = str->maxlen - str->len - 1;
if (avail < 16)
- return false;
+ return 1; /* 1 will do as we don't know how much space is needed */
/*
* Assert check here is to catch buggy vsnprintf that overruns the
@@ -147,12 +161,12 @@ appendStringInfoVA(StringInfo str, const char *fmt, va_list args)
{
/* Success. Note nprinted does not include trailing null. */
str->len += nprinted;
- return true;
+ return 0;
}
/* Restore the trailing null so that str is unmodified. */
str->data[str->len] = '\0';
- return false;
+ return nprinted < 0 ? 1 : nprinted + 1;
}
/*
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 9c7489a..319af01 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -714,12 +714,12 @@ errcode_for_socket_access(void)
/* Generate actual output --- have to use appendStringInfoVA */ \
for (;;) \
{ \
- va_list args; \
- bool success; \
+ va_list args; \
+ int needed; \
va_start(args, fmt); \
- success = appendStringInfoVA(&buf, fmtbuf, args); \
+ needed = appendStringInfoVA(&buf, fmtbuf, args); \
va_end(args); \
- if (success) \
+ if (!needed) \
break; \
enlargeStringInfo(&buf, buf.maxlen); \
} \
@@ -757,12 +757,12 @@ errcode_for_socket_access(void)
/* Generate actual output --- have to use appendStringInfoVA */ \
for (;;) \
{ \
- va_list args; \
- bool success; \
+ va_list args; \
+ int needed; \
va_start(args, n); \
- success = appendStringInfoVA(&buf, fmtbuf, args); \
+ needed = appendStringInfoVA(&buf, fmtbuf, args); \
va_end(args); \
- if (success) \
+ if (!needed) \
break; \
enlargeStringInfo(&buf, buf.maxlen); \
} \
diff --git a/src/include/lib/stringinfo.h b/src/include/lib/stringinfo.h
index ecb693f..f6b14ac 100644
--- a/src/include/lib/stringinfo.h
+++ b/src/include/lib/stringinfo.h
@@ -97,15 +97,20 @@ appendStringInfo(StringInfo str, const char *fmt,...)
/* This extension allows gcc to check the format string */
__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3)));
-/*------------------------
+ /*------------------------
* appendStringInfoVA
+ *
* Attempt to format text data under the control of fmt (an sprintf-style
* format string) and append it to whatever is already in str. If successful
- * return true; if not (because there's not enough space), return false
- * without modifying str. Typically the caller would enlarge str and retry
- * on false return --- see appendStringInfo for standard usage pattern.
+ * return 0; if not (because there's not enough space), return a positive
+ * number. This number is the return value of the call to vsnprintf.
+ * Implementations of vsnprintf vary and this value may NOT be the length
+ * required for the whole string, so callers should take this value with a
+ * pinch of salt and treat the value returned as a hint that the space
+ * required might be bigger than they would think.
+ * --- see appendStringInfo for standard usage pattern.
*/
-extern bool
+extern int
appendStringInfoVA(StringInfo str, const char *fmt, va_list args)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 0)));
Florian Weimer <fweimer@redhat.com> writes:
Do you care about the snprintf behavior on very large buffers (larger
than INT_MAX)? Then there's further complication, and it's an area
where glibc behavior is likely to change in the future (because it is
claimed that C99 and POSIX conflict, and glibc implements neither behavior).
We do not. Note that the buffer enlargement behavior is designed not to
let "len" exceed INT_MAX; it'll say "out of memory" instead.
Given that vsnprintf is defined to return int, buffers larger than INT_MAX
would be a real can of worms, one that we'd best not open.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
David Rowley <dgrowleyml@gmail.com> writes:
On Wed, Oct 23, 2013 at 4:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah. As a separate matter, it might be useful to revise stringinfo.c
and the asprintf code so that *if* the returned value is larger than the
given buffer size, we use it as a guide to resizing, avoiding the possible
need to loop multiple times to make the buffer large enough. And we could
also improve our own implementation of snprintf to follow the C99 spec.
Attached is a draft patch which implements this.
I started working on a very similar patch last night, but then began to
wonder if it wouldn't be better to try to share code between stringinfo.c
and psprintf.c --- that is, expose the latter's pvsnprintf and use that
in stringinfo.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers