[PATCH] Make jsonapi usable from libpq
(This is split off from my work on OAUTHBEARER [1]/messages/by-id/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com.)
The jsonapi in src/common can't currently be compiled into libpq. The
first patch here removes the dependency on pg_log_fatal(), which is not
available to the sharedlib. The second patch makes sure that all of the
return values from json_errdetail() can be pfree'd, to avoid long-
running leaks.
In the original thread, Michael Paquier commented:
+# define check_stack_depth() +# ifdef JSONAPI_NO_LOG +# define json_log_and_abort(...) \ + do { fprintf(stderr, __VA_ARGS__); exit(1); } while(0) +# else In patch 0002, this is the wrong approach. libpq will not be able to feed on such reports, and you cannot use any of the APIs from the palloc() family either as these just fail on OOM. libpq should be able to know about the error, and would fill in the error back to the application. This abstraction is not necessary on HEAD as pg_verifybackup is fine with this level of reporting. My rough guess is that we will need to split the existing jsonapi.c into two files, one that can be used in shared libraries and a second that handles the errors.
Hmm. I'm honestly hesitant to start splitting files apart, mostly
because json_log_and_abort() is only called from two places, and both
those places are triggered by programmer error as opposed to user
error.
Would it make more sense to switch to an fprintf-and-abort case, to
match the approach taken by PGTHREAD_ERROR and the out-of-memory
conditions in fe-print.c? Or is there already precedent for handling
can't-happen code paths with in-band errors, through the PGconn?
--Jacob
[1]: /messages/by-id/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com
Attachments:
0001-src-common-remove-logging-from-jsonapi-for-shlib.patchtext/x-patch; name=0001-src-common-remove-logging-from-jsonapi-for-shlib.patchDownload
From 0541598e4f0bad1b9ff41a4640ec69491b393d54 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 3 May 2021 11:15:15 -0700
Subject: [PATCH 2/7] src/common: remove logging from jsonapi for shlib
The can't-happen code in jsonapi was pulling in logging code, which for
libpq is not included.
---
src/common/Makefile | 4 ++++
src/common/jsonapi.c | 11 ++++++++---
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/common/Makefile b/src/common/Makefile
index 38a8599337..6f1039bc78 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -28,6 +28,10 @@ subdir = src/common
top_builddir = ../..
include $(top_builddir)/src/Makefile.global
+# For use in shared libraries, jsonapi needs to not link in any logging
+# functions.
+override CFLAGS_SL += -DJSONAPI_NO_LOG
+
# don't include subdirectory-path-dependent -I and -L switches
STD_CPPFLAGS := $(filter-out -I$(top_srcdir)/src/include -I$(top_builddir)/src/include,$(CPPFLAGS))
STD_LDFLAGS := $(filter-out -L$(top_builddir)/src/common -L$(top_builddir)/src/port,$(LDFLAGS))
diff --git a/src/common/jsonapi.c b/src/common/jsonapi.c
index 1bf38d7b42..6b6001b118 100644
--- a/src/common/jsonapi.c
+++ b/src/common/jsonapi.c
@@ -27,11 +27,16 @@
#endif
#ifdef FRONTEND
-#define check_stack_depth()
-#define json_log_and_abort(...) \
+# define check_stack_depth()
+# ifdef JSONAPI_NO_LOG
+# define json_log_and_abort(...) \
+ do { fprintf(stderr, __VA_ARGS__); exit(1); } while(0)
+# else
+# define json_log_and_abort(...) \
do { pg_log_fatal(__VA_ARGS__); exit(1); } while(0)
+# endif
#else
-#define json_log_and_abort(...) elog(ERROR, __VA_ARGS__)
+# define json_log_and_abort(...) elog(ERROR, __VA_ARGS__)
#endif
/*
--
2.25.1
0002-common-jsonapi-always-palloc-the-error-strings.patchtext/x-patch; name=0002-common-jsonapi-always-palloc-the-error-strings.patchDownload
From 5ad4b3c7835fe9e0f284702ec7b827c27770854e Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 3 May 2021 15:38:26 -0700
Subject: [PATCH 3/7] common/jsonapi: always palloc the error strings
...so that client code can pfree() to avoid memory leaks in long-running
operations.
---
src/common/jsonapi.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/src/common/jsonapi.c b/src/common/jsonapi.c
index 6b6001b118..f7304f584f 100644
--- a/src/common/jsonapi.c
+++ b/src/common/jsonapi.c
@@ -1089,7 +1089,7 @@ json_errdetail(JsonParseErrorType error, JsonLexContext *lex)
return psprintf(_("Expected JSON value, but found \"%s\"."),
extract_token(lex));
case JSON_EXPECTED_MORE:
- return _("The input string ended unexpectedly.");
+ return pstrdup(_("The input string ended unexpectedly."));
case JSON_EXPECTED_OBJECT_FIRST:
return psprintf(_("Expected string or \"}\", but found \"%s\"."),
extract_token(lex));
@@ -1103,16 +1103,16 @@ json_errdetail(JsonParseErrorType error, JsonLexContext *lex)
return psprintf(_("Token \"%s\" is invalid."),
extract_token(lex));
case JSON_UNICODE_CODE_POINT_ZERO:
- return _("\\u0000 cannot be converted to text.");
+ return pstrdup(_("\\u0000 cannot be converted to text."));
case JSON_UNICODE_ESCAPE_FORMAT:
- return _("\"\\u\" must be followed by four hexadecimal digits.");
+ return pstrdup(_("\"\\u\" must be followed by four hexadecimal digits."));
case JSON_UNICODE_HIGH_ESCAPE:
/* note: this case is only reachable in frontend not backend */
- return _("Unicode escape values cannot be used for code point values above 007F when the encoding is not UTF8.");
+ return pstrdup(_("Unicode escape values cannot be used for code point values above 007F when the encoding is not UTF8."));
case JSON_UNICODE_HIGH_SURROGATE:
- return _("Unicode high surrogate must not follow a high surrogate.");
+ return pstrdup(_("Unicode high surrogate must not follow a high surrogate."));
case JSON_UNICODE_LOW_SURROGATE:
- return _("Unicode low surrogate must follow a high surrogate.");
+ return pstrdup(_("Unicode low surrogate must follow a high surrogate."));
}
/*
--
2.25.1
On Tue, Jun 22, 2021 at 10:59:37PM +0000, Jacob Champion wrote:
Hmm. I'm honestly hesitant to start splitting files apart, mostly
because json_log_and_abort() is only called from two places, and both
those places are triggered by programmer error as opposed to user
error.Would it make more sense to switch to an fprintf-and-abort case, to
match the approach taken by PGTHREAD_ERROR and the out-of-memory
conditions in fe-print.c? Or is there already precedent for handling
can't-happen code paths with in-band errors, through the PGconn?
Not really..
Looking more closely at that, I actually find a bit crazy the
requirement for any logging within jsonapi.c just to cope with the
fact that json_errdetail() and report_parse_error() just want to track
down if the caller is giving some garbage or not, which should never
be the case, really. So I would be tempted to eliminate this
dependency to begin with.
The second thing is how we should try to handle the way the error
message gets allocated in json_errdetail(). libpq cannot rely on
psprintf(), so I can think about two options here:
- Let the caller of json_errdetail() allocate the memory area of the
error message by itself, pass it down to the function.
- Do the allocation within json_errdetail(), and let callers cope the
case where json_errdetail() itself fails on OOM for any frontend code
using it.
Looking at HEAD, the OAUTH patch and the token handling, the second
option looks more interesting. I have to admit that handling the
token part makes the patch a bit special, but that avoids duplicating
all those error messages for libpq. Please see the idea as attached.
--
Michael
Attachments:
jsonapi-libpq.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/include/common/jsonapi.h b/src/include/common/jsonapi.h
index ec3dfce9c3..f5e25ca3e3 100644
--- a/src/include/common/jsonapi.h
+++ b/src/include/common/jsonapi.h
@@ -48,6 +48,7 @@ typedef enum
JSON_EXPECTED_OBJECT_NEXT,
JSON_EXPECTED_STRING,
JSON_INVALID_TOKEN,
+ JSON_OUT_OF_MEMORY,
JSON_UNICODE_CODE_POINT_ZERO,
JSON_UNICODE_ESCAPE_FORMAT,
JSON_UNICODE_HIGH_ESCAPE,
diff --git a/src/common/jsonapi.c b/src/common/jsonapi.c
index d376ab152d..cf70a735cc 100644
--- a/src/common/jsonapi.c
+++ b/src/common/jsonapi.c
@@ -19,19 +19,22 @@
#include "common/jsonapi.h"
#include "mb/pg_wchar.h"
-
-#ifdef FRONTEND
-#include "common/logging.h"
-#else
+#ifndef FRONTEND
#include "miscadmin.h"
#endif
-#ifdef FRONTEND
-#define check_stack_depth()
-#define json_log_and_abort(...) \
- do { pg_log_fatal(__VA_ARGS__); exit(1); } while(0)
+/*
+ * In backend, we will use palloc/pfree. In frontend, use malloc, and
+ * return JSON_OUT_OF_MEMORY on out-of-memory.
+ */
+#ifndef FRONTEND
+#define STRDUP(s) pstrdup(s)
+#define ALLOC(size) palloc(size)
+#define FREE(s) pfree(s)
#else
-#define json_log_and_abort(...) elog(ERROR, __VA_ARGS__)
+#define STRDUP(s) strdup(s)
+#define ALLOC(size) malloc(size)
+#define FREE(s) free(s)
#endif
/*
@@ -291,12 +294,17 @@ parse_scalar(JsonLexContext *lex, JsonSemAction *sem)
if (lex_peek(lex) == JSON_TOKEN_STRING)
{
if (lex->strval != NULL)
- val = pstrdup(lex->strval->data);
+ {
+ val = STRDUP(lex->strval->data);
+ if (val == NULL)
+ return JSON_OUT_OF_MEMORY;
+ }
}
else
{
int len = (lex->token_terminator - lex->token_start);
+ //allocation here
val = palloc(len + 1);
memcpy(val, lex->token_start, len);
val[len] = '\0';
@@ -332,7 +340,11 @@ parse_object_field(JsonLexContext *lex, JsonSemAction *sem)
if (lex_peek(lex) != JSON_TOKEN_STRING)
return report_parse_error(JSON_PARSE_STRING, lex);
if ((ostart != NULL || oend != NULL) && lex->strval != NULL)
- fname = pstrdup(lex->strval->data);
+ {
+ fname = STRDUP(lex->strval->data);
+ if (fname == NULL)
+ return JSON_OUT_OF_MEMORY;
+ }
result = json_lex(lex);
if (result != JSON_SUCCESS)
return result;
@@ -378,7 +390,9 @@ parse_object(JsonLexContext *lex, JsonSemAction *sem)
JsonTokenType tok;
JsonParseErrorType result;
+#ifndef FRONTEND
check_stack_depth();
+#endif
if (ostart != NULL)
(*ostart) (sem->semstate);
@@ -478,7 +492,9 @@ parse_array(JsonLexContext *lex, JsonSemAction *sem)
json_struct_action aend = sem->array_end;
JsonParseErrorType result;
+#ifndef FRONTEND
check_stack_depth();
+#endif
if (astart != NULL)
(*astart) (sem->semstate);
@@ -1047,76 +1063,138 @@ report_parse_error(JsonParseContext ctx, JsonLexContext *lex)
* unhandled enum values. But this needs to be here anyway to cover the
* possibility of an incorrect input.
*/
- json_log_and_abort("unexpected json parse state: %d", (int) ctx);
return JSON_SUCCESS; /* silence stupider compilers */
}
/*
- * Construct a detail message for a JSON error.
+ * Construct a detail message for a JSON error, palloc'd or malloc'd.
+ * note that this may return NULL on OOM in the frontend.
*/
char *
json_errdetail(JsonParseErrorType error, JsonLexContext *lex)
{
+ char *errstr;
+ char *tok = NULL;
+
+ /* allocate a string large enough to include the error */
+ errstr = ALLOC(256 * sizeof(char));
+ if (errstr == NULL)
+ return NULL;
+
+ /* Grab current token from the lexing context, if necessary */
+ switch (error)
+ {
+ case JSON_ESCAPING_INVALID:
+ case JSON_EXPECTED_ARRAY_FIRST:
+ case JSON_EXPECTED_ARRAY_NEXT:
+ case JSON_EXPECTED_COLON:
+ case JSON_EXPECTED_END:
+ case JSON_EXPECTED_JSON:
+ case JSON_EXPECTED_OBJECT_FIRST:
+ case JSON_EXPECTED_OBJECT_NEXT:
+ case JSON_EXPECTED_STRING:
+ case JSON_INVALID_TOKEN:
+ tok = extract_token(lex);
+ /* if allocating the token failed, just give up */
+ if (tok == NULL)
+ return NULL;
+ break;
+ default:
+ break;
+ }
+
switch (error)
{
case JSON_SUCCESS:
/* fall through to the error code after switch */
break;
case JSON_ESCAPING_INVALID:
- return psprintf(_("Escape sequence \"\\%s\" is invalid."),
- extract_token(lex));
+ sprintf(errstr,
+ _("Escape sequence \"\\%s\" is invalid."),
+ tok);
+ break;
case JSON_ESCAPING_REQUIRED:
- return psprintf(_("Character with value 0x%02x must be escaped."),
- (unsigned char) *(lex->token_terminator));
+ sprintf(errstr,
+ _("Character with value 0x%02x must be escaped."),
+ (unsigned char) *(lex->token_terminator));
+ break;
case JSON_EXPECTED_END:
- return psprintf(_("Expected end of input, but found \"%s\"."),
- extract_token(lex));
+ sprintf(errstr,
+ _("Expected end of input, but found \"%s\"."), tok);
+ break;
case JSON_EXPECTED_ARRAY_FIRST:
- return psprintf(_("Expected array element or \"]\", but found \"%s\"."),
- extract_token(lex));
+ sprintf(errstr,
+ _("Expected array element or \"]\", but found \"%s\"."),
+ tok);
+ break;
case JSON_EXPECTED_ARRAY_NEXT:
- return psprintf(_("Expected \",\" or \"]\", but found \"%s\"."),
- extract_token(lex));
+ sprintf(errstr,
+ _("Expected \",\" or \"]\", but found \"%s\"."),
+ tok);
+ break;
case JSON_EXPECTED_COLON:
- return psprintf(_("Expected \":\", but found \"%s\"."),
- extract_token(lex));
+ sprintf(errstr,
+ _("Expected \":\", but found \"%s\"."),
+ tok);
+ break;
case JSON_EXPECTED_JSON:
- return psprintf(_("Expected JSON value, but found \"%s\"."),
- extract_token(lex));
+ sprintf(errstr,
+ _("Expected JSON value, but found \"%s\"."),
+ tok);
+ break;
case JSON_EXPECTED_MORE:
- return _("The input string ended unexpectedly.");
+ sprintf(errstr,
+ _("The input string ended unexpectedly."));
+ break;
case JSON_EXPECTED_OBJECT_FIRST:
- return psprintf(_("Expected string or \"}\", but found \"%s\"."),
- extract_token(lex));
+ sprintf(errstr,
+ _("Expected string or \"}\", but found \"%s\"."),
+ tok);
+ break;
case JSON_EXPECTED_OBJECT_NEXT:
- return psprintf(_("Expected \",\" or \"}\", but found \"%s\"."),
- extract_token(lex));
+ sprintf(errstr,
+ _("Expected \",\" or \"}\", but found \"%s\"."),
+ tok);
+ break;
case JSON_EXPECTED_STRING:
- return psprintf(_("Expected string, but found \"%s\"."),
- extract_token(lex));
+ sprintf(errstr,
+ _("Expected string, but found \"%s\"."),
+ tok);
+ break;
case JSON_INVALID_TOKEN:
- return psprintf(_("Token \"%s\" is invalid."),
- extract_token(lex));
+ sprintf(errstr,
+ _("Token \"%s\" is invalid."),
+ tok);
+ break;
+ case JSON_OUT_OF_MEMORY:
+ sprintf(errstr, _("out of memory."));
+ break;
+
case JSON_UNICODE_CODE_POINT_ZERO:
- return _("\\u0000 cannot be converted to text.");
+ sprintf(errstr, _("\\u0000 cannot be converted to text."));
+ break;
case JSON_UNICODE_ESCAPE_FORMAT:
- return _("\"\\u\" must be followed by four hexadecimal digits.");
+ sprintf(errstr,
+ _("\"\\u\" must be followed by four hexadecimal digits."));
+ break;
case JSON_UNICODE_HIGH_ESCAPE:
/* note: this case is only reachable in frontend not backend */
- return _("Unicode escape values cannot be used for code point values above 007F when the encoding is not UTF8.");
+ sprintf(errstr,
+ _("Unicode escape values cannot be used for code point values above 007F when the encoding is not UTF8."));
+ break;
case JSON_UNICODE_HIGH_SURROGATE:
- return _("Unicode high surrogate must not follow a high surrogate.");
+ sprintf(errstr,
+ _("Unicode high surrogate must not follow a high surrogate."));
+ break;
case JSON_UNICODE_LOW_SURROGATE:
- return _("Unicode low surrogate must follow a high surrogate.");
+ sprintf(errstr,
+ _("Unicode low surrogate must follow a high surrogate."));
+ break;
}
- /*
- * We don't use a default: case, so that the compiler will warn about
- * unhandled enum values. But this needs to be here anyway to cover the
- * possibility of an incorrect input.
- */
- json_log_and_abort("unexpected json parse error type: %d", (int) error);
- return NULL; /* silence stupider compilers */
+ if (tok)
+ FREE(tok);
+ return errstr;
}
/*
@@ -1126,7 +1204,14 @@ static char *
extract_token(JsonLexContext *lex)
{
int toklen = lex->token_terminator - lex->token_start;
- char *token = palloc(toklen + 1);
+ char *token = ALLOC(toklen + 1);
+
+ /*
+ * There is not much we can do here on OOM, so just return some dummy
+ * value.
+ */
+ if (token == NULL)
+ return NULL;
memcpy(token, lex->token_start, toklen);
token[toklen] = '\0';
diff --git a/src/bin/pg_verifybackup/parse_manifest.c b/src/bin/pg_verifybackup/parse_manifest.c
index 3b13ae5b84..e31a05dc35 100644
--- a/src/bin/pg_verifybackup/parse_manifest.c
+++ b/src/bin/pg_verifybackup/parse_manifest.c
@@ -147,7 +147,10 @@ json_parse_manifest(JsonManifestParseContext *context, char *buffer,
/* Run the actual JSON parser. */
json_error = pg_parse_json(lex, &sem);
if (json_error != JSON_SUCCESS)
- json_manifest_parse_failure(context, json_errdetail(json_error, lex));
+ {
+ char *errstr = json_errdetail(json_error, lex);
+ json_manifest_parse_failure(context, errstr ? errstr : "out of memory");
+ }
if (parse.state != JM_EXPECT_EOF)
json_manifest_parse_failure(context, "manifest ended unexpectedly");
On Thu, 2021-06-24 at 14:56 +0900, Michael Paquier wrote:
Looking more closely at that, I actually find a bit crazy the
requirement for any logging within jsonapi.c just to cope with the
fact that json_errdetail() and report_parse_error() just want to track
down if the caller is giving some garbage or not, which should never
be the case, really. So I would be tempted to eliminate this
dependency to begin with.
I think that's a good plan.
The second thing is how we should try to handle the way the error
message gets allocated in json_errdetail(). libpq cannot rely on
psprintf(),
That surprised me. So there's currently no compiler-enforced
prohibition, just a policy? It looks like the bar was lowered a little
bit in commit c0cb87fbb6, as libpq currently has a symbol dependency on
pg_get_line_buf() and pfree() on my machine.
, so I can think about two options here:
- Let the caller of json_errdetail() allocate the memory area of the
error message by itself, pass it down to the function.
- Do the allocation within json_errdetail(), and let callers cope the
case where json_errdetail() itself fails on OOM for any frontend code
using it.Looking at HEAD, the OAUTH patch and the token handling, the second
option looks more interesting. I have to admit that handling the
token part makes the patch a bit special, but that avoids duplicating
all those error messages for libpq. Please see the idea as attached.
I prefer the second approach as well. Looking at the sample
implementation -- has an allocating sprintf() for libpq really not been
implemented before? Doing it ourselves on the stack gives me some
heartburn; at the very least we'll have to make careful use of snprintf
so as to not smash the stack while parsing malicious JSON.
If our libpq-specific implementation is going to end up returning NULL
on bad allocation anyway, could we just modify the behavior of the
existing front-end palloc implementation to not exit() from inside
libpq? That would save a lot of one-off code for future implementers.
--Jacob
On Fri, Jun 25, 2021 at 08:58:46PM +0000, Jacob Champion wrote:
On Thu, 2021-06-24 at 14:56 +0900, Michael Paquier wrote:
Looking more closely at that, I actually find a bit crazy the
requirement for any logging within jsonapi.c just to cope with the
fact that json_errdetail() and report_parse_error() just want to track
down if the caller is giving some garbage or not, which should never
be the case, really. So I would be tempted to eliminate this
dependency to begin with.I think that's a good plan.
We could do this cleanup first, as an independent patch. That's
simple enough. I am wondering if we'd better do this bit in 14
actually, so as the divergence between 15~ and 14 is lightly
minimized.
The second thing is how we should try to handle the way the error
message gets allocated in json_errdetail(). libpq cannot rely on
psprintf(),That surprised me. So there's currently no compiler-enforced
prohibition, just a policy? It looks like the bar was lowered a little
bit in commit c0cb87fbb6, as libpq currently has a symbol dependency on
pg_get_line_buf() and pfree() on my machine.
Good point. That's worse than just pfree() which is just a plain call
to free() in the frontend. We could have more policies here, but my
take is that we'd better move fe_memutils.o to OBJS_FRONTEND in
src/common/Makefile so as shared libraries don't use those routines in
the long term.
In parseServiceFile(), initStringInfo() does a palloc() which would
simply exit() on OOM, in libpq. That's not good. The service file
parsing is the only piece in libpq using StringInfoData. @Tom,
@Daniel, you got involved in c0cb87f. It looks like this piece about
the limitations with service file parsing needs a rework. This code
is new in 14, which means a new open item.
If our libpq-specific implementation is going to end up returning NULL
on bad allocation anyway, could we just modify the behavior of the
existing front-end palloc implementation to not exit() from inside
libpq? That would save a lot of one-off code for future implementers.
Yeah, a side effect of that is to enforce a new rule for any frontend
code that calls palloc(), and these could be easily exposed to crashes
within knowing about it until their system is under resource
pressure. Silent breakages with very old guaranteed behaviors is
bad.
--
Michael
On 26 Jun 2021, at 02:36, Michael Paquier <michael@paquier.xyz> wrote:
The service file parsing is the only piece in libpq using StringInfoData.
@Tom, @Daniel, you got involved in c0cb87f. It looks like this piece about the
limitations with service file parsing needs a rework. This code is new in 14,
which means a new open item.
Reworking it at this point to use a pqexpbuffer would be too invasive for 14
IMO, so reverting this part seems like the best option, and then redo it with
a pqexpbuffer for 15.
--
Daniel Gustafsson https://vmware.com/
Michael Paquier <michael@paquier.xyz> writes:
On Fri, Jun 25, 2021 at 08:58:46PM +0000, Jacob Champion wrote:
That surprised me. So there's currently no compiler-enforced
prohibition, just a policy? It looks like the bar was lowered a little
bit in commit c0cb87fbb6, as libpq currently has a symbol dependency on
pg_get_line_buf() and pfree() on my machine.
Good point. That's worse than just pfree() which is just a plain call
to free() in the frontend. We could have more policies here, but my
take is that we'd better move fe_memutils.o to OBJS_FRONTEND in
src/common/Makefile so as shared libraries don't use those routines in
the long term.
Ugh. Not only is that bad, but your proposed fix doesn't fix it.
At least in psql, and probably in most/all of our other clients,
removing fe_memutils.o from libpq's link just causes it to start
relying on the copy in the psql executable :-(. So I agree that
some sort of mechanical enforcement would be a really good thing,
but I'm not sure what it would look like.
In parseServiceFile(), initStringInfo() does a palloc() which would
simply exit() on OOM, in libpq. That's not good. The service file
parsing is the only piece in libpq using StringInfoData. @Tom,
@Daniel, you got involved in c0cb87f.
I concur with Daniel that the easiest fix for v14 is to revert
c0cb87f. Allowing unlimited-length lines in the service file seems
like a nice-to-have, but it's not worth a lot. (Looking at the patch,
I'm inclined to keep much of the code rearrangement, just remove the
dependency on stringinfo.c. Also I'm tempted to set the fixed buffer
size at 1024 not 256, after which we might never need to improve it.)
I spent some time looking for other undesirable symbol dependencies
in libpq, and soon found a couple. PGTHREAD_ERROR potentially calls
abort(), which seems even worse than exit-on-OOM, although I don't
think we've ever heard a report of that being hit. Also,
fe-print.c's handling of OOM isn't nice at all:
fprintf(stderr, libpq_gettext("out of memory\n"));
abort();
Although fe-print.c is semi-deprecated, it still seems like it'd
be a good idea to clean that up.
BTW, so far as the original topic of this thread is concerned,
it sounds like Jacob's ultimate goal is to put some functionality
into libpq that requires JSON parsing. I'm going to say up front
that that sounds like a terrible idea. As we've just seen, libpq
operates under very tight constraints, not all of which are
mechanically enforced. I am really doubtful that anything that
would require a JSON parser has any business being in libpq.
Unless you can sell us on that point, I do not think it's worth
complicating the src/common JSON code to the point where it can
work under libpq's constraints.
regards, tom lane
I wrote:
I spent some time looking for other undesirable symbol dependencies
in libpq, and soon found a couple. PGTHREAD_ERROR potentially calls
abort(), which seems even worse than exit-on-OOM, although I don't
think we've ever heard a report of that being hit. Also,
fe-print.c's handling of OOM isn't nice at all:
fprintf(stderr, libpq_gettext("out of memory\n"));
abort();
Although fe-print.c is semi-deprecated, it still seems like it'd
be a good idea to clean that up.
fe-print.c seems easy enough to clean up, as per attached.
Not real sure what to do about PGTHREAD_ERROR.
regards, tom lane
Attachments:
avoid-calling-abort-in-fe-print.c-1.patchtext/x-diff; charset=us-ascii; name=avoid-calling-abort-in-fe-print.c-1.patchDownload
diff --git a/src/interfaces/libpq/fe-print.c b/src/interfaces/libpq/fe-print.c
index fc7d84844e..0831950b12 100644
--- a/src/interfaces/libpq/fe-print.c
+++ b/src/interfaces/libpq/fe-print.c
@@ -37,7 +37,7 @@
#include "libpq-int.h"
-static void do_field(const PQprintOpt *po, const PGresult *res,
+static bool do_field(const PQprintOpt *po, const PGresult *res,
const int i, const int j, const int fs_len,
char **fields,
const int nFields, const char **fieldNames,
@@ -80,12 +80,12 @@ PQprint(FILE *fout, const PGresult *res, const PQprintOpt *po)
unsigned char *fieldNotNum = NULL;
char *border = NULL;
char **fields = NULL;
- const char **fieldNames;
+ const char **fieldNames = NULL;
int fieldMaxLen = 0;
int numFieldName;
int fs_len = strlen(po->fieldSep);
int total_line_length = 0;
- int usePipe = 0;
+ bool usePipe = false;
char *pagerenv;
#if defined(ENABLE_THREAD_SAFETY) && !defined(WIN32)
@@ -111,17 +111,17 @@ PQprint(FILE *fout, const PGresult *res, const PQprintOpt *po)
if (!(fieldNames = (const char **) calloc(nFields, sizeof(char *))))
{
fprintf(stderr, libpq_gettext("out of memory\n"));
- abort();
+ goto exit;
}
if (!(fieldNotNum = (unsigned char *) calloc(nFields, 1)))
{
fprintf(stderr, libpq_gettext("out of memory\n"));
- abort();
+ goto exit;
}
if (!(fieldMax = (int *) calloc(nFields, sizeof(int))))
{
fprintf(stderr, libpq_gettext("out of memory\n"));
- abort();
+ goto exit;
}
for (numFieldName = 0;
po->fieldName && po->fieldName[numFieldName];
@@ -190,7 +190,7 @@ PQprint(FILE *fout, const PGresult *res, const PQprintOpt *po)
fout = popen(pagerenv, "w");
if (fout)
{
- usePipe = 1;
+ usePipe = true;
#ifndef WIN32
#ifdef ENABLE_THREAD_SAFETY
if (pq_block_sigpipe(&osigset, &sigpipe_pending) == 0)
@@ -207,10 +207,11 @@ PQprint(FILE *fout, const PGresult *res, const PQprintOpt *po)
if (!po->expanded && (po->align || po->html3))
{
- if (!(fields = (char **) calloc(nFields * (nTups + 1), sizeof(char *))))
+ if (!(fields = (char **) calloc((size_t) nTups + 1,
+ nFields * sizeof(char *))))
{
fprintf(stderr, libpq_gettext("out of memory\n"));
- abort();
+ goto exit;
}
}
else if (po->header && !po->html3)
@@ -264,9 +265,12 @@ PQprint(FILE *fout, const PGresult *res, const PQprintOpt *po)
fprintf(fout, libpq_gettext("-- RECORD %d --\n"), i);
}
for (j = 0; j < nFields; j++)
- do_field(po, res, i, j, fs_len, fields, nFields,
- fieldNames, fieldNotNum,
- fieldMax, fieldMaxLen, fout);
+ {
+ if (!do_field(po, res, i, j, fs_len, fields, nFields,
+ fieldNames, fieldNotNum,
+ fieldMax, fieldMaxLen, fout))
+ goto exit;
+ }
if (po->html3 && po->expanded)
fputs("</table>\n", fout);
}
@@ -297,18 +301,34 @@ PQprint(FILE *fout, const PGresult *res, const PQprintOpt *po)
for (i = 0; i < nTups; i++)
output_row(fout, po, nFields, fields,
fieldNotNum, fieldMax, border, i);
- free(fields);
- if (border)
- free(border);
}
if (po->header && !po->html3)
fprintf(fout, "(%d row%s)\n\n", PQntuples(res),
(PQntuples(res) == 1) ? "" : "s");
if (po->html3 && !po->expanded)
fputs("</table>\n", fout);
- free(fieldMax);
- free(fieldNotNum);
- free((void *) fieldNames);
+
+exit:
+ if (fieldMax)
+ free(fieldMax);
+ if (fieldNotNum)
+ free(fieldNotNum);
+ if (border)
+ free(border);
+ if (fields)
+ {
+ /* if calloc succeeded, this shouldn't overflow size_t */
+ size_t numfields = ((size_t) nTups + 1) * (size_t) nFields;
+
+ while (numfields-- > 0)
+ {
+ if (fields[numfields])
+ free(fields[numfields]);
+ }
+ free(fields);
+ }
+ if (fieldNames)
+ free((void *) fieldNames);
if (usePipe)
{
#ifdef WIN32
@@ -329,7 +349,7 @@ PQprint(FILE *fout, const PGresult *res, const PQprintOpt *po)
}
-static void
+static bool
do_field(const PQprintOpt *po, const PGresult *res,
const int i, const int j, const int fs_len,
char **fields,
@@ -397,7 +417,7 @@ do_field(const PQprintOpt *po, const PGresult *res,
if (!(fields[i * nFields + j] = (char *) malloc(plen + 1)))
{
fprintf(stderr, libpq_gettext("out of memory\n"));
- abort();
+ return false;
}
strcpy(fields[i * nFields + j], pval);
}
@@ -440,6 +460,7 @@ do_field(const PQprintOpt *po, const PGresult *res,
}
}
}
+ return true;
}
@@ -467,7 +488,7 @@ do_header(FILE *fout, const PQprintOpt *po, const int nFields, int *fieldMax,
if (!border)
{
fprintf(stderr, libpq_gettext("out of memory\n"));
- abort();
+ return NULL;
}
p = border;
if (po->standard)
@@ -558,8 +579,6 @@ output_row(FILE *fout, const PQprintOpt *po, const int nFields, char **fields,
if (po->standard || field_index + 1 < nFields)
fputs(po->fieldSep, fout);
}
- if (p)
- free(p);
}
if (po->html3)
fputs("</tr>", fout);
@@ -609,7 +628,7 @@ PQdisplayTuples(const PGresult *res,
if (!fLength)
{
fprintf(stderr, libpq_gettext("out of memory\n"));
- abort();
+ return;
}
for (j = 0; j < nFields; j++)
@@ -707,7 +726,7 @@ PQprintTuples(const PGresult *res,
if (!tborder)
{
fprintf(stderr, libpq_gettext("out of memory\n"));
- abort();
+ return;
}
for (i = 0; i < width; i++)
tborder[i] = '-';
I wrote:
Not real sure what to do about PGTHREAD_ERROR.
I wonder if we shouldn't simply nuke that macro and change the
call sites into "Assert(false)". The only call sites are in
default_threadlock() (used in fe_auth.c) and pq_lockingcallback()
(for OpenSSL). I suggest that
1. "fprintf(stderr)" in these locking functions doesn't seem
remarkably well-advised either. Especially not on Windows;
but in general, we don't expect libpq to emit stuff on stderr
except under *very* limited circumstances.
2. In an assert-enabled build, Assert() ought to be about equivalent
to abort().
3. In a production build, if one of these mutex calls fails, ignoring
the failure might be the best thing to do anyway. Certainly, dumping
core is the worst possible outcome, while not doing anything would
have no impact except in the rather-unlikely case that multiple libpq
connections try to use this code concurrently.
It's certainly possible to quibble about point 3, but unless you
have a better alternative to offer, I don't think you have a lot
of room to complain.
regards, tom lane
On Sat, Jun 26, 2021 at 01:43:50PM -0400, Tom Lane wrote:
BTW, so far as the original topic of this thread is concerned,
it sounds like Jacob's ultimate goal is to put some functionality
into libpq that requires JSON parsing. I'm going to say up front
that that sounds like a terrible idea. As we've just seen, libpq
operates under very tight constraints, not all of which are
mechanically enforced. I am really doubtful that anything that
would require a JSON parser has any business being in libpq.
Unless you can sell us on that point, I do not think it's worth
complicating the src/common JSON code to the point where it can
work under libpq's constraints.
AFAIK, the SASL mechanism OAUTHBEARER described in RFC 7628 would
require such facilities as failures are reported in this format:
https://datatracker.ietf.org/doc/html/rfc7628
Perhaps you are right and we have no need to do any json parsing in
libpq as long as we pass down the JSON blob, but I am not completely
sure if we can avoid that either.
Separate topic: I find disturbing the dependency of jsonapi.c to
the logging parts just to cope with dummy error values that are
originally part of JsonParseErrorType.
--
Michael
I wrote:
Not real sure what to do about PGTHREAD_ERROR.
I wonder if we shouldn't simply nuke that macro and change the
call sites into "Assert(false)".
After further study this still seems like the best available choice.
We do not have the option to make either default_threadlock() or
pq_lockingcallback() do something saner, like return a failure
indication. pq_lockingcallback()'s API is dictated by OpenSSL,
while default_threadlock()'s API is exposed to users by libpq
(IOW, we could have gotten that one right years ago, but we
failed to, and it seems much too late to change it now).
Also, I trawled the mailing list archives, and I can find no
indication that any of the PGTHREAD_ERROR messages have ever
been seen in the field. The last relevant discussion seems
to be in
/messages/by-id/20130801142443.GO2706@tamriel.snowman.net
where it was observed that this code isn't very well thought
through :-(
My proposal is to replace PGTHREAD_ERROR by Assert(false)
in HEAD, but leave things alone in the back branches.
As far as the other patch to check for mistakes with "nm"
goes, we could either do nothing in the back branches, or
install a check for "exit" only, not "abort". But there's
probably no real need for such a check in the back branches
as long as we're enforcing it in HEAD.
regards, tom lane
On 28 Jun 2021, at 21:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
Not real sure what to do about PGTHREAD_ERROR.
I wonder if we shouldn't simply nuke that macro and change the
call sites into "Assert(false)".After further study this still seems like the best available choice.
While this solution has a potential downside as you mention upthread, I can't
see any better alternative, and this is clearly better than what we have now.
My proposal is to replace PGTHREAD_ERROR by Assert(false)
in HEAD, but leave things alone in the back branches.
+1
As far as the other patch to check for mistakes with "nm"
goes, we could either do nothing in the back branches, or
install a check for "exit" only, not "abort". But there's
probably no real need for such a check in the back branches
as long as we're enforcing it in HEAD.
I don't see any real reason to backport the check, but enforce it in HEAD going
forward. The risk of introducing an exit in backbranches when enforced against
in HEAD seem pretty manageable.
--
Daniel Gustafsson https://vmware.com/
On Sun, 2021-06-27 at 10:43 +0900, Michael Paquier wrote:
On Sat, Jun 26, 2021 at 01:43:50PM -0400, Tom Lane wrote:
BTW, so far as the original topic of this thread is concerned,
it sounds like Jacob's ultimate goal is to put some functionality
into libpq that requires JSON parsing. I'm going to say up front
that that sounds like a terrible idea. As we've just seen, libpq
operates under very tight constraints, not all of which are
mechanically enforced. I am really doubtful that anything that
would require a JSON parser has any business being in libpq.
Unless you can sell us on that point, I do not think it's worth
complicating the src/common JSON code to the point where it can
work under libpq's constraints.AFAIK, the SASL mechanism OAUTHBEARER described in RFC 7628 would
require such facilities as failures are reported in this format:
https://datatracker.ietf.org/doc/html/rfc7628
Right. So it really comes down to whether or not OAUTHBEARER support is
worth this additional complication, and that probably belongs to the
main thread on the topic.
But hey, we're getting some code cleanup out of the deal either way.
Perhaps you are right and we have no need to do any json parsing in
libpq as long as we pass down the JSON blob, but I am not completely
sure if we can avoid that either.
It is definitely an option.
With the current architecture of the proof-of-concept, I feel like
forcing every client to implement JSON parsing just to be able to use
OAUTHBEARER would be a significant barrier to entry. Again, that's
probably conversation for the main thread.
Separate topic: I find disturbing the dependency of jsonapi.c to
the logging parts just to cope with dummy error values that are
originally part of JsonParseErrorType.
I think we should clean this up regardless.
--Jacob
On Sat, 2021-06-26 at 09:36 +0900, Michael Paquier wrote:
On Fri, Jun 25, 2021 at 08:58:46PM +0000, Jacob Champion wrote:
On Thu, 2021-06-24 at 14:56 +0900, Michael Paquier wrote:
Looking more closely at that, I actually find a bit crazy the
requirement for any logging within jsonapi.c just to cope with the
fact that json_errdetail() and report_parse_error() just want to track
down if the caller is giving some garbage or not, which should never
be the case, really. So I would be tempted to eliminate this
dependency to begin with.I think that's a good plan.
We could do this cleanup first, as an independent patch. That's
simple enough. I am wondering if we'd better do this bit in 14
actually, so as the divergence between 15~ and 14 is lightly
minimized.
Up to you in the end; I don't have a good intuition for whether the
code motion would be worth it for 14, if it's not actively used.
The second thing is how we should try to handle the way the error
message gets allocated in json_errdetail(). libpq cannot rely on
psprintf(),That surprised me. So there's currently no compiler-enforced
prohibition, just a policy? It looks like the bar was lowered a little
bit in commit c0cb87fbb6, as libpq currently has a symbol dependency on
pg_get_line_buf() and pfree() on my machine.
This seems to have spawned an entirely new thread over the weekend,
which I will watch with interest. :)
If our libpq-specific implementation is going to end up returning NULL
on bad allocation anyway, could we just modify the behavior of the
existing front-end palloc implementation to not exit() from inside
libpq? That would save a lot of one-off code for future implementers.Yeah, a side effect of that is to enforce a new rule for any frontend
code that calls palloc(), and these could be easily exposed to crashes
within knowing about it until their system is under resource
pressure. Silent breakages with very old guaranteed behaviors is
bad.
Fair point.
What would you think about a src/port of asprintf()? Maybe libpq
doesn't change quickly enough to worry about it, but having developers
revisit stack allocation for strings every time they target the libpq
parts of the code seems like a recipe for security problems.
--Jacob
Jacob Champion <pchampion@vmware.com> writes:
What would you think about a src/port of asprintf()? Maybe libpq
doesn't change quickly enough to worry about it, but having developers
revisit stack allocation for strings every time they target the libpq
parts of the code seems like a recipe for security problems.
The existing convention is to use pqexpbuffer.c, which seems strictly
cleaner and more robust than asprintf. In particular its behavior under
OOM conditions is far easier/safer to work with. Maybe we should consider
moving that into src/common/ so that it can be used by code that's not
tightly bound into libpq?
regards, tom lane
On Tue, 2021-06-29 at 14:50 -0400, Tom Lane wrote:
Jacob Champion <pchampion@vmware.com> writes:
What would you think about a src/port of asprintf()? Maybe libpq
doesn't change quickly enough to worry about it, but having developers
revisit stack allocation for strings every time they target the libpq
parts of the code seems like a recipe for security problems.The existing convention is to use pqexpbuffer.c, which seems strictly
cleaner and more robust than asprintf. In particular its behavior under
OOM conditions is far easier/safer to work with. Maybe we should consider
moving that into src/common/ so that it can be used by code that's not
tightly bound into libpq?
I will take a look. Were you thinking we'd (hypothetically) migrate all
string allocation code under src/common to pqexpbuffer as part of that
move? Or just have it there to use as needed, when nm complains?
--Jacob
Jacob Champion <pchampion@vmware.com> writes:
On Tue, 2021-06-29 at 14:50 -0400, Tom Lane wrote:
The existing convention is to use pqexpbuffer.c, which seems strictly
cleaner and more robust than asprintf. In particular its behavior under
OOM conditions is far easier/safer to work with. Maybe we should consider
moving that into src/common/ so that it can be used by code that's not
tightly bound into libpq?
I will take a look. Were you thinking we'd (hypothetically) migrate all
string allocation code under src/common to pqexpbuffer as part of that
move? Or just have it there to use as needed, when nm complains?
Actually, I'd forgotten that the PQExpBuffer functions are already
exported by libpq, and much of our frontend code already uses them
from there. So we don't really need to move anything unless there's
a call to use this code in clients that don't use libpq, which are
a pretty small set.
Also, having them be available both from libpq.so and from libpgcommon.a
would be a tad problematic I think; it'd be hard to tell which way the
linker would choose to resolve that.
regards, tom lane
On 26.06.21 19:43, Tom Lane wrote:
I spent some time looking for other undesirable symbol dependencies
in libpq, and soon found a couple. PGTHREAD_ERROR potentially calls
abort(), which seems even worse than exit-on-OOM, although I don't
think we've ever heard a report of that being hit. Also,
fe-print.c's handling of OOM isn't nice at all:fprintf(stderr, libpq_gettext("out of memory\n"));
abort();Although fe-print.c is semi-deprecated, it still seems like it'd
be a good idea to clean that up.
These abort() calls were put there on purpose by:
commit c6ea8ccea6bf23501962ddc7ac9ffdb99c8643e1
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Mon Jan 30 20:34:00 2012
Use abort() instead of exit() to abort library functions
In some hopeless situations, certain library functions in libpq and
libpgport quit the program. Use abort() for that instead of exit(),
so we don't interfere with the normal exit codes the program might
use, we clearly signal the abnormal termination, and the caller has a
chance of catching the termination.
This was originally pointed out by Debian's Lintian program.
I don't object to refining this, but I think it is a mischaracterization
to calls this kind of code wrong. And I'm dubious about the backpatching.
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
On 26.06.21 19:43, Tom Lane wrote:
fe-print.c's handling of OOM isn't nice at all:
fprintf(stderr, libpq_gettext("out of memory\n"));
abort();
Although fe-print.c is semi-deprecated, it still seems like it'd
be a good idea to clean that up.
These abort() calls were put there on purpose by:
commit c6ea8ccea6bf23501962ddc7ac9ffdb99c8643e1
Use abort() instead of exit() to abort library functions
Well, the exit() calls that that replaced were quite inappropriate
too IMO. I don't think it boots much to argue about which way was
less bad; libpq has no business doing either thing.
What might be more useful is to consider whether there's a way
to retrofit an error-reporting convention onto these functions.
I thought about that for a bit, but concluded that the possible
interactions with stdio's error handling made that fairly tricky,
and it didn't seem worth messing with for such backwater code.
(Too bad POSIX didn't see fit to provide seterr(FILE*), or maybe
we could have reported OOM in fe-print that way.)
regards, tom lane
On Tue, Jun 29, 2021 at 03:34:29PM -0400, Tom Lane wrote:
Actually, I'd forgotten that the PQExpBuffer functions are already
exported by libpq, and much of our frontend code already uses them
from there. So we don't really need to move anything unless there's
a call to use this code in clients that don't use libpq, which are
a pretty small set.Also, having them be available both from libpq.so and from libpgcommon.a
would be a tad problematic I think; it'd be hard to tell which way the
linker would choose to resolve that.
Coming back to this thread. You were thinking about switching to
PQExpBuffer for the error strings generated depending on error code
for the frontend, right? Using a PQExpBuffer would be a problem if
attempting to get a more detailed error for pg_verifybackup, though I
guess that we can continue to live in this tool without this much
amount of details in the error strings.
It seems to me that this does not address yet the problems with the
palloc/pstrdup in jsonapi.c though, which would need to rely on
malloc() if we finish to use this code in libpq. I am not sure yet
that we have any need to do that yet as we may finish by not using
OAUTH as SASL mechanism at the end in core. So perhaps it would be
better to just give up on this thread for now?
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
It seems to me that this does not address yet the problems with the
palloc/pstrdup in jsonapi.c though, which would need to rely on
malloc() if we finish to use this code in libpq. I am not sure yet
that we have any need to do that yet as we may finish by not using
OAUTH as SASL mechanism at the end in core. So perhaps it would be
better to just give up on this thread for now?
Yeah, I think there's nothing to do here unless we decide that we
have to have JSON-parsing ability inside libpq ... which is a
situation I think we should try hard to avoid.
regards, tom lane
On Wed, 2021-07-07 at 01:42 -0400, Tom Lane wrote:
Michael Paquier <michael@paquier.xyz> writes:
It seems to me that this does not address yet the problems with the
palloc/pstrdup in jsonapi.c though, which would need to rely on
malloc() if we finish to use this code in libpq. I am not sure yet
that we have any need to do that yet as we may finish by not using
OAUTH as SASL mechanism at the end in core. So perhaps it would be
better to just give up on this thread for now?Yeah, I think there's nothing to do here unless we decide that we
have to have JSON-parsing ability inside libpq ... which is a
situation I think we should try hard to avoid.
I'm working on a corrected version of the allocation for the OAuth
proof of concept, so we can see what it might look like there. I will
withdraw this one from the commitfest. Thanks for all the feedback!
--Jacob