Additional comments around need_escapes in pg_parse_json()
I recently had occasion to use the pg_parse_json() function for creating
input functions for pg_ndistinct and pg_dependencies.
While it is obvious now that I should have been parsing with
need_escapes=true, it wasn't obvious from the outset, and so I'm proposing
adding a few comments to help the next person come to that conclusion
sooner than I did.
Attachments:
v1-0001-Explain-impact-of-need_escapes-in-JSON-parsing.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Explain-impact-of-need_escapes-in-JSON-parsing.patchDownload
From 84a8990622a4b83729600d9db43e8678502548f5 Mon Sep 17 00:00:00 2001
From: Corey Huinker <corey.huinker@gmail.com>
Date: Fri, 20 Dec 2024 05:08:50 -0500
Subject: [PATCH v1] Explain impact of need_escapes in JSON parsing.
Adding some additional comments to help the reader discover the
purpose and effect of the need_escapes parameter.
Nearly every new use of pg_parse_json and pg_parse_json_incremental will
need need_escapes=true.
---
src/include/common/jsonapi.h | 11 +++++++++++
src/common/jsonapi.c | 7 ++++++-
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/src/include/common/jsonapi.h b/src/include/common/jsonapi.h
index 167615a557..fa19005822 100644
--- a/src/include/common/jsonapi.h
+++ b/src/include/common/jsonapi.h
@@ -118,6 +118,11 @@ typedef struct JsonLexContext
struct jsonapi_StrValType *errormsg;
} JsonLexContext;
+/*
+ * Function types for custom json parsing actions.
+ *
+ * fname and token will always be NULL if the context has need_escapes=false.
+ */
typedef JsonParseErrorType (*json_struct_action) (void *state);
typedef JsonParseErrorType (*json_ofield_action) (void *state, char *fname, bool isnull);
typedef JsonParseErrorType (*json_aelem_action) (void *state, bool isnull);
@@ -197,6 +202,12 @@ extern JsonParseErrorType json_count_array_elements(JsonLexContext *lex,
* struct is allocated.
*
* If need_escapes is true, ->strval stores the unescaped lexemes.
+ *
+ * Setting need_escapes to true is necessary if the operation needs
+ * to reference field names or scalar values. This is true of any
+ * operation beyond purely checking the json-validaity of the source
+ * document.
+ *
* Unescaping is expensive, so only request it when necessary.
*
* If need_escapes is true or lex was given as NULL, then the caller is
diff --git a/src/common/jsonapi.c b/src/common/jsonapi.c
index 0e2a82ad7a..307182e478 100644
--- a/src/common/jsonapi.c
+++ b/src/common/jsonapi.c
@@ -1297,7 +1297,10 @@ parse_scalar(JsonLexContext *lex, const JsonSemAction *sem)
return result;
}
- /* invoke the callback, which may take ownership of val */
+ /*
+ * invoke the callback, which may take ownership of val.
+ * val always NULL when need_escapes=false
+ */
result = (*sfunc) (sem->semstate, val, tok);
if (lex->flags & JSONLEX_CTX_OWNS_TOKENS)
@@ -1349,6 +1352,7 @@ parse_object_field(JsonLexContext *lex, const JsonSemAction *sem)
if (ostart != NULL)
{
+ /* fname always NULL when need_escapes=false */
result = (*ostart) (sem->semstate, fname, isnull);
if (result != JSON_SUCCESS)
goto ofield_cleanup;
@@ -1370,6 +1374,7 @@ parse_object_field(JsonLexContext *lex, const JsonSemAction *sem)
if (oend != NULL)
{
+ /* fname always NULL when need_escapes=false */
result = (*oend) (sem->semstate, fname, isnull);
if (result != JSON_SUCCESS)
goto ofield_cleanup;
--
2.47.1
On 2024-12-20 Fr 5:21 AM, Corey Huinker wrote:
I recently had occasion to use the pg_parse_json() function for
creating input functions for pg_ndistinct and pg_dependencies.While it is obvious now that I should have been parsing with
need_escapes=true, it wasn't obvious from the outset, and so I'm
proposing adding a few comments to help the next person come to that
conclusion sooner than I did.
Comments are not quite accurate, but I'll look at cleaning them up and
adding them.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Comments are not quite accurate, but I'll look at cleaning them up and
adding them.
Much appreciated. Also, I don't think we need all of the comments, but I
figured it was easier for me to comment at every possible good place, and
let you pick your favorites.
On 2024-12-23 Mo 2:15 AM, Corey Huinker wrote:
Comments are not quite accurate, but I'll look at cleaning them up
and
adding them.Much appreciated. Also, I don't think we need all of the comments, but
I figured it was easier for me to comment at every possible good
place, and let you pick your favorites.
done.
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com