[PATCH] WIP: replace method for jsonpath

Started by Florents Tselaiover 1 year ago9 messages
#1Florents Tselai
florents.tselai@gmail.com
1 attachment(s)

Hi,

When working with jsonb/jsonpath, I’ve always wanted more fluent string operations for data cleaning.
Ideally, chaining text processing methods together.

Here’s a draft patch that adds a string replace method.

It works like this
select jsonb_path_query('"hello world"', '$.replace("hello","bye")');
jsonb_path_query
------------------
"bye world"
(1 row)
And looks like plays nicely with existing facilities

select jsonb_path_query('"hello world"', '$.replace("hello","bye") starts with "bye"');
jsonb_path_query
------------------
true
(1 row)
I’d like some initial feedback on whether this is of interested before I proceed with the following:
- I’ve tried respecting the surrounding code, but a more experienced eye can spot some inconsistencies. I’ll revisit those
- Add more test cases (I’ve added the obvious ones, but ideas on more cases are welcome).
- pg upgrades currently fail on CI (see <https://github.com/Florents-Tselai/postgres/pull/3/checks?check_run_id=30154989488&gt;)
- better error handling/reporting: I’ve kept the wording of error messages, but we’ll need something akin to ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION.
- documentation.

Attachments:

v1-0001-jsonpath-replace-method.patchapplication/octet-stream; name=v1-0001-jsonpath-replace-method.patch; x-unix-mode=0644Download
From 6f59b763a37703be52560692c91bb16c9545a4cd Mon Sep 17 00:00:00 2001
From: Florents Tselai <florents.tselai@gmail.com>
Date: Sun, 15 Sep 2024 04:02:19 +0300
Subject: [PATCH v1] jsonpath replace function

---
 src/backend/utils/adt/jsonpath.c             | 21 ++++++
 src/backend/utils/adt/jsonpath_exec.c        | 73 ++++++++++++++++++++
 src/backend/utils/adt/jsonpath_gram.y        | 24 ++++++-
 src/backend/utils/adt/jsonpath_scan.l        |  1 +
 src/include/utils/jsonpath.h                 |  1 +
 src/test/regress/expected/jsonb_jsonpath.out | 39 +++++++++++
 src/test/regress/expected/jsonpath.out       |  6 ++
 src/test/regress/sql/jsonb_jsonpath.sql      | 11 +++
 src/test/regress/sql/jsonpath.sql            |  1 +
 9 files changed, 174 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/jsonpath.c b/src/backend/utils/adt/jsonpath.c
index 0f691bc5f0..3daf2e2d81 100644
--- a/src/backend/utils/adt/jsonpath.c
+++ b/src/backend/utils/adt/jsonpath.c
@@ -298,6 +298,7 @@ flattenJsonPathParseItem(StringInfo buf, int *result, struct Node *escontext,
 		case jpiMod:
 		case jpiStartsWith:
 		case jpiDecimal:
+		case jpiReplaceFunc:
 			{
 				/*
 				 * First, reserve place for left/right arg's positions, then
@@ -831,6 +832,21 @@ printJsonPathItem(StringInfo buf, JsonPathItem *v, bool inKey,
 			}
 			appendStringInfoChar(buf, ')');
 			break;
+		case jpiReplaceFunc:
+			appendStringInfoString(buf, ".replace(");
+			if (v->content.args.left)
+			{
+				jspGetLeftArg(v, &elem);
+				printJsonPathItem(buf, &elem, false, false);
+			}
+			if (v->content.args.right)
+			{
+				appendStringInfoChar(buf, ',');
+				jspGetRightArg(v, &elem);
+				printJsonPathItem(buf, &elem, false, false);
+			}
+			appendStringInfoChar(buf, ')');
+			break;
 		default:
 			elog(ERROR, "unrecognized jsonpath item type: %d", v->type);
 	}
@@ -906,6 +922,8 @@ jspOperationName(JsonPathItemType type)
 			return "number";
 		case jpiStringFunc:
 			return "string";
+		case jpiReplaceFunc:
+			return "replace";
 		case jpiTime:
 			return "time";
 		case jpiTimeTz:
@@ -1041,6 +1059,7 @@ jspInitByBuffer(JsonPathItem *v, char *base, int32 pos)
 		case jpiMod:
 		case jpiStartsWith:
 		case jpiDecimal:
+		case jpiReplaceFunc:
 			read_int32(v->content.args.left, base, pos);
 			read_int32(v->content.args.right, base, pos);
 			break;
@@ -1149,6 +1168,7 @@ jspGetNext(JsonPathItem *v, JsonPathItem *a)
 			   v->type == jpiInteger ||
 			   v->type == jpiNumber ||
 			   v->type == jpiStringFunc ||
+			   v->type == jpiReplaceFunc ||
 			   v->type == jpiTime ||
 			   v->type == jpiTimeTz ||
 			   v->type == jpiTimestamp ||
@@ -1501,6 +1521,7 @@ jspIsMutableWalker(JsonPathItem *jpi, struct JsonPathMutableContext *cxt)
 			case jpiInteger:
 			case jpiNumber:
 			case jpiStringFunc:
+			case jpiReplaceFunc:
 				status = jpdsNonDateTime;
 				break;
 
diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index 1184cba983..3d23e44d34 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -1659,6 +1659,79 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
 				res = executeNextItem(cxt, jsp, NULL, jb, found, true);
 			}
 			break;
+		case jpiReplaceFunc:
+			{
+				JsonbValue	jbv;
+				Datum		replace_res;
+				char		*tmp = NULL;
+
+				/*
+				 * Value is not necessarily null-terminated, so we do
+				 * pnstrdup() here.
+				 */
+				tmp = pnstrdup(jb->val.string.val,
+							   jb->val.string.len);
+
+				if (unwrap && JsonbType(jb) == jbvArray)
+					return executeItemUnwrapTargetArray(cxt, jsp, jb, found,
+														false);
+
+				/* TODO: probably need ERRCODE for that? */
+				if (!(jb = getScalar(jb, jbvString)))
+				RETURN_ERROR(ereport(ERROR,
+									 (errcode(ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION),
+									  errmsg("jsonpath item method .%s() can only be applied to a string",
+											 jspOperationName(jsp->type)))));
+
+				if (jsp->content.args.left)
+				{
+					text		*from, *to;
+					char		*from_str, *to_str;
+					int			from_len, to_len;
+					ErrorSaveContext escontext = {T_ErrorSaveContext};
+
+					jspGetLeftArg(jsp, &elem);
+					if (elem.type != jpiString)
+						elog(ERROR, "invalid jsonpath item type for .replace() from");
+
+					from_str = jspGetString(&elem, &from_len);
+					from = cstring_to_text_with_len(from_str, from_len);
+
+					if (jsp->content.args.right)
+					{
+						jspGetRightArg(jsp, &elem);
+						if (elem.type != jpiString)
+							elog(ERROR, "invalid jsonpath item type for .replace() to");
+
+						to_str = jspGetString(&elem, &to_len);
+						to = cstring_to_text_with_len(to_str, to_len);
+
+						replace_res = DirectFunctionCall3Coll(replace_text,
+							C_COLLATION_OID,
+							CStringGetTextDatum(tmp),
+							PointerGetDatum(from),
+							PointerGetDatum(to));
+
+					}
+
+					res = jperOk;
+				}
+
+				if (res == jperNotFound)
+					RETURN_ERROR(ereport(ERROR,
+										 (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
+										  errmsg("jsonpath item method .%s() accepts two string arguments",
+												 jspOperationName(jsp->type)))));
+
+				jb = &jbv;
+				jb->type = jbvString;
+				jb->val.string.val = VARDATA_ANY(replace_res);
+				jb->val.string.len = VARSIZE_ANY_EXHDR(replace_res);
+
+				res = executeNextItem(cxt, jsp, NULL, jb, found, true);
+			}
+			break;
+
 
 		default:
 			elog(ERROR, "unrecognized jsonpath item type: %d", jsp->type);
diff --git a/src/backend/utils/adt/jsonpath_gram.y b/src/backend/utils/adt/jsonpath_gram.y
index 8733a0eac6..c33b73e98b 100644
--- a/src/backend/utils/adt/jsonpath_gram.y
+++ b/src/backend/utils/adt/jsonpath_gram.y
@@ -83,7 +83,7 @@ static bool makeItemLikeRegex(JsonPathParseItem *expr,
 %token	<str>		ABS_P SIZE_P TYPE_P FLOOR_P DOUBLE_P CEILING_P KEYVALUE_P
 %token	<str>		DATETIME_P
 %token	<str>		BIGINT_P BOOLEAN_P DATE_P DECIMAL_P INTEGER_P NUMBER_P
-%token	<str>		STRINGFUNC_P TIME_P TIME_TZ_P TIMESTAMP_P TIMESTAMP_TZ_P
+%token	<str>		STRINGFUNC_P TIME_P TIME_TZ_P TIMESTAMP_P TIMESTAMP_TZ_P REPLACEFUNC_P
 
 %type	<result>	result
 
@@ -91,9 +91,9 @@ static bool makeItemLikeRegex(JsonPathParseItem *expr,
 					any_path accessor_op key predicate delimited_predicate
 					index_elem starts_with_initial expr_or_predicate
 					datetime_template opt_datetime_template csv_elem
-					datetime_precision opt_datetime_precision
+					datetime_precision opt_datetime_precision replace_arg_elem
 
-%type	<elems>		accessor_expr csv_list opt_csv_list
+%type	<elems>		accessor_expr csv_list opt_csv_list replace_arg_list
 
 %type	<indexs>	index_list
 
@@ -266,6 +266,16 @@ accessor_op:
 						 errmsg("invalid input syntax for type %s", "jsonpath"),
 						 errdetail(".decimal() can only have an optional precision[,scale].")));
 		}
+	| '.' REPLACEFUNC_P '(' replace_arg_list ')'
+		{
+			if (list_length($4) == 2)
+				$$ = makeItemBinary(jpiReplaceFunc, linitial($4), lsecond($4));
+			else
+				ereturn(escontext, false,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("invalid input syntax for type %s", "jsonpath"),
+						 errdetail(".replace() accepts two arguments.")));
+		}
 	| '.' DATETIME_P '(' opt_datetime_template ')'
 		{ $$ = makeItemUnary(jpiDatetime, $4); }
 	| '.' TIME_P '(' opt_datetime_precision ')'
@@ -315,6 +325,14 @@ opt_datetime_template:
 	| /* EMPTY */					{ $$ = NULL; }
 	;
 
+replace_arg_elem:
+	STRING_P						{ $$ = makeItemString(&$1); }
+	;
+
+replace_arg_list:
+	replace_arg_elem							{ $$ = list_make1($1); }
+	| replace_arg_list ',' replace_arg_elem	{ $$ = lappend($1, $3); }
+	;
 key:
 	key_name						{ $$ = makeItemKey(&$1); }
 	;
diff --git a/src/backend/utils/adt/jsonpath_scan.l b/src/backend/utils/adt/jsonpath_scan.l
index 7acda77837..890522b323 100644
--- a/src/backend/utils/adt/jsonpath_scan.l
+++ b/src/backend/utils/adt/jsonpath_scan.l
@@ -423,6 +423,7 @@ static const JsonPathKeyword keywords[] = {
 	{ 7, false,	CEILING_P,	"ceiling"},
 	{ 7, false,	DECIMAL_P,	"decimal"},
 	{ 7, false,	INTEGER_P,	"integer"},
+	{ 7, false, REPLACEFUNC_P, "replace"},
 	{ 7, false,	TIME_TZ_P,	"time_tz"},
 	{ 7, false,	UNKNOWN_P,	"unknown"},
 	{ 8, false,	DATETIME_P,	"datetime"},
diff --git a/src/include/utils/jsonpath.h b/src/include/utils/jsonpath.h
index ee35698d08..cbb35f02d3 100644
--- a/src/include/utils/jsonpath.h
+++ b/src/include/utils/jsonpath.h
@@ -115,6 +115,7 @@ typedef enum JsonPathItemType
 	jpiTimeTz,					/* .time_tz() item method */
 	jpiTimestamp,				/* .timestamp() item method */
 	jpiTimestampTz,				/* .timestamp_tz() item method */
+	jpiReplaceFunc,				/* .replace() item method */
 } JsonPathItemType;
 
 /* XQuery regex mode flags for LIKE_REGEX predicate */
diff --git a/src/test/regress/expected/jsonb_jsonpath.out b/src/test/regress/expected/jsonb_jsonpath.out
index acdf7e436f..2a1b2afff0 100644
--- a/src/test/regress/expected/jsonb_jsonpath.out
+++ b/src/test/regress/expected/jsonb_jsonpath.out
@@ -2719,6 +2719,45 @@ select jsonb_path_query('"2023-08-15 12:34:56"', '$.timestamp().string()');
 (1 row)
 
 rollback;
+-- Test .replace()
+select jsonb_path_query('null', '$.replace("x", "bye")');
+ERROR:  jsonpath item method .replace() can only be applied to a string
+select jsonb_path_query('null', '$.replace("x", "bye")', silent => true);
+ jsonb_path_query 
+------------------
+(0 rows)
+
+select jsonb_path_query('[]', '$.replace("x", "bye")');
+ jsonb_path_query 
+------------------
+(0 rows)
+
+select jsonb_path_query('{}', '$.replace("x", "bye")');
+ERROR:  jsonpath item method .replace() can only be applied to a string
+select jsonb_path_query('[]', 'strict $.replace("x", "bye")', silent => true);
+ jsonb_path_query 
+------------------
+(0 rows)
+
+select jsonb_path_query('{}', '$.replace("x", "bye")', silent => true);
+ jsonb_path_query 
+------------------
+(0 rows)
+
+select jsonb_path_query('1.23', '$.replace("x", "bye")');
+ERROR:  jsonpath item method .replace() can only be applied to a string
+select jsonb_path_query('"hello world"', '$.replace("hello","bye")');
+ jsonb_path_query 
+------------------
+ "bye world"
+(1 row)
+
+select jsonb_path_query('"hello world"', '$.replace("hello","bye") starts with "bye"');
+ jsonb_path_query 
+------------------
+ true
+(1 row)
+
 -- Test .time()
 select jsonb_path_query('null', '$.time()');
 ERROR:  jsonpath item method .time() can only be applied to a string
diff --git a/src/test/regress/expected/jsonpath.out b/src/test/regress/expected/jsonpath.out
index fd9bd755f5..871b5e76c7 100644
--- a/src/test/regress/expected/jsonpath.out
+++ b/src/test/regress/expected/jsonpath.out
@@ -435,6 +435,12 @@ select '$.string()'::jsonpath;
  $.string()
 (1 row)
 
+select '$.replace("hello","bye")'::jsonpath;
+         jsonpath         
+--------------------------
+ $.replace("hello","bye")
+(1 row)
+
 select '$.time()'::jsonpath;
  jsonpath 
 ----------
diff --git a/src/test/regress/sql/jsonb_jsonpath.sql b/src/test/regress/sql/jsonb_jsonpath.sql
index da3f7969ca..c1b8491387 100644
--- a/src/test/regress/sql/jsonb_jsonpath.sql
+++ b/src/test/regress/sql/jsonb_jsonpath.sql
@@ -619,6 +619,17 @@ select jsonb_path_query('"2023-08-15 12:34:56 +5:30"', '$.timestamp_tz().string(
 select jsonb_path_query('"2023-08-15 12:34:56"', '$.timestamp().string()');
 rollback;
 
+-- Test .replace()
+select jsonb_path_query('null', '$.replace("x", "bye")');
+select jsonb_path_query('null', '$.replace("x", "bye")', silent => true);
+select jsonb_path_query('[]', '$.replace("x", "bye")');
+select jsonb_path_query('{}', '$.replace("x", "bye")');
+select jsonb_path_query('[]', 'strict $.replace("x", "bye")', silent => true);
+select jsonb_path_query('{}', '$.replace("x", "bye")', silent => true);
+select jsonb_path_query('1.23', '$.replace("x", "bye")');
+select jsonb_path_query('"hello world"', '$.replace("hello","bye")');
+select jsonb_path_query('"hello world"', '$.replace("hello","bye") starts with "bye"');
+
 -- Test .time()
 select jsonb_path_query('null', '$.time()');
 select jsonb_path_query('true', '$.time()');
diff --git a/src/test/regress/sql/jsonpath.sql b/src/test/regress/sql/jsonpath.sql
index 61a5270d4e..81704303bd 100644
--- a/src/test/regress/sql/jsonpath.sql
+++ b/src/test/regress/sql/jsonpath.sql
@@ -78,6 +78,7 @@ select '$.boolean()'::jsonpath;
 select '$.date()'::jsonpath;
 select '$.decimal(4,2)'::jsonpath;
 select '$.string()'::jsonpath;
+select '$.replace("hello","bye")'::jsonpath;
 select '$.time()'::jsonpath;
 select '$.time(6)'::jsonpath;
 select '$.time_tz()'::jsonpath;
-- 
2.39.3 (Apple Git-146)

#2Florents Tselai
florents.tselai@gmail.com
In reply to: Florents Tselai (#1)
3 attachment(s)
Re: [PATCH] WIP: replace method for jsonpath

Here’s an updated version of this patch.
The previous version failed several CI steps; this passes them all.

Unless someone disagrees,
I’ll proceed with the documentation and add this for the next CF.

As a future note:
It’s worth noting that both this newly added jspItem and other ones like (jpiDecimal, jpiString)
use jspGetRightArg and jspGetLeftArg.
left and right can be confusing if more complex methods are added in the future.
i.e. jsonpath methods with nargs>=3 .
I was wondering if we’d like something like JSP_GETARG(n)

GitHub PR in case you prefer it’s UI https://github.com/Florents-Tselai/postgres/pull/3



Show quoted text

On 15 Sep 2024, at 4:15 AM, Florents Tselai <florents.tselai@gmail.com> wrote:

Hi,

When working with jsonb/jsonpath, I’ve always wanted more fluent string operations for data cleaning.
Ideally, chaining text processing methods together.

Here’s a draft patch that adds a string replace method.

It works like this
select jsonb_path_query('"hello world"', '$.replace("hello","bye")');
jsonb_path_query
------------------
"bye world"
(1 row)
And looks like plays nicely with existing facilities

select jsonb_path_query('"hello world"', '$.replace("hello","bye") starts with "bye"');
jsonb_path_query
------------------
true
(1 row)
I’d like some initial feedback on whether this is of interested before I proceed with the following:
- I’ve tried respecting the surrounding code, but a more experienced eye can spot some inconsistencies. I’ll revisit those
- Add more test cases (I’ve added the obvious ones, but ideas on more cases are welcome).
- pg upgrades currently fail on CI (see <https://github.com/Florents-Tselai/postgres/pull/3/checks?check_run_id=30154989488&gt;)
- better error handling/reporting: I’ve kept the wording of error messages, but we’ll need something akin to ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION.
- documentation.

<v1-0001-jsonpath-replace-method.patch>

Attachments:

v2-0001-Add-first-working-version-for-jsonpath-.replace-s.patchapplication/octet-stream; name=v2-0001-Add-first-working-version-for-jsonpath-.replace-s.patch; x-unix-mode=0644Download
From 6f59b763a37703be52560692c91bb16c9545a4cd Mon Sep 17 00:00:00 2001
From: Florents Tselai <florents.tselai@gmail.com>
Date: Sun, 15 Sep 2024 04:02:19 +0300
Subject: [PATCH v2 1/3] Add first working version for jsonpath
 $.replace(string, string) function.

---
 src/backend/utils/adt/jsonpath.c             | 21 ++++++
 src/backend/utils/adt/jsonpath_exec.c        | 73 ++++++++++++++++++++
 src/backend/utils/adt/jsonpath_gram.y        | 24 ++++++-
 src/backend/utils/adt/jsonpath_scan.l        |  1 +
 src/include/utils/jsonpath.h                 |  1 +
 src/test/regress/expected/jsonb_jsonpath.out | 39 +++++++++++
 src/test/regress/expected/jsonpath.out       |  6 ++
 src/test/regress/sql/jsonb_jsonpath.sql      | 11 +++
 src/test/regress/sql/jsonpath.sql            |  1 +
 9 files changed, 174 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/jsonpath.c b/src/backend/utils/adt/jsonpath.c
index 0f691bc5f0..3daf2e2d81 100644
--- a/src/backend/utils/adt/jsonpath.c
+++ b/src/backend/utils/adt/jsonpath.c
@@ -298,6 +298,7 @@ flattenJsonPathParseItem(StringInfo buf, int *result, struct Node *escontext,
 		case jpiMod:
 		case jpiStartsWith:
 		case jpiDecimal:
+		case jpiReplaceFunc:
 			{
 				/*
 				 * First, reserve place for left/right arg's positions, then
@@ -831,6 +832,21 @@ printJsonPathItem(StringInfo buf, JsonPathItem *v, bool inKey,
 			}
 			appendStringInfoChar(buf, ')');
 			break;
+		case jpiReplaceFunc:
+			appendStringInfoString(buf, ".replace(");
+			if (v->content.args.left)
+			{
+				jspGetLeftArg(v, &elem);
+				printJsonPathItem(buf, &elem, false, false);
+			}
+			if (v->content.args.right)
+			{
+				appendStringInfoChar(buf, ',');
+				jspGetRightArg(v, &elem);
+				printJsonPathItem(buf, &elem, false, false);
+			}
+			appendStringInfoChar(buf, ')');
+			break;
 		default:
 			elog(ERROR, "unrecognized jsonpath item type: %d", v->type);
 	}
@@ -906,6 +922,8 @@ jspOperationName(JsonPathItemType type)
 			return "number";
 		case jpiStringFunc:
 			return "string";
+		case jpiReplaceFunc:
+			return "replace";
 		case jpiTime:
 			return "time";
 		case jpiTimeTz:
@@ -1041,6 +1059,7 @@ jspInitByBuffer(JsonPathItem *v, char *base, int32 pos)
 		case jpiMod:
 		case jpiStartsWith:
 		case jpiDecimal:
+		case jpiReplaceFunc:
 			read_int32(v->content.args.left, base, pos);
 			read_int32(v->content.args.right, base, pos);
 			break;
@@ -1149,6 +1168,7 @@ jspGetNext(JsonPathItem *v, JsonPathItem *a)
 			   v->type == jpiInteger ||
 			   v->type == jpiNumber ||
 			   v->type == jpiStringFunc ||
+			   v->type == jpiReplaceFunc ||
 			   v->type == jpiTime ||
 			   v->type == jpiTimeTz ||
 			   v->type == jpiTimestamp ||
@@ -1501,6 +1521,7 @@ jspIsMutableWalker(JsonPathItem *jpi, struct JsonPathMutableContext *cxt)
 			case jpiInteger:
 			case jpiNumber:
 			case jpiStringFunc:
+			case jpiReplaceFunc:
 				status = jpdsNonDateTime;
 				break;
 
diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index 1184cba983..3d23e44d34 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -1659,6 +1659,79 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
 				res = executeNextItem(cxt, jsp, NULL, jb, found, true);
 			}
 			break;
+		case jpiReplaceFunc:
+			{
+				JsonbValue	jbv;
+				Datum		replace_res;
+				char		*tmp = NULL;
+
+				/*
+				 * Value is not necessarily null-terminated, so we do
+				 * pnstrdup() here.
+				 */
+				tmp = pnstrdup(jb->val.string.val,
+							   jb->val.string.len);
+
+				if (unwrap && JsonbType(jb) == jbvArray)
+					return executeItemUnwrapTargetArray(cxt, jsp, jb, found,
+														false);
+
+				/* TODO: probably need ERRCODE for that? */
+				if (!(jb = getScalar(jb, jbvString)))
+				RETURN_ERROR(ereport(ERROR,
+									 (errcode(ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION),
+									  errmsg("jsonpath item method .%s() can only be applied to a string",
+											 jspOperationName(jsp->type)))));
+
+				if (jsp->content.args.left)
+				{
+					text		*from, *to;
+					char		*from_str, *to_str;
+					int			from_len, to_len;
+					ErrorSaveContext escontext = {T_ErrorSaveContext};
+
+					jspGetLeftArg(jsp, &elem);
+					if (elem.type != jpiString)
+						elog(ERROR, "invalid jsonpath item type for .replace() from");
+
+					from_str = jspGetString(&elem, &from_len);
+					from = cstring_to_text_with_len(from_str, from_len);
+
+					if (jsp->content.args.right)
+					{
+						jspGetRightArg(jsp, &elem);
+						if (elem.type != jpiString)
+							elog(ERROR, "invalid jsonpath item type for .replace() to");
+
+						to_str = jspGetString(&elem, &to_len);
+						to = cstring_to_text_with_len(to_str, to_len);
+
+						replace_res = DirectFunctionCall3Coll(replace_text,
+							C_COLLATION_OID,
+							CStringGetTextDatum(tmp),
+							PointerGetDatum(from),
+							PointerGetDatum(to));
+
+					}
+
+					res = jperOk;
+				}
+
+				if (res == jperNotFound)
+					RETURN_ERROR(ereport(ERROR,
+										 (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
+										  errmsg("jsonpath item method .%s() accepts two string arguments",
+												 jspOperationName(jsp->type)))));
+
+				jb = &jbv;
+				jb->type = jbvString;
+				jb->val.string.val = VARDATA_ANY(replace_res);
+				jb->val.string.len = VARSIZE_ANY_EXHDR(replace_res);
+
+				res = executeNextItem(cxt, jsp, NULL, jb, found, true);
+			}
+			break;
+
 
 		default:
 			elog(ERROR, "unrecognized jsonpath item type: %d", jsp->type);
diff --git a/src/backend/utils/adt/jsonpath_gram.y b/src/backend/utils/adt/jsonpath_gram.y
index 8733a0eac6..c33b73e98b 100644
--- a/src/backend/utils/adt/jsonpath_gram.y
+++ b/src/backend/utils/adt/jsonpath_gram.y
@@ -83,7 +83,7 @@ static bool makeItemLikeRegex(JsonPathParseItem *expr,
 %token	<str>		ABS_P SIZE_P TYPE_P FLOOR_P DOUBLE_P CEILING_P KEYVALUE_P
 %token	<str>		DATETIME_P
 %token	<str>		BIGINT_P BOOLEAN_P DATE_P DECIMAL_P INTEGER_P NUMBER_P
-%token	<str>		STRINGFUNC_P TIME_P TIME_TZ_P TIMESTAMP_P TIMESTAMP_TZ_P
+%token	<str>		STRINGFUNC_P TIME_P TIME_TZ_P TIMESTAMP_P TIMESTAMP_TZ_P REPLACEFUNC_P
 
 %type	<result>	result
 
@@ -91,9 +91,9 @@ static bool makeItemLikeRegex(JsonPathParseItem *expr,
 					any_path accessor_op key predicate delimited_predicate
 					index_elem starts_with_initial expr_or_predicate
 					datetime_template opt_datetime_template csv_elem
-					datetime_precision opt_datetime_precision
+					datetime_precision opt_datetime_precision replace_arg_elem
 
-%type	<elems>		accessor_expr csv_list opt_csv_list
+%type	<elems>		accessor_expr csv_list opt_csv_list replace_arg_list
 
 %type	<indexs>	index_list
 
@@ -266,6 +266,16 @@ accessor_op:
 						 errmsg("invalid input syntax for type %s", "jsonpath"),
 						 errdetail(".decimal() can only have an optional precision[,scale].")));
 		}
+	| '.' REPLACEFUNC_P '(' replace_arg_list ')'
+		{
+			if (list_length($4) == 2)
+				$$ = makeItemBinary(jpiReplaceFunc, linitial($4), lsecond($4));
+			else
+				ereturn(escontext, false,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("invalid input syntax for type %s", "jsonpath"),
+						 errdetail(".replace() accepts two arguments.")));
+		}
 	| '.' DATETIME_P '(' opt_datetime_template ')'
 		{ $$ = makeItemUnary(jpiDatetime, $4); }
 	| '.' TIME_P '(' opt_datetime_precision ')'
@@ -315,6 +325,14 @@ opt_datetime_template:
 	| /* EMPTY */					{ $$ = NULL; }
 	;
 
+replace_arg_elem:
+	STRING_P						{ $$ = makeItemString(&$1); }
+	;
+
+replace_arg_list:
+	replace_arg_elem							{ $$ = list_make1($1); }
+	| replace_arg_list ',' replace_arg_elem	{ $$ = lappend($1, $3); }
+	;
 key:
 	key_name						{ $$ = makeItemKey(&$1); }
 	;
diff --git a/src/backend/utils/adt/jsonpath_scan.l b/src/backend/utils/adt/jsonpath_scan.l
index 7acda77837..890522b323 100644
--- a/src/backend/utils/adt/jsonpath_scan.l
+++ b/src/backend/utils/adt/jsonpath_scan.l
@@ -423,6 +423,7 @@ static const JsonPathKeyword keywords[] = {
 	{ 7, false,	CEILING_P,	"ceiling"},
 	{ 7, false,	DECIMAL_P,	"decimal"},
 	{ 7, false,	INTEGER_P,	"integer"},
+	{ 7, false, REPLACEFUNC_P, "replace"},
 	{ 7, false,	TIME_TZ_P,	"time_tz"},
 	{ 7, false,	UNKNOWN_P,	"unknown"},
 	{ 8, false,	DATETIME_P,	"datetime"},
diff --git a/src/include/utils/jsonpath.h b/src/include/utils/jsonpath.h
index ee35698d08..cbb35f02d3 100644
--- a/src/include/utils/jsonpath.h
+++ b/src/include/utils/jsonpath.h
@@ -115,6 +115,7 @@ typedef enum JsonPathItemType
 	jpiTimeTz,					/* .time_tz() item method */
 	jpiTimestamp,				/* .timestamp() item method */
 	jpiTimestampTz,				/* .timestamp_tz() item method */
+	jpiReplaceFunc,				/* .replace() item method */
 } JsonPathItemType;
 
 /* XQuery regex mode flags for LIKE_REGEX predicate */
diff --git a/src/test/regress/expected/jsonb_jsonpath.out b/src/test/regress/expected/jsonb_jsonpath.out
index acdf7e436f..2a1b2afff0 100644
--- a/src/test/regress/expected/jsonb_jsonpath.out
+++ b/src/test/regress/expected/jsonb_jsonpath.out
@@ -2719,6 +2719,45 @@ select jsonb_path_query('"2023-08-15 12:34:56"', '$.timestamp().string()');
 (1 row)
 
 rollback;
+-- Test .replace()
+select jsonb_path_query('null', '$.replace("x", "bye")');
+ERROR:  jsonpath item method .replace() can only be applied to a string
+select jsonb_path_query('null', '$.replace("x", "bye")', silent => true);
+ jsonb_path_query 
+------------------
+(0 rows)
+
+select jsonb_path_query('[]', '$.replace("x", "bye")');
+ jsonb_path_query 
+------------------
+(0 rows)
+
+select jsonb_path_query('{}', '$.replace("x", "bye")');
+ERROR:  jsonpath item method .replace() can only be applied to a string
+select jsonb_path_query('[]', 'strict $.replace("x", "bye")', silent => true);
+ jsonb_path_query 
+------------------
+(0 rows)
+
+select jsonb_path_query('{}', '$.replace("x", "bye")', silent => true);
+ jsonb_path_query 
+------------------
+(0 rows)
+
+select jsonb_path_query('1.23', '$.replace("x", "bye")');
+ERROR:  jsonpath item method .replace() can only be applied to a string
+select jsonb_path_query('"hello world"', '$.replace("hello","bye")');
+ jsonb_path_query 
+------------------
+ "bye world"
+(1 row)
+
+select jsonb_path_query('"hello world"', '$.replace("hello","bye") starts with "bye"');
+ jsonb_path_query 
+------------------
+ true
+(1 row)
+
 -- Test .time()
 select jsonb_path_query('null', '$.time()');
 ERROR:  jsonpath item method .time() can only be applied to a string
diff --git a/src/test/regress/expected/jsonpath.out b/src/test/regress/expected/jsonpath.out
index fd9bd755f5..871b5e76c7 100644
--- a/src/test/regress/expected/jsonpath.out
+++ b/src/test/regress/expected/jsonpath.out
@@ -435,6 +435,12 @@ select '$.string()'::jsonpath;
  $.string()
 (1 row)
 
+select '$.replace("hello","bye")'::jsonpath;
+         jsonpath         
+--------------------------
+ $.replace("hello","bye")
+(1 row)
+
 select '$.time()'::jsonpath;
  jsonpath 
 ----------
diff --git a/src/test/regress/sql/jsonb_jsonpath.sql b/src/test/regress/sql/jsonb_jsonpath.sql
index da3f7969ca..c1b8491387 100644
--- a/src/test/regress/sql/jsonb_jsonpath.sql
+++ b/src/test/regress/sql/jsonb_jsonpath.sql
@@ -619,6 +619,17 @@ select jsonb_path_query('"2023-08-15 12:34:56 +5:30"', '$.timestamp_tz().string(
 select jsonb_path_query('"2023-08-15 12:34:56"', '$.timestamp().string()');
 rollback;
 
+-- Test .replace()
+select jsonb_path_query('null', '$.replace("x", "bye")');
+select jsonb_path_query('null', '$.replace("x", "bye")', silent => true);
+select jsonb_path_query('[]', '$.replace("x", "bye")');
+select jsonb_path_query('{}', '$.replace("x", "bye")');
+select jsonb_path_query('[]', 'strict $.replace("x", "bye")', silent => true);
+select jsonb_path_query('{}', '$.replace("x", "bye")', silent => true);
+select jsonb_path_query('1.23', '$.replace("x", "bye")');
+select jsonb_path_query('"hello world"', '$.replace("hello","bye")');
+select jsonb_path_query('"hello world"', '$.replace("hello","bye") starts with "bye"');
+
 -- Test .time()
 select jsonb_path_query('null', '$.time()');
 select jsonb_path_query('true', '$.time()');
diff --git a/src/test/regress/sql/jsonpath.sql b/src/test/regress/sql/jsonpath.sql
index 61a5270d4e..81704303bd 100644
--- a/src/test/regress/sql/jsonpath.sql
+++ b/src/test/regress/sql/jsonpath.sql
@@ -78,6 +78,7 @@ select '$.boolean()'::jsonpath;
 select '$.date()'::jsonpath;
 select '$.decimal(4,2)'::jsonpath;
 select '$.string()'::jsonpath;
+select '$.replace("hello","bye")'::jsonpath;
 select '$.time()'::jsonpath;
 select '$.time(6)'::jsonpath;
 select '$.time_tz()'::jsonpath;
-- 
2.39.3 (Apple Git-146)

v2-0002-replace-should-work-with-strings-inside-arrays.patchapplication/octet-stream; name=v2-0002-replace-should-work-with-strings-inside-arrays.patch; x-unix-mode=0644Download
From 6840ba1b6a463c9fc6882c458b1feb77dc139da6 Mon Sep 17 00:00:00 2001
From: Florents Tselai <florents.tselai@gmail.com>
Date: Tue, 17 Sep 2024 01:15:17 +0300
Subject: [PATCH v2 2/3] replace should work with strings inside arrays

---
 src/test/regress/expected/jsonb_jsonpath.out | 7 +++++--
 src/test/regress/sql/jsonb_jsonpath.sql      | 2 +-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/test/regress/expected/jsonb_jsonpath.out b/src/test/regress/expected/jsonb_jsonpath.out
index 2a1b2afff0..5c7da29102 100644
--- a/src/test/regress/expected/jsonb_jsonpath.out
+++ b/src/test/regress/expected/jsonb_jsonpath.out
@@ -2727,10 +2727,13 @@ select jsonb_path_query('null', '$.replace("x", "bye")', silent => true);
 ------------------
 (0 rows)
 
-select jsonb_path_query('[]', '$.replace("x", "bye")');
+select jsonb_path_query('["x", "y", "z"]', '$.replace("x", "bye")');
  jsonb_path_query 
 ------------------
-(0 rows)
+ "bye"
+ "y"
+ "z"
+(3 rows)
 
 select jsonb_path_query('{}', '$.replace("x", "bye")');
 ERROR:  jsonpath item method .replace() can only be applied to a string
diff --git a/src/test/regress/sql/jsonb_jsonpath.sql b/src/test/regress/sql/jsonb_jsonpath.sql
index c1b8491387..63c5087149 100644
--- a/src/test/regress/sql/jsonb_jsonpath.sql
+++ b/src/test/regress/sql/jsonb_jsonpath.sql
@@ -622,7 +622,7 @@ rollback;
 -- Test .replace()
 select jsonb_path_query('null', '$.replace("x", "bye")');
 select jsonb_path_query('null', '$.replace("x", "bye")', silent => true);
-select jsonb_path_query('[]', '$.replace("x", "bye")');
+select jsonb_path_query('["x", "y", "z"]', '$.replace("x", "bye")');
 select jsonb_path_query('{}', '$.replace("x", "bye")');
 select jsonb_path_query('[]', 'strict $.replace("x", "bye")', silent => true);
 select jsonb_path_query('{}', '$.replace("x", "bye")', silent => true);
-- 
2.39.3 (Apple Git-146)

v2-0003-Fix-logic-for-jpiReplaceFunc.patchapplication/octet-stream; name=v2-0003-Fix-logic-for-jpiReplaceFunc.patch; x-unix-mode=0644Download
From df9483dd22413a053584377bf4ec5f14a3ae28ab Mon Sep 17 00:00:00 2001
From: Florents Tselai <florents.tselai@gmail.com>
Date: Tue, 17 Sep 2024 01:17:59 +0300
Subject: [PATCH v2 3/3] Fix logic for jpiReplaceFunc. Also make it acceptable
 in jspGetLeftArg and jspGetRightArg.

---
 src/backend/utils/adt/jsonpath.c      |  6 +-
 src/backend/utils/adt/jsonpath_exec.c | 83 ++++++++++++---------------
 2 files changed, 41 insertions(+), 48 deletions(-)

diff --git a/src/backend/utils/adt/jsonpath.c b/src/backend/utils/adt/jsonpath.c
index 3daf2e2d81..3a5c7a00cf 100644
--- a/src/backend/utils/adt/jsonpath.c
+++ b/src/backend/utils/adt/jsonpath.c
@@ -1199,7 +1199,8 @@ jspGetLeftArg(JsonPathItem *v, JsonPathItem *a)
 		   v->type == jpiDiv ||
 		   v->type == jpiMod ||
 		   v->type == jpiStartsWith ||
-		   v->type == jpiDecimal);
+		   v->type == jpiDecimal ||
+		   v->type == jpiReplaceFunc);
 
 	jspInitByBuffer(a, v->base, v->content.args.left);
 }
@@ -1221,7 +1222,8 @@ jspGetRightArg(JsonPathItem *v, JsonPathItem *a)
 		   v->type == jpiDiv ||
 		   v->type == jpiMod ||
 		   v->type == jpiStartsWith ||
-		   v->type == jpiDecimal);
+		   v->type == jpiDecimal ||
+		   v->type == jpiReplaceFunc);
 
 	jspInitByBuffer(a, v->base, v->content.args.right);
 }
diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index 3d23e44d34..4f0fa3d6b7 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -1659,74 +1659,65 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
 				res = executeNextItem(cxt, jsp, NULL, jb, found, true);
 			}
 			break;
+
 		case jpiReplaceFunc:
 			{
 				JsonbValue	jbv;
-				Datum		replace_res;
-				char		*tmp = NULL;
-
-				/*
-				 * Value is not necessarily null-terminated, so we do
-				 * pnstrdup() here.
-				 */
-				tmp = pnstrdup(jb->val.string.val,
-							   jb->val.string.len);
+				char		*replacedTxt;
+				char		*txt = NULL;
 
 				if (unwrap && JsonbType(jb) == jbvArray)
-					return executeItemUnwrapTargetArray(cxt, jsp, jb, found,
-														false);
+					return executeItemUnwrapTargetArray(cxt, jsp, jb, found,false);
 
-				/* TODO: probably need ERRCODE for that? */
-				if (!(jb = getScalar(jb, jbvString)))
-				RETURN_ERROR(ereport(ERROR,
-									 (errcode(ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION),
-									  errmsg("jsonpath item method .%s() can only be applied to a string",
-											 jspOperationName(jsp->type)))));
+				if (jb->type == jbvString) {
+					/* Value is not necessarily null-terminated, so we do pnstrdup() here. */
+					txt = pnstrdup(jb->val.string.val, jb->val.string.len);
 
-				if (jsp->content.args.left)
+					res = jperOk;
+				}
+
+				if (res == jperNotFound) {
+					/* TODO: probably need ERRCODE for that? */
+					RETURN_ERROR(ereport(ERROR,
+										 (errcode(ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION),
+										  errmsg("jsonpath item method .%s() can only be applied to a string",
+												 jspOperationName(jsp->type)))));
+				}
+				if (jsp->content.args.left && jsp->content.args.right)
 				{
-					text		*from, *to;
 					char		*from_str, *to_str;
 					int			from_len, to_len;
-					ErrorSaveContext escontext = {T_ErrorSaveContext};
 
 					jspGetLeftArg(jsp, &elem);
 					if (elem.type != jpiString)
 						elog(ERROR, "invalid jsonpath item type for .replace() from");
 
 					from_str = jspGetString(&elem, &from_len);
-					from = cstring_to_text_with_len(from_str, from_len);
-
-					if (jsp->content.args.right)
-					{
-						jspGetRightArg(jsp, &elem);
-						if (elem.type != jpiString)
-							elog(ERROR, "invalid jsonpath item type for .replace() to");
 
-						to_str = jspGetString(&elem, &to_len);
-						to = cstring_to_text_with_len(to_str, to_len);
+					jspGetRightArg(jsp, &elem);
+					if (elem.type != jpiString)
+						elog(ERROR, "invalid jsonpath item type for .replace() to");
 
-						replace_res = DirectFunctionCall3Coll(replace_text,
-							C_COLLATION_OID,
-							CStringGetTextDatum(tmp),
-							PointerGetDatum(from),
-							PointerGetDatum(to));
+					to_str = jspGetString(&elem, &to_len);
 
-					}
+					replacedTxt = TextDatumGetCString(DirectFunctionCall3Coll(replace_text,
+						C_COLLATION_OID,
+						CStringGetTextDatum(txt),
+						CStringGetTextDatum(from_str),
+						CStringGetTextDatum(to_str)));
 
-					res = jperOk;
+					jb = &jbv;
+					jb->type = jbvString;
+					jb->val.string.val = replacedTxt;
+					jb->val.string.len = strlen(replacedTxt);
 				}
-
-				if (res == jperNotFound)
+				else {
 					RETURN_ERROR(ereport(ERROR,
-										 (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
-										  errmsg("jsonpath item method .%s() accepts two string arguments",
-												 jspOperationName(jsp->type)))));
-
-				jb = &jbv;
-				jb->type = jbvString;
-				jb->val.string.val = VARDATA_ANY(replace_res);
-				jb->val.string.len = VARSIZE_ANY_EXHDR(replace_res);
+											 (errcode(ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION),
+											  errmsg("jsonpath item method .%s() can only be applied to a boolean, string, numeric, or datetime value",
+													 jspOperationName(jsp->type)))));
+					break;
+				}
 
 				res = executeNextItem(cxt, jsp, NULL, jb, found, true);
 			}
-- 
2.39.3 (Apple Git-146)

#3David E. Wheeler
david@justatheory.com
In reply to: Florents Tselai (#2)
Re: [PATCH] WIP: replace method for jsonpath

On Sep 16, 2024, at 18:39, Florents Tselai <florents.tselai@gmail.com> wrote:

Here’s an updated version of this patch.

Oh, nice function.

But a broader question for hackers: Is replace() specified in the SQL/JSON spec? If not, what’s the process for evaluating whether or not to add features not specified by the spec?

As a future note:
It’s worth noting that both this newly added jspItem and other ones like (jpiDecimal, jpiString)
use jspGetRightArg and jspGetLeftArg.
left and right can be confusing if more complex methods are added in the future.
i.e. jsonpath methods with nargs>=3 .
I was wondering if we’d like something like JSP_GETARG(n)

So far I think we have only functions defined by the spec, which tend to be unary or binary, so left and right haven’t been an issue.

Best,

David

#4Florents Tselai
florents.tselai@gmail.com
In reply to: David E. Wheeler (#3)
Re: [PATCH] WIP: replace method for jsonpath

On 17 Sep 2024, at 9:40 PM, David E. Wheeler <david@justatheory.com> wrote:

On Sep 16, 2024, at 18:39, Florents Tselai <florents.tselai@gmail.com> wrote:

Here’s an updated version of this patch.

Oh, nice function.

But a broader question for hackers: Is replace() specified in the SQL/JSON spec? If not, what’s the process for evaluating whether or not to add features not specified by the spec?

That’s the first argument I was expecting, and it’s a valid one.

From a user’s perspective the answer is:
Why not?
The more text-processing facilities I have in jsonb,
The less back-and-forth-parentheses-fu I do,
The easier my life is.

From a PG gatekeeping it’s a more complicated issue.

It’s not part of the spec,
But I think the jsonb infrastructure in PG is really powerful already and we can built on it,
And can evolve into a superset DSL of jsonpath.

For example, apache/age have lift-and-shifted this infra and built another DSL (cypher) on top of it.

As a future note:
It’s worth noting that both this newly added jspItem and other ones like (jpiDecimal, jpiString)
use jspGetRightArg and jspGetLeftArg.
left and right can be confusing if more complex methods are added in the future.
i.e. jsonpath methods with nargs>=3 .
I was wondering if we’d like something like JSP_GETARG(n)

So far I think we have only functions defined by the spec, which tend to be unary or binary, so left and right haven’t been an issue.

If the answer to the Q above is: “we stick to the spec” then there’s no thinking about this.

But tbh, I’ve already started experimenting with other text methods in text $.strip() / trim() / upper() / lower() etc.

Fallback scenario: make this an extension, but in a first pass I didn’t find any convenient hooks.
One has to create a whole new scanner, grammar etc.

Show quoted text

Best,

David

#5David E. Wheeler
david@justatheory.com
In reply to: Florents Tselai (#4)
Re: [PATCH] WIP: replace method for jsonpath

On Sep 17, 2024, at 15:03, Florents Tselai <florents.tselai@gmail.com> wrote:

Fallback scenario: make this an extension, but in a first pass I didn’t find any convenient hooks.
One has to create a whole new scanner, grammar etc.

Yeah, it got me thinking about the RFC-9535 JSONPath "Function Extension" feature[1]https://www.rfc-editor.org/rfc/rfc9535.html#name-function-extensions, which allows users to add functions. Would be cool to have a way to register jsonpath functions somehow, but I would imagine it’d need quite a bit of specification similar to RFC-9535. Wouldn’t surprise me to see something like that appear in a future version of the spec, with an interface something like CREATE OPERATOR.

I don’t have a strong feeling about what should be added that’s not in the spec; my main interest is not having to constantly sync my port[2]https://github.com/theory/sqljson. I’m already behind, and’t just been a couple months! 😂

Best,

David

[1]: https://www.rfc-editor.org/rfc/rfc9535.html#name-function-extensions
[2]: https://github.com/theory/sqljson

#6Peter Eisentraut
peter@eisentraut.org
In reply to: David E. Wheeler (#5)
Re: [PATCH] WIP: replace method for jsonpath

On 17.09.24 21:16, David E. Wheeler wrote:

On Sep 17, 2024, at 15:03, Florents Tselai <florents.tselai@gmail.com> wrote:

Fallback scenario: make this an extension, but in a first pass I didn’t find any convenient hooks.
One has to create a whole new scanner, grammar etc.

Yeah, it got me thinking about the RFC-9535 JSONPath "Function Extension" feature[1], which allows users to add functions. Would be cool to have a way to register jsonpath functions somehow, but I would imagine it’d need quite a bit of specification similar to RFC-9535. Wouldn’t surprise me to see something like that appear in a future version of the spec, with an interface something like CREATE OPERATOR.

Why can't we "just" call any suitable pg_proc-registered function from
JSON path? The proposed patch routes the example
'$.replace("hello","bye")' internally to the internal implementation of
the SQL function replace(..., 'hello', 'bye'). Why can't we do this
automatically for any function call in a JSON path expression?

#7Florents Tselai
florents.tselai@gmail.com
In reply to: Peter Eisentraut (#6)
Re: [PATCH] WIP: replace method for jsonpath

On 18 Sep 2024, at 11:23 AM, Peter Eisentraut <peter@eisentraut.org> wrote:

On 17.09.24 21:16, David E. Wheeler wrote:

On Sep 17, 2024, at 15:03, Florents Tselai <florents.tselai@gmail.com> wrote:

Fallback scenario: make this an extension, but in a first pass I didn’t find any convenient hooks.
One has to create a whole new scanner, grammar etc.

Yeah, it got me thinking about the RFC-9535 JSONPath "Function Extension" feature[1], which allows users to add functions. Would be cool to have a way to register jsonpath functions somehow, but I would imagine it’d need quite a bit of specification similar to RFC-9535. Wouldn’t surprise me to see something like that appear in a future version of the spec, with an interface something like CREATE OPERATOR.

Why can't we "just" call any suitable pg_proc-registered function from JSON path? The proposed patch routes the example '$.replace("hello","bye")' internally to the internal implementation of the SQL function replace(..., 'hello', 'bye'). Why can't we do this automatically for any function call in a JSON path expression?

Well, we can.
A couple of weeks ago, I discovered transform_jsonb_string_values, which is already available in jsonfuncs.h
and that gave me the idea for this extension https://github.com/Florents-Tselai/jsonb_apply

It does exactly what you’re saying: searches for a suitable pg_proc in the catalog, and directly applies it.

select jsonb_apply('{
"id": 1,
"name": "John",
"messages": [
"hello"
]
}', 'replace', 'hello', 'bye’);

select jsonb_filter_apply('{
"id": 1,
"name": "John",
"messages": [
"hello"
]
}', '{messages}', 'md5’);

But, I don't know… jsonb_apply? That seemed “too fancy”/LISPy for standard Postgres.

Now that you mention it, though, there’s an alternative of tweaking the grammar and calling the suitable text proc.

#8Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#6)
Re: [PATCH] WIP: replace method for jsonpath

On 2024-09-18 We 4:23 AM, Peter Eisentraut wrote:

On 17.09.24 21:16, David E. Wheeler wrote:

On Sep 17, 2024, at 15:03, Florents Tselai
<florents.tselai@gmail.com> wrote:

Fallback scenario: make this an extension, but in a first pass I
didn’t find any convenient hooks.
One has to create a whole new scanner, grammar etc.

Yeah, it got me thinking about the RFC-9535 JSONPath "Function
Extension" feature[1], which allows users to add functions. Would be
cool to have a way to register jsonpath functions somehow, but I
would imagine it’d need quite a bit of specification similar to
RFC-9535. Wouldn’t surprise me to see something like that appear in a
future version of the spec, with an interface something like CREATE
OPERATOR.

Why can't we "just" call any suitable pg_proc-registered function from
JSON path?  The proposed patch routes the example
'$.replace("hello","bye")' internally to the internal implementation
of the SQL function replace(..., 'hello', 'bye'). Why can't we do this
automatically for any function call in a JSON path expression?

That might work. The thing that bothers me about the original proposal
is this: what if we add a new non-spec jsonpath method and then a new
version of the spec adds a method with the same name, but not compatible
with our method? We'll be in a nasty place. At the very least I think we
need to try hard to avoid that. Maybe we should prefix non-spec method
names with "pg_", or maybe use an initial capital letter.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#9Florents Tselai
florents.tselai@gmail.com
In reply to: Andrew Dunstan (#8)
Re: [PATCH] WIP: replace method for jsonpath

On 18 Sep 2024, at 3:47 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

On 2024-09-18 We 4:23 AM, Peter Eisentraut wrote:

On 17.09.24 21:16, David E. Wheeler wrote:

On Sep 17, 2024, at 15:03, Florents Tselai <florents.tselai@gmail.com>
wrote:

Fallback scenario: make this an extension, but in a first pass I didn’t
find any convenient hooks.
One has to create a whole new scanner, grammar etc.

Yeah, it got me thinking about the RFC-9535 JSONPath "Function Extension"
feature[1], which allows users to add functions. Would be cool to have a
way to register jsonpath functions somehow, but I would imagine it’d need
quite a bit of specification similar to RFC-9535. Wouldn’t surprise me to
see something like that appear in a future version of the spec, with an
interface something like CREATE OPERATOR.

Why can't we "just" call any suitable pg_proc-registered function from JSON
path? The proposed patch routes the example '$.replace("hello","bye")'
internally to the internal implementation of the SQL function replace(...,
'hello', 'bye'). Why can't we do this automatically for any function call
in a JSON path expression?

That might work. The thing that bothers me about the original proposal is
this: what if we add a new non-spec jsonpath method and then a new version
of the spec adds a method with the same name, but not compatible with our
method? We'll be in a nasty place. At the very least I think we need to try
hard to avoid that. Maybe we should prefix non-spec method names with
"pg_", or maybe use an initial capital letter.

If naming is your main reservation, then I take it you’re generally
positive.

Having said that, “pg_” is probably too long for a jsonpath expression,
Most importantly though, “pg_” in my mind is a prefix for things like
catalog lookup and system monitoring.
Not a function that the average user would use.
Thus, I lean towards initial-capital.

The more general case would look like:
A new jsonpath item of the format $.Func(arg1, …, argn) can be applied
(recursively or not) to a json object.

As a first iteration/version only pg_proc-registered functions of the
format func(text, ...,) -> text are available.
We can focus on oid(arg0) = TEXTOID and rettype = TEXTOID fist.
The first arg0 has to be TEXTOID (the current json string) while subsequent
args are provided by the user
in the jsonpath expression.

The functions we want to support will be part of jsonpath grammar
and during execution we'll have enough info to find the appropriate
PGFunction to call.

What I'm missing yet is how we could handle vars jsonb,
in case the user doesn't want to just hardcode the actual function
arguments.
Then resolving argtypes1...n is a bit more complex:

The signature for jsonb_apply(doc jsonb, func text[, variadic "any"
args1_n]); [0]https://github.com/Florents-Tselai/jsonb_apply
Here, variadic "any" works beautifully, but that's a brand-new function.

In existing jsonb{path} facilities, vars are jsonb objects which could work
as well I think.
Unless I'm missing something.

[0]: https://github.com/Florents-Tselai/jsonb_apply

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com