ON ERROR in json_query and the like
Hi!
I’ve noticed two “surprising” (to me) behaviors related to
the “ON ERROR” clause of the new JSON query functions in 17beta1.
1. JSON parsing errors are not subject to ON ERROR
Apparently, the functions expect JSONB so that a cast is implied
when providing TEXT. However, the errors during that cast are
not subject to the ON ERROR clause.
17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
ERROR: invalid input syntax for type json
DETAIL: Token "invalid" is invalid.
CONTEXT: JSON data, line 1: invalid
Oracle DB and Db2 (LUW) both return NULL in that case.
I had a look on the list archive to see if that is intentional but
frankly speaking these functions came a long way. In case it is
intentional it might be worth adding a note to the docs.
2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
a
----
[]
(1 row)
As NULL ON EMPTY is implied, it should give the same result as
explicitly adding NULL ON EMPTY:
17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON ERROR) a;
a
---
(1 row)
Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
on the other hand returns NULL for both queries.
I don’t think that PostgreSQL should follow Oracle DB's suit here
but again, in case this is intentional it should be made explicit
in the docs.
-markus
út 28. 5. 2024 v 11:29 odesílatel Markus Winand <markus.winand@winand.at>
napsal:
Hi!
I’ve noticed two “surprising” (to me) behaviors related to
the “ON ERROR” clause of the new JSON query functions in 17beta1.1. JSON parsing errors are not subject to ON ERROR
Apparently, the functions expect JSONB so that a cast is implied
when providing TEXT. However, the errors during that cast are
not subject to the ON ERROR clause.17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
ERROR: invalid input syntax for type json
DETAIL: Token "invalid" is invalid.
CONTEXT: JSON data, line 1: invalidOracle DB and Db2 (LUW) both return NULL in that case.
I had a look on the list archive to see if that is intentional but
frankly speaking these functions came a long way. In case it is
intentional it might be worth adding a note to the docs.
I remember a talk about this subject years ago. Originally the JSON_QUERY
was designed in similar like Oracle, and casting to jsonb was done inside.
If I remember this behave depends on the fact, so old SQL/JSON has not json
type and it was based just on processing of plain text. But Postgres has
JSON, and JSONB and then was more logical to use these types. And because
the JSON_QUERY uses these types, and the casting is done before the
execution of the function, then the clause ON ERROR cannot be handled.
Moreover, until soft errors Postgres didn't allow handling input errors in
common functions.
I think so this difference should be mentioned in documentation.
Regards
Pavel
Show quoted text
2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
a
----
[]
(1 row)As NULL ON EMPTY is implied, it should give the same result as
explicitly adding NULL ON EMPTY:17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON
ERROR) a;
a
---(1 row)
Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
on the other hand returns NULL for both queries.I don’t think that PostgreSQL should follow Oracle DB's suit here
but again, in case this is intentional it should be made explicit
in the docs.-markus
On Tue, May 28, 2024 at 5:29 PM Markus Winand <markus.winand@winand.at> wrote:
Hi!
I’ve noticed two “surprising” (to me) behaviors related to
the “ON ERROR” clause of the new JSON query functions in 17beta1.1. JSON parsing errors are not subject to ON ERROR
Apparently, the functions expect JSONB so that a cast is implied
when providing TEXT. However, the errors during that cast are
not subject to the ON ERROR clause.17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
ERROR: invalid input syntax for type json
DETAIL: Token "invalid" is invalid.
CONTEXT: JSON data, line 1: invalidOracle DB and Db2 (LUW) both return NULL in that case.
I had a look on the list archive to see if that is intentional but
frankly speaking these functions came a long way. In case it is
intentional it might be worth adding a note to the docs.
previous versions require SQL/JSON query function's context_item to
explicitly cast to jsonb,
if it is not it will error out.
previous version the following query will have a error
select json_value(text '"1"' , 'strict $[*]' DEFAULT 9 ON ERROR);
now it only requires that (context_item) casting to jsonb successfully.
I raise this issue separately at [1]/messages/by-id/CACJufxGWJTa-b0WjNH15otih42PA7SF+e7LbkAb0gThs7ojT5Q@mail.gmail.com
[1]: /messages/by-id/CACJufxGWJTa-b0WjNH15otih42PA7SF+e7LbkAb0gThs7ojT5Q@mail.gmail.com
2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
a
----
[]
(1 row)As NULL ON EMPTY is implied, it should give the same result as
explicitly adding NULL ON EMPTY:
I vaguely remember, we stumbled on ON ERROR, ON EMPTY several times.
i don't have a standard, but the doc seems not explicit enough for the
above example.
in json_query, maybe we can rephrase like:
--------------------
The ON EMPTY clause specifies the behavior if evaluating
path_expression yields no value at all. The default when ON EMPTY is
not specified
and ON ERROR not specified is to return a null value.
The ON ERROR clause specifies the behavior if an error occurs when
evaluating path_expression, including evaluation yields no value at
all and ON EMPTY is not specified, the operation to coerce the result
value to the output type, or during the execution of ON EMPTY behavior
(that is caused by empty result of path_expression evaluation). The
default when ON ERROR is not specified is to return a null value.
17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON ERROR) a;
a
---(1 row)
Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
on the other hand returns NULL for both queries.I don’t think that PostgreSQL should follow Oracle DB's suit here
but again, in case this is intentional it should be made explicit
in the docs.
`
SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON ERROR) a;
`
I think these sentences addressed the above query.
<<<
or during the execution of ON EMPTY behavior (that is caused by empty
result of path_expression evaluation).
<<<
As you can see, in this context, "execution of ON EMPTY behavior"
works fine, successfully returned null,
so `EMPTY ARRAY ON ERROR` part was ignored.
On Tue, May 28, 2024 at 5:29 PM Markus Winand <markus.winand@winand.at> wrote:
Hi!
I’ve noticed two “surprising” (to me) behaviors related to
the “ON ERROR” clause of the new JSON query functions in 17beta1.1. JSON parsing errors are not subject to ON ERROR
Apparently, the functions expect JSONB so that a cast is implied
when providing TEXT. However, the errors during that cast are
not subject to the ON ERROR clause.17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
ERROR: invalid input syntax for type json
DETAIL: Token "invalid" is invalid.
CONTEXT: JSON data, line 1: invalidOracle DB and Db2 (LUW) both return NULL in that case.
I had a look on the list archive to see if that is intentional but
frankly speaking these functions came a long way. In case it is
intentional it might be worth adding a note to the docs.
json_query ( context_item, path_expression);
`SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);`
to make this return NULL, that means to catch all the errors that
happened while context_item evaluation.
otherwise, it would not be consistent?
Currently context_item expressions can be quite arbitrary.
considering the following examples.
create or replace function test(jsonb) returns jsonb as $$ begin raise
exception 'abort'; end $$ language plpgsql;
create or replace function test1(jsonb) returns jsonb as $$ begin
return $1; end $$ language plpgsql;
SELECT JSON_VALUE(test('1'), '$');
SELECT JSON_VALUE(test1('1'), '$');
SELECT JSON_VALUE((select '1'::jsonb), '$');
SELECT JSON_VALUE((with cte(s) as (select '1') select s::jsonb from cte), '$');
SELECT JSON_VALUE((with cte(s) as (select '1') select s::jsonb from
cte union all select s::jsonb from cte limit 1), '$');
Currently, I don't think we can make
SELECT JSON_VALUE(test('1'), '$' null on error);
return NULL.
On 11.06.2024, at 03:58, jian he <jian.universality@gmail.com> wrote:
On Tue, May 28, 2024 at 5:29 PM Markus Winand <markus.winand@winand.at> wrote:
Hi!
I’ve noticed two “surprising” (to me) behaviors related to
the “ON ERROR” clause of the new JSON query functions in 17beta1.1. JSON parsing errors are not subject to ON ERROR
Apparently, the functions expect JSONB so that a cast is implied
when providing TEXT. However, the errors during that cast are
not subject to the ON ERROR clause.17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
ERROR: invalid input syntax for type json
DETAIL: Token "invalid" is invalid.
CONTEXT: JSON data, line 1: invalidOracle DB and Db2 (LUW) both return NULL in that case.
I had a look on the list archive to see if that is intentional but
frankly speaking these functions came a long way. In case it is
intentional it might be worth adding a note to the docs.json_query ( context_item, path_expression);
`SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);`
to make this return NULL, that means to catch all the errors that
happened while context_item evaluation.
otherwise, it would not be consistent?Currently context_item expressions can be quite arbitrary.
considering the following examples.create or replace function test(jsonb) returns jsonb as $$ begin raise
exception 'abort'; end $$ language plpgsql;
create or replace function test1(jsonb) returns jsonb as $$ begin
return $1; end $$ language plpgsql;
SELECT JSON_VALUE(test('1'), '$');
SELECT JSON_VALUE(test1('1'), '$');
SELECT JSON_VALUE((select '1'::jsonb), '$');
SELECT JSON_VALUE((with cte(s) as (select '1') select s::jsonb from cte), '$');
SELECT JSON_VALUE((with cte(s) as (select '1') select s::jsonb from
cte union all select s::jsonb from cte limit 1), '$');Currently, I don't think we can make
SELECT JSON_VALUE(test('1'), '$' null on error);
return NULL.
This is not how it is meant. Your example is not subject to the ON ERROR
clause because the error happens in a sub-expression. My point is that
ON ERROR includes the String to JSON conversion (the JSON parsing) that
— in the way the standard describes these functions — inside of them.
In the standard, JSON_VALUE & co accept string types as well as the type JSON:
10.14 SR 1: The declared type of the <value expression> simply contained in the <JSON input expression> immediately contained in the <JSON context item> shall be a string type or a JSON type.
It might be best to think of it as two separate functions, overloaded:
JSON_VALUE(context_item JSONB, path_expression …)
JSON_VALUE(context_item TEXT, path_expression …)
Now if you do this:
create function test2(text) returns text as $$ begin
return $1; end $$ language plpgsql;
create function test3(text) returns jsonb as $$ begin
return $1::jsonb; end $$ language plpgsql;
SELECT JSON_VALUE(test2('invalid'), '$' null on error);
SELECT JSON_VALUE(test3('invalid'), '$' null on error);
The first query should return NULL, while the second should (and does) fail.
This is how I understand it.
-markus
On Wednesday, June 12, 2024, Markus Winand <markus.winand@winand.at> wrote:
10.14 SR 1: The declared type of the <value expression> simply contained
in the <JSON input expression> immediately contained in the <JSON context
item> shall be a string type or a JSON type.
It might be best to think of it as two separate functions, overloaded:
JSON_VALUE(context_item JSONB, path_expression …)
JSON_VALUE(context_item TEXT, path_expression …)
Yes, we need to document that we deviate from (fail to fully implement) the
standard here in that we only provide jsonb parameter functions, not text
ones.
David J.
On 04.06.2024, at 07:00, jian he <jian.universality@gmail.com> wrote:
On Tue, May 28, 2024 at 5:29 PM Markus Winand <markus.winand@winand.at> wrote:
2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
a
----
[]
(1 row)As NULL ON EMPTY is implied, it should give the same result as
explicitly adding NULL ON EMPTY:I vaguely remember, we stumbled on ON ERROR, ON EMPTY several times.
i don't have a standard,
In my understanding of the standard is that there is no distinction
between an explicit and implicit ON EMPTY clause.
E.g. clause 6.35 (json_query) Syntax Rule 4:
• If <JSON query empty behavior> is not specified, then NULL ON EMPTY is implicit.
General Rule 5ai then covers the NULL ON EMPTY case:
• i) If the length of SEQ2 is 0 (zero) and ONEMPTY is NULL, then let JV be the null value.
Neither of these make the ON EMPTY handling dependent on the presence of ON ERROR.
-markus
On Tuesday, May 28, 2024, Markus Winand <markus.winand@winand.at> wrote:
2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
a
----
[]
(1 row)As NULL ON EMPTY is implied, it should give the same result as
explicitly adding NULL ON EMPTY:17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON
ERROR) a;
a
---(1 row)
Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
on the other hand returns NULL for both queries.I don’t think that PostgreSQL should follow Oracle DB's suit here
but again, in case this is intentional it should be made explicit
in the docs.
The docs here don’t seem to cover the on empty clause at all nor fully
cover all options.
Where do you find the claim that the one implies the other? Is it a typo
that your examples says “implies null on empty” but the subject line says
“implies error on empty”?
Without those clauses a result is either empty or an error - they are
mutually exclusive (ignoring matches). I would not expect one clause to
imply or affect the behavior of the other. There is no chaining. The
original result is transformed to the new result specified by the clause.
I’d need to figure out whether the example you show is actually producing
empty or error; but it seems correct if the result is empty. The first
query ignores the error clause - the empty array row seems to be the
representation of empty here; the second one matches the empty clause and
outputs null instead of the empty array.
David J.
On 12.06.2024, at 15:31, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Tuesday, May 28, 2024, Markus Winand <markus.winand@winand.at> wrote:
2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
a
----
[]
(1 row)As NULL ON EMPTY is implied, it should give the same result as
explicitly adding NULL ON EMPTY:17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON ERROR) a;
a
---(1 row)
Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
on the other hand returns NULL for both queries.I don’t think that PostgreSQL should follow Oracle DB's suit here
but again, in case this is intentional it should be made explicit
in the docs.The docs here don’t seem to cover the on empty clause at all nor fully cover all options.
Where do you find the claim that the one implies the other? Is it a typo that your examples says “implies null on empty” but the subject line says “implies error on empty”?
I see the confusion caused — sorry. The headline was meant to describe the observed behaviour in 17beta1, while the content refers to how the standard defines it.
Without those clauses a result is either empty or an error - they are mutually exclusive (ignoring matches). I would not expect one clause to imply or affect the behavior of the other. There is no chaining. The original result is transformed to the new result specified by the clause.
Agreed, that’s why I found the 17beta1 behaviour surprising.
I’d need to figure out whether the example you show is actually producing empty or error; but it seems correct if the result is empty.
As I understand the standard, an empty result is not an error.
The first query ignores the error clause - the empty array row seems to be the representation of empty here; the second one matches the empty clause and outputs null instead of the empty array.
But the first should behave the same, as the standard implies NULL ON EMPTY if there is no explicit ON EMPTY clause. Oracle DB behaving differently here makes me wonder if there is something in the standard that I haven’t noticed yet...
-markus
Hi,
(apologies for not replying to this thread sooner)
On Tue, May 28, 2024 at 6:57 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
út 28. 5. 2024 v 11:29 odesílatel Markus Winand <markus.winand@winand.at> napsal:
Hi!
I’ve noticed two “surprising” (to me) behaviors related to
the “ON ERROR” clause of the new JSON query functions in 17beta1.1. JSON parsing errors are not subject to ON ERROR
Apparently, the functions expect JSONB so that a cast is implied
when providing TEXT. However, the errors during that cast are
not subject to the ON ERROR clause.17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
ERROR: invalid input syntax for type json
DETAIL: Token "invalid" is invalid.
CONTEXT: JSON data, line 1: invalidOracle DB and Db2 (LUW) both return NULL in that case.
I had a look on the list archive to see if that is intentional but
frankly speaking these functions came a long way. In case it is
intentional it might be worth adding a note to the docs.I remember a talk about this subject years ago. Originally the JSON_QUERY was designed in similar like Oracle, and casting to jsonb was done inside. If I remember this behave depends on the fact, so old SQL/JSON has not json type and it was based just on processing of plain text. But Postgres has JSON, and JSONB and then was more logical to use these types. And because the JSON_QUERY uses these types, and the casting is done before the execution of the function, then the clause ON ERROR cannot be handled. Moreover, until soft errors Postgres didn't allow handling input errors in common functions.
I think so this difference should be mentioned in documentation.
Agree that the documentation needs to be clear about this. I'll update
my patch at [1]/messages/by-id/CA+HiwqGdineyHfcTEe0=8jjXonH3qXi4vFB+gRxf1L+xR2v_Pw@mail.gmail.com to add a note next to table 9.16.3. SQL/JSON Query
Functions.
2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
a
----
[]
(1 row)As NULL ON EMPTY is implied, it should give the same result as
explicitly adding NULL ON EMPTY:17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON ERROR) a;
a
---(1 row)
Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
on the other hand returns NULL for both queries.I don’t think that PostgreSQL should follow Oracle DB's suit here
but again, in case this is intentional it should be made explicit
in the docs.
This behavior is a bug and result of an unintentional change that I
made at some point after getting involved with this patch set. So I'm
going to fix this so that the empty results of jsonpath evaluation use
NULL ON EMPTY by default, ie, when the ON EMPTY clause is not present.
Attached a patch to do so.
--
Thanks, Amit Langote
[1]: /messages/by-id/CA+HiwqGdineyHfcTEe0=8jjXonH3qXi4vFB+gRxf1L+xR2v_Pw@mail.gmail.com
Attachments:
v1-0001-SQL-JSON-Correctly-enforce-the-default-ON-EMPTY-b.patchapplication/octet-stream; name=v1-0001-SQL-JSON-Correctly-enforce-the-default-ON-EMPTY-b.patchDownload
From ac866439c2a0aabaef5ed7661a1e60699c5df570 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlan@postgresql.org>
Date: Wed, 12 Jun 2024 12:07:57 +0900
Subject: [PATCH v1] SQL/JSON: Correctly enforce the default ON EMPTY behavior
Currently, when the ON EMPTY clause is not present, the ON ERROR
clause (implicit or explicit) dictates the behavior when jsonpath
evaluation in ExecEvalJsonExprPath() results in an empty sequence.
That is an oversight in the commit 6185c9737c.
This commit fixes things so that a NULL is returned instead in that
case which is the default behavior when the ON EMPTY clause is not
present.
Reported-by: Markus Winand
---
src/backend/parser/parse_expr.c | 27 ++++++++++------
.../regress/expected/sqljson_jsontable.out | 19 ++++++++---
.../regress/expected/sqljson_queryfuncs.out | 32 ++++++++++++-------
src/test/regress/sql/sqljson_jsontable.sql | 8 ++---
src/test/regress/sql/sqljson_queryfuncs.sql | 18 +++++------
5 files changed, 64 insertions(+), 40 deletions(-)
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index c47034188e..3f3042d149 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -4423,11 +4423,11 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
coerceJsonExprOutput(pstate, jsexpr);
- if (func->on_empty)
- jsexpr->on_empty = transformJsonBehavior(pstate,
- func->on_empty,
- JSON_BEHAVIOR_NULL,
- jsexpr->returning);
+ /* Assume NULL ON EMPTY when ON EMPTY is not specified. */
+ jsexpr->on_empty = transformJsonBehavior(pstate, func->on_empty,
+ JSON_BEHAVIOR_NULL,
+ jsexpr->returning);
+ /* Assume NULL ON ERROR when ON ERROR is not specified. */
jsexpr->on_error = transformJsonBehavior(pstate, func->on_error,
JSON_BEHAVIOR_NULL,
jsexpr->returning);
@@ -4453,11 +4453,11 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
coerceJsonExprOutput(pstate, jsexpr);
- if (func->on_empty)
- jsexpr->on_empty = transformJsonBehavior(pstate,
- func->on_empty,
- JSON_BEHAVIOR_NULL,
- jsexpr->returning);
+ /* Assume NULL ON EMPTY when ON EMPTY is not specified. */
+ jsexpr->on_empty = transformJsonBehavior(pstate, func->on_empty,
+ JSON_BEHAVIOR_NULL,
+ jsexpr->returning);
+ /* Assume NULL ON ERROR when ON ERROR is not specified. */
jsexpr->on_error = transformJsonBehavior(pstate, func->on_error,
JSON_BEHAVIOR_NULL,
jsexpr->returning);
@@ -4469,6 +4469,13 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
jsexpr->returning->typid = exprType(jsexpr->formatted_expr);
jsexpr->returning->typmod = -1;
}
+
+ /*
+ * Assume EMPTY ON ERROR when ON ERROR is not specified.
+ *
+ * ON EMPTY cannot be specified at the top level but it can be for
+ * the individual columns.
+ */
jsexpr->on_error = transformJsonBehavior(pstate, func->on_error,
JSON_BEHAVIOR_EMPTY,
jsexpr->returning);
diff --git a/src/test/regress/expected/sqljson_jsontable.out b/src/test/regress/expected/sqljson_jsontable.out
index 2d74fba506..3b70ed9548 100644
--- a/src/test/regress/expected/sqljson_jsontable.out
+++ b/src/test/regress/expected/sqljson_jsontable.out
@@ -219,17 +219,17 @@ FROM json_table_test vals
-- Test using casts in DEFAULT .. ON ERROR expression
SELECT * FROM JSON_TABLE(jsonb '{"d1": "H"}', '$'
- COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT '"foo1"'::jsonb::text ON ERROR));
+ COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT '"foo1"'::jsonb::text ON EMPTY));
js1
--------
"foo1"
(1 row)
SELECT * FROM JSON_TABLE(jsonb '{"d1": "H"}', '$'
- COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT 'foo'::jsonb_test_domain ON ERROR));
+ COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT 'foo'::jsonb_test_domain ON EMPTY));
ERROR: value for domain jsonb_test_domain violates check constraint "jsonb_test_domain_check"
SELECT * FROM JSON_TABLE(jsonb '{"d1": "H"}', '$'
- COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT 'foo1'::jsonb_test_domain ON ERROR));
+ COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT 'foo1'::jsonb_test_domain ON EMPTY));
js1
------
foo1
@@ -243,7 +243,7 @@ SELECT * FROM JSON_TABLE(jsonb '{"d1": "foo"}', '$'
(1 row)
SELECT * FROM JSON_TABLE(jsonb '{"d1": "foo"}', '$'
- COLUMNS (js1 oid[] PATH '$.d2' DEFAULT '{1}'::int[]::oid[] ON ERROR));
+ COLUMNS (js1 oid[] PATH '$.d2' DEFAULT '{1}'::int[]::oid[] ON EMPTY));
js1
-----
{1}
@@ -885,7 +885,16 @@ SELECT sub.* FROM s,
xx int path '$.c',
NESTED PATH '$.a.za[1]' columns (NESTED PATH '$.z21[*]' COLUMNS (z21 int path '$?(@ >= $"x")' ERROR ON ERROR))
)) sub;
-ERROR: no SQL/JSON item found for specified path of column "z21"
+ xx | z21
+----+------
+ 3 |
+ 3 | 234
+ 3 | 2345
+ 10 | 32
+ 10 | 134
+ 10 | 1345
+(6 rows)
+
-- Parent columns xx1, xx appear before NESTED ones
SELECT sub.* FROM s,
(VALUES (23)) x(x), generate_series(13, 13) y,
diff --git a/src/test/regress/expected/sqljson_queryfuncs.out b/src/test/regress/expected/sqljson_queryfuncs.out
index 24672524ad..6ca5674cef 100644
--- a/src/test/regress/expected/sqljson_queryfuncs.out
+++ b/src/test/regress/expected/sqljson_queryfuncs.out
@@ -370,11 +370,15 @@ SELECT JSON_VALUE(jsonb '1', 'strict $.a' DEFAULT 'error' ON ERROR);
error
(1 row)
-SELECT JSON_VALUE(jsonb '1', 'lax $.a' ERROR ON ERROR);
-ERROR: no SQL/JSON item found for specified path
+SELECT JSON_VALUE(jsonb '1', 'lax $.a' ERROR ON ERROR); -- NULL ON EMPTY
+ json_value
+------------
+
+(1 row)
+
SELECT JSON_VALUE(jsonb '1', 'lax $.a' ERROR ON EMPTY ERROR ON ERROR);
ERROR: no SQL/JSON item found for specified path
-SELECT JSON_VALUE(jsonb '1', 'strict $.a' DEFAULT 2 ON ERROR);
+SELECT JSON_VALUE(jsonb '1', 'strict $.*' DEFAULT 2 ON ERROR);
json_value
------------
2
@@ -383,10 +387,10 @@ SELECT JSON_VALUE(jsonb '1', 'strict $.a' DEFAULT 2 ON ERROR);
SELECT JSON_VALUE(jsonb '1', 'lax $.a' DEFAULT 2 ON ERROR);
json_value
------------
- 2
+
(1 row)
-SELECT JSON_VALUE(jsonb '1', 'lax $.a' DEFAULT '2' ON ERROR);
+SELECT JSON_VALUE(jsonb '1', 'lax $.a' DEFAULT '2' ON EMPTY);
json_value
------------
2
@@ -783,8 +787,12 @@ SELECT JSON_QUERY(jsonb '[]', '$[*]' ERROR ON EMPTY EMPTY OBJECT ON ERROR);
ERROR: no SQL/JSON item found for specified path
SELECT JSON_QUERY(jsonb '[]', '$[*]' ERROR ON EMPTY ERROR ON ERROR);
ERROR: no SQL/JSON item found for specified path
-SELECT JSON_QUERY(jsonb '[]', '$[*]' ERROR ON ERROR);
-ERROR: no SQL/JSON item found for specified path
+SELECT JSON_QUERY(jsonb '[]', '$[*]' ERROR ON ERROR); -- NULL ON EMPTY
+ json_query
+------------
+
+(1 row)
+
SELECT JSON_QUERY(jsonb '[1,2]', '$[*]' ERROR ON ERROR);
ERROR: JSON path expression in JSON_QUERY should return single item without wrapper
HINT: Use WITH WRAPPER clause to wrap SQL/JSON items into array.
@@ -1036,7 +1044,7 @@ SELECT JSON_QUERY(jsonb '{"a": 1}', '$.b' RETURNING sqljsonb_int_not_null);
(1 row)
-SELECT JSON_QUERY(jsonb '{"a": 1}', '$.b' RETURNING sqljsonb_int_not_null ERROR ON ERROR);
+SELECT JSON_QUERY(jsonb '{"a": 1}', '$.b' RETURNING sqljsonb_int_not_null ERROR ON EMPTY ERROR ON ERROR);
ERROR: no SQL/JSON item found for specified path
-- Test timestamptz passing and output
SELECT JSON_QUERY(jsonb 'null', '$ts' PASSING timestamptz '2018-02-21 12:34:56 +10' AS ts);
@@ -1227,21 +1235,21 @@ LINE 1: SELECT JSON_QUERY(js, '$' RETURNING int DEFAULT (SELECT 1) ...
DROP TABLE test_jsonb_mutability;
DROP FUNCTION ret_setint;
CREATE DOMAIN queryfuncs_test_domain AS text CHECK (value <> 'foo');
-SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo'::queryfuncs_test_domain ON ERROR);
+SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo'::queryfuncs_test_domain ON EMPTY);
ERROR: value for domain queryfuncs_test_domain violates check constraint "queryfuncs_test_domain_check"
-SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo1'::queryfuncs_test_domain ON ERROR);
+SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo1'::queryfuncs_test_domain ON EMPTY);
json_value
------------
foo1
(1 row)
-SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT '"foo1"'::jsonb::text ON ERROR);
+SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT '"foo1"'::jsonb::text ON EMPTY);
json_value
------------
"foo1"
(1 row)
-SELECT JSON_VALUE(jsonb '{"d1": "foo"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo1'::queryfuncs_test_domain ON ERROR);
+SELECT JSON_VALUE(jsonb '{"d1": "foo"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo1'::queryfuncs_test_domain ON EMPTY);
json_value
------------
foo1
diff --git a/src/test/regress/sql/sqljson_jsontable.sql b/src/test/regress/sql/sqljson_jsontable.sql
index f8f996f935..a1f924146e 100644
--- a/src/test/regress/sql/sqljson_jsontable.sql
+++ b/src/test/regress/sql/sqljson_jsontable.sql
@@ -118,19 +118,19 @@ FROM json_table_test vals
-- Test using casts in DEFAULT .. ON ERROR expression
SELECT * FROM JSON_TABLE(jsonb '{"d1": "H"}', '$'
- COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT '"foo1"'::jsonb::text ON ERROR));
+ COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT '"foo1"'::jsonb::text ON EMPTY));
SELECT * FROM JSON_TABLE(jsonb '{"d1": "H"}', '$'
- COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT 'foo'::jsonb_test_domain ON ERROR));
+ COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT 'foo'::jsonb_test_domain ON EMPTY));
SELECT * FROM JSON_TABLE(jsonb '{"d1": "H"}', '$'
- COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT 'foo1'::jsonb_test_domain ON ERROR));
+ COLUMNS (js1 jsonb_test_domain PATH '$.a2' DEFAULT 'foo1'::jsonb_test_domain ON EMPTY));
SELECT * FROM JSON_TABLE(jsonb '{"d1": "foo"}', '$'
COLUMNS (js1 jsonb_test_domain PATH '$.d1' DEFAULT 'foo2'::jsonb_test_domain ON ERROR));
SELECT * FROM JSON_TABLE(jsonb '{"d1": "foo"}', '$'
- COLUMNS (js1 oid[] PATH '$.d2' DEFAULT '{1}'::int[]::oid[] ON ERROR));
+ COLUMNS (js1 oid[] PATH '$.d2' DEFAULT '{1}'::int[]::oid[] ON EMPTY));
-- JSON_TABLE: Test backward parsing
diff --git a/src/test/regress/sql/sqljson_queryfuncs.sql b/src/test/regress/sql/sqljson_queryfuncs.sql
index 7368cfcd92..919ab7e651 100644
--- a/src/test/regress/sql/sqljson_queryfuncs.sql
+++ b/src/test/regress/sql/sqljson_queryfuncs.sql
@@ -89,11 +89,11 @@ SELECT JSON_VALUE(jsonb '{}', '$' ERROR ON ERROR);
SELECT JSON_VALUE(jsonb '1', '$.a');
SELECT JSON_VALUE(jsonb '1', 'strict $.a' ERROR ON ERROR);
SELECT JSON_VALUE(jsonb '1', 'strict $.a' DEFAULT 'error' ON ERROR);
-SELECT JSON_VALUE(jsonb '1', 'lax $.a' ERROR ON ERROR);
+SELECT JSON_VALUE(jsonb '1', 'lax $.a' ERROR ON ERROR); -- NULL ON EMPTY
SELECT JSON_VALUE(jsonb '1', 'lax $.a' ERROR ON EMPTY ERROR ON ERROR);
-SELECT JSON_VALUE(jsonb '1', 'strict $.a' DEFAULT 2 ON ERROR);
+SELECT JSON_VALUE(jsonb '1', 'strict $.*' DEFAULT 2 ON ERROR);
SELECT JSON_VALUE(jsonb '1', 'lax $.a' DEFAULT 2 ON ERROR);
-SELECT JSON_VALUE(jsonb '1', 'lax $.a' DEFAULT '2' ON ERROR);
+SELECT JSON_VALUE(jsonb '1', 'lax $.a' DEFAULT '2' ON EMPTY);
SELECT JSON_VALUE(jsonb '1', 'lax $.a' NULL ON EMPTY DEFAULT '2' ON ERROR);
SELECT JSON_VALUE(jsonb '1', 'lax $.a' DEFAULT '2' ON EMPTY DEFAULT '3' ON ERROR);
SELECT JSON_VALUE(jsonb '1', 'lax $.a' ERROR ON EMPTY DEFAULT '3' ON ERROR);
@@ -227,7 +227,7 @@ SELECT JSON_QUERY(jsonb '[]', '$[*]' ERROR ON EMPTY NULL ON ERROR);
SELECT JSON_QUERY(jsonb '[]', '$[*]' ERROR ON EMPTY EMPTY ARRAY ON ERROR);
SELECT JSON_QUERY(jsonb '[]', '$[*]' ERROR ON EMPTY EMPTY OBJECT ON ERROR);
SELECT JSON_QUERY(jsonb '[]', '$[*]' ERROR ON EMPTY ERROR ON ERROR);
-SELECT JSON_QUERY(jsonb '[]', '$[*]' ERROR ON ERROR);
+SELECT JSON_QUERY(jsonb '[]', '$[*]' ERROR ON ERROR); -- NULL ON EMPTY
SELECT JSON_QUERY(jsonb '[1,2]', '$[*]' ERROR ON ERROR);
SELECT JSON_QUERY(jsonb '[1,2]', '$[*]' DEFAULT '"empty"' ON ERROR);
@@ -306,7 +306,7 @@ SELECT * FROM unnest(JSON_QUERY(jsonb '[{"a": 1, "t": ["foo", []]}, {"a": 2, "jb
-- Extension: domain types returning
SELECT JSON_QUERY(jsonb '{"a": 1}', '$.a' RETURNING sqljsonb_int_not_null);
SELECT JSON_QUERY(jsonb '{"a": 1}', '$.b' RETURNING sqljsonb_int_not_null);
-SELECT JSON_QUERY(jsonb '{"a": 1}', '$.b' RETURNING sqljsonb_int_not_null ERROR ON ERROR);
+SELECT JSON_QUERY(jsonb '{"a": 1}', '$.b' RETURNING sqljsonb_int_not_null ERROR ON EMPTY ERROR ON ERROR);
-- Test timestamptz passing and output
SELECT JSON_QUERY(jsonb 'null', '$ts' PASSING timestamptz '2018-02-21 12:34:56 +10' AS ts);
@@ -414,10 +414,10 @@ DROP TABLE test_jsonb_mutability;
DROP FUNCTION ret_setint;
CREATE DOMAIN queryfuncs_test_domain AS text CHECK (value <> 'foo');
-SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo'::queryfuncs_test_domain ON ERROR);
-SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo1'::queryfuncs_test_domain ON ERROR);
-SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT '"foo1"'::jsonb::text ON ERROR);
-SELECT JSON_VALUE(jsonb '{"d1": "foo"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo1'::queryfuncs_test_domain ON ERROR);
+SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo'::queryfuncs_test_domain ON EMPTY);
+SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo1'::queryfuncs_test_domain ON EMPTY);
+SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT '"foo1"'::jsonb::text ON EMPTY);
+SELECT JSON_VALUE(jsonb '{"d1": "foo"}', '$.a2' RETURNING queryfuncs_test_domain DEFAULT 'foo1'::queryfuncs_test_domain ON EMPTY);
-- Check the cases where a coercion-related expression is masking an
-- unsupported expressions
--
2.43.0
Hi,
On 06/17/24 02:20, Amit Langote wrote:
Apparently, the functions expect JSONB so that a cast is implied
when providing TEXT. However, the errors during that cast are
not subject to the ON ERROR clause.17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
ERROR: invalid input syntax for type json
DETAIL: Token "invalid" is invalid.
CONTEXT: JSON data, line 1: invalidOracle DB and Db2 (LUW) both return NULL in that case.
I wonder, could prosupport rewriting be used to detect that the first
argument is supplied by a cast, and rewrite the expression to apply the
cast 'softly'? Or would that behavior be too magical?
Regards,
-Chap
On 17.06.2024, at 08:20, Amit Langote <amitlangote09@gmail.com> wrote:
Hi,
(apologies for not replying to this thread sooner)
On Tue, May 28, 2024 at 6:57 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
út 28. 5. 2024 v 11:29 odesílatel Markus Winand <markus.winand@winand.at> napsal:
Hi!
I’ve noticed two “surprising” (to me) behaviors related to
the “ON ERROR” clause of the new JSON query functions in 17beta1.1. JSON parsing errors are not subject to ON ERROR
Apparently, the functions expect JSONB so that a cast is implied
when providing TEXT. However, the errors during that cast are
not subject to the ON ERROR clause.17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
ERROR: invalid input syntax for type json
DETAIL: Token "invalid" is invalid.
CONTEXT: JSON data, line 1: invalidOracle DB and Db2 (LUW) both return NULL in that case.
I had a look on the list archive to see if that is intentional but
frankly speaking these functions came a long way. In case it is
intentional it might be worth adding a note to the docs.I remember a talk about this subject years ago. Originally the JSON_QUERY was designed in similar like Oracle, and casting to jsonb was done inside. If I remember this behave depends on the fact, so old SQL/JSON has not json type and it was based just on processing of plain text. But Postgres has JSON, and JSONB and then was more logical to use these types. And because the JSON_QUERY uses these types, and the casting is done before the execution of the function, then the clause ON ERROR cannot be handled. Moreover, until soft errors Postgres didn't allow handling input errors in common functions.
I think so this difference should be mentioned in documentation.
Agree that the documentation needs to be clear about this. I'll update
my patch at [1] to add a note next to table 9.16.3. SQL/JSON Query
Functions.
Considering another branch of this thread [1]/messages/by-id/CAKFQuwb50BAaj83Np+1O6xe3_T6DO8w2mxtFbgSbbUng+abrqA@mail.gmail.com I think the
"Supported Features” appendix of the docs should mention that as well.
The way I see it is that the standards defines two overloaded
JSON_QUERY functions, of which PostgreSQL will support only one.
In case of valid JSON, the implied CAST makes it look as though
the second variant of these function was supported as well but that
illusion totally falls apart once the JSON is not valid anymore.
I think it affects the following feature IDs:
- T821, Basic SQL/JSON query operators
For JSON_VALUE, JSON_TABLE and JSON_EXISTS
- T828, JSON_QUERY
Also, how hard would it be to add the functions that accept
character strings? Is there, besides the effort, any thing else
against it? I’m asking because I believe once released it might
never be changed — for backward compatibility.
[1]: /messages/by-id/CAKFQuwb50BAaj83Np+1O6xe3_T6DO8w2mxtFbgSbbUng+abrqA@mail.gmail.com
2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
a
----
[]
(1 row)As NULL ON EMPTY is implied, it should give the same result as
explicitly adding NULL ON EMPTY:17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON ERROR) a;
a
---(1 row)
Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
on the other hand returns NULL for both queries.I don’t think that PostgreSQL should follow Oracle DB's suit here
but again, in case this is intentional it should be made explicit
in the docs.This behavior is a bug and result of an unintentional change that I
made at some point after getting involved with this patch set. So I'm
going to fix this so that the empty results of jsonpath evaluation use
NULL ON EMPTY by default, ie, when the ON EMPTY clause is not present.
Attached a patch to do so.
Tested: works.
Thanks :)
-markus
po 17. 6. 2024 v 15:07 odesílatel Markus Winand <markus.winand@winand.at>
napsal:
On 17.06.2024, at 08:20, Amit Langote <amitlangote09@gmail.com> wrote:
Hi,
(apologies for not replying to this thread sooner)
On Tue, May 28, 2024 at 6:57 PM Pavel Stehule <pavel.stehule@gmail.com>
wrote:
út 28. 5. 2024 v 11:29 odesílatel Markus Winand <
markus.winand@winand.at> napsal:
Hi!
I’ve noticed two “surprising” (to me) behaviors related to
the “ON ERROR” clause of the new JSON query functions in 17beta1.1. JSON parsing errors are not subject to ON ERROR
Apparently, the functions expect JSONB so that a cast is implied
when providing TEXT. However, the errors during that cast are
not subject to the ON ERROR clause.17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
ERROR: invalid input syntax for type json
DETAIL: Token "invalid" is invalid.
CONTEXT: JSON data, line 1: invalidOracle DB and Db2 (LUW) both return NULL in that case.
I had a look on the list archive to see if that is intentional but
frankly speaking these functions came a long way. In case it is
intentional it might be worth adding a note to the docs.I remember a talk about this subject years ago. Originally the
JSON_QUERY was designed in similar like Oracle, and casting to jsonb was
done inside. If I remember this behave depends on the fact, so old SQL/JSON
has not json type and it was based just on processing of plain text. But
Postgres has JSON, and JSONB and then was more logical to use these types.
And because the JSON_QUERY uses these types, and the casting is done before
the execution of the function, then the clause ON ERROR cannot be handled.
Moreover, until soft errors Postgres didn't allow handling input errors in
common functions.I think so this difference should be mentioned in documentation.
Agree that the documentation needs to be clear about this. I'll update
my patch at [1] to add a note next to table 9.16.3. SQL/JSON Query
Functions.Considering another branch of this thread [1] I think the
"Supported Features” appendix of the docs should mention that as well.The way I see it is that the standards defines two overloaded
JSON_QUERY functions, of which PostgreSQL will support only one.
In case of valid JSON, the implied CAST makes it look as though
the second variant of these function was supported as well but that
illusion totally falls apart once the JSON is not valid anymore.I think it affects the following feature IDs:
- T821, Basic SQL/JSON query operators
For JSON_VALUE, JSON_TABLE and JSON_EXISTS
- T828, JSON_QUERYAlso, how hard would it be to add the functions that accept
character strings? Is there, besides the effort, any thing else
against it? I’m asking because I believe once released it might
never be changed — for backward compatibility.
It is easy to add the function that accepts text, but when you have
overloaded functions, then varchar or text is the preferred type, and then
the arguments will be casted to text by default instead of json. You can
have one function with argument of type "any", but then the
execution is a little bit slower (outer cast is faster than cast inside
function), and again the Postgres cannot deduce used argument types from
function's argument types.
Probably this can be solved if we introduce a new kind of type, where the
preferred type will be json, or jsonb.
So the problem is in the definition of implementation details about the
mechanism of type deduction (when you use string literal or when you use
string expression).
So now, when you will write json_table(x1 || x2), then and x1 and x2 are of
unknown type, then Postgres can know, so x1 and x2 will be jsonb, but when
there
will be secondary function json_table(text), then Postgres detects problem,
and use preferred type (that is text).
Generally, Postgres supports function overloading and it is working well
between text, int and numeric where constants have different syntax, but
when the constant
literal has the same syntax, there can be problems with hidden casts to
unwanted type, so not overloaded function is ideal. These issues can be
solved in the analysis stage by special code, but again it increases code
complexity.
Show quoted text
[1]
/messages/by-id/CAKFQuwb50BAaj83Np+1O6xe3_T6DO8w2mxtFbgSbbUng+abrqA@mail.gmail.com2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
a
----
[]
(1 row)As NULL ON EMPTY is implied, it should give the same result as
explicitly adding NULL ON EMPTY:17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY
ON ERROR) a;
a
---(1 row)
Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
on the other hand returns NULL for both queries.I don’t think that PostgreSQL should follow Oracle DB's suit here
but again, in case this is intentional it should be made explicit
in the docs.This behavior is a bug and result of an unintentional change that I
made at some point after getting involved with this patch set. So I'm
going to fix this so that the empty results of jsonpath evaluation use
NULL ON EMPTY by default, ie, when the ON EMPTY clause is not present.
Attached a patch to do so.Tested: works.
Thanks :)
-markus
On Mon, Jun 17, 2024 at 10:07 PM Markus Winand <markus.winand@winand.at> wrote:
On 17.06.2024, at 08:20, Amit Langote <amitlangote09@gmail.com> wrote:
2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
a
----
[]
(1 row)As NULL ON EMPTY is implied, it should give the same result as
explicitly adding NULL ON EMPTY:17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON ERROR) a;
a
---(1 row)
Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
on the other hand returns NULL for both queries.I don’t think that PostgreSQL should follow Oracle DB's suit here
but again, in case this is intentional it should be made explicit
in the docs.This behavior is a bug and result of an unintentional change that I
made at some point after getting involved with this patch set. So I'm
going to fix this so that the empty results of jsonpath evaluation use
NULL ON EMPTY by default, ie, when the ON EMPTY clause is not present.
Attached a patch to do so.Tested: works.
Pushed, thanks for testing.
I'll work on the documentation updates that may be needed based on
this and nearby discussion(s).
--
Thanks, Amit Langote
On Mon, Jun 17, 2024 at 9:07 PM Markus Winand <markus.winand@winand.at> wrote:
I think it affects the following feature IDs:
- T821, Basic SQL/JSON query operators
For JSON_VALUE, JSON_TABLE and JSON_EXISTS
- T828, JSON_QUERYAlso, how hard would it be to add the functions that accept
character strings? Is there, besides the effort, any thing else
against it? I’m asking because I believe once released it might
never be changed — for backward compatibility.
we have ExecEvalCoerceViaIOSafe, so it's doable.
I tried, but other things break. so it's not super easy i think.
because of eval_const_expressions_mutator, postgres will constantly
evaluate simple const expressions to simplify some expressions.
`SELECT JSON_QUERY('a', '$');`
postgres will try to do the cast coercion from text 'a' to jsonb. but
it will fail, but it's hard to send the cast failed information to
later code,
in ExecInterpExpr. in ExecEvalCoerceViaIOSafe, if we cast coercion
failed, then this function value is zero, isnull is set to true.
`SELECT JSON_QUERY('a', '$');`
will be equivalent to
`SELECT JSON_QUERY(NULL, '$');`
so making one of the next 2 examples to return jsonb 1 would be hard
to implement.
SELECT JSON_QUERY('a', '$' default '1' on empty);
SELECT JSON_QUERY('a', '$' default '1' on error);
--------------------------------------------------------------------------
If the standard says the context_item can be text string (cannot cast
to json successfully). future version we make it happen,
then it should be fine?
because it's like the previous version we are not fully compliant with
standard, now the new version is in full compliance with standard.
Hi,
On Mon, Jun 17, 2024 at 9:47 PM Chapman Flack <jcflack@acm.org> wrote:
On 06/17/24 02:20, Amit Langote wrote:
Apparently, the functions expect JSONB so that a cast is implied
when providing TEXT. However, the errors during that cast are
not subject to the ON ERROR clause.17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
ERROR: invalid input syntax for type json
DETAIL: Token "invalid" is invalid.
CONTEXT: JSON data, line 1: invalidOracle DB and Db2 (LUW) both return NULL in that case.
I wonder, could prosupport rewriting be used to detect that the first
argument is supplied by a cast, and rewrite the expression to apply the
cast 'softly'? Or would that behavior be too magical?
I don't think prosupport rewriting can be used, because JSON_QUERY().
We could possibly use "runtime coercion" for context_item so that the
coercion errors can be "caught", which is how we coerce the jsonpath
result to the RETURNING type.
For now, I'm inclined to simply document the limitation that errors
when coercing string arguments to json are always thrown.
--
Thanks, Amit Langote
Hi,
On Mon, Jun 17, 2024 at 10:07 PM Markus Winand <markus.winand@winand.at> wrote:
On 17.06.2024, at 08:20, Amit Langote <amitlangote09@gmail.com> wrote:
Agree that the documentation needs to be clear about this. I'll update
my patch at [1] to add a note next to table 9.16.3. SQL/JSON Query
Functions.Considering another branch of this thread [1] I think the
"Supported Features” appendix of the docs should mention that as well.The way I see it is that the standards defines two overloaded
JSON_QUERY functions, of which PostgreSQL will support only one.
In case of valid JSON, the implied CAST makes it look as though
the second variant of these function was supported as well but that
illusion totally falls apart once the JSON is not valid anymore.I think it affects the following feature IDs:
- T821, Basic SQL/JSON query operators
For JSON_VALUE, JSON_TABLE and JSON_EXISTS
- T828, JSON_QUERYAlso, how hard would it be to add the functions that accept
character strings? Is there, besides the effort, any thing else
against it? I’m asking because I believe once released it might
never be changed — for backward compatibility.
Hmm, I'm starting to think that adding the implied cast to json wasn't
such a great idea after all, because it might mislead the users to
think that JSON parsing is transparent (respects ON ERROR), which is
what you are saying, IIUC.
I'm inclined to push the attached patch which puts back the
restriction to allow only jsonb arguments, asking users to add an
explicit cast if necessary.
--
Thanks, Amit Langote
Attachments:
v1-0001-SQL-JSON-Only-allow-arguments-of-jsonb-type.patchapplication/octet-stream; name=v1-0001-SQL-JSON-Only-allow-arguments-of-jsonb-type.patchDownload
From 1bbb72929d80977b0f3d977a3e4f98d9786f78db Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlan@postgresql.org>
Date: Thu, 20 Jun 2024 17:49:56 +0900
Subject: [PATCH v1] SQL/JSON: Only allow arguments of jsonb type
Currently, functions JSON_QUERY(), JSON_VALUE(), JSON_EXISTS(),
and JSON_TABLE() allow users to specify arguments of character
string types or bytea which are implicitly cast to jsonb. Any
errors that may occur from such coercion are always thrown, which
Markus Winand pointed out do not get ON ERROR treatment.
So, instead require the users to cast the argument to jsonb to make
it clear that transparent JSON parsing is not supported.
Reported-by: Markus Winand
Discussion: https://postgr.es/m/F7DD1442-265C-4220-A603-CB0DEB77E91D%40winand.at
---
doc/src/sgml/func.sgml | 8 +++
src/backend/catalog/sql_features.txt | 4 +-
src/backend/parser/parse_expr.c | 16 +++---
.../regress/expected/sqljson_jsontable.out | 30 ++++++----
.../regress/expected/sqljson_queryfuncs.out | 57 ++++++++++---------
src/test/regress/sql/sqljson_jsontable.sql | 17 +++---
src/test/regress/sql/sqljson_queryfuncs.sql | 27 ++++-----
7 files changed, 90 insertions(+), 69 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4ebf4b3ed8..c1479b11f2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18671,6 +18671,14 @@ $.* ? (@ like_regex "^\\d+$")
<replaceable>path_expression</replaceable> can contain.
</para>
+ <note>
+ <para>
+ SQL/JSON paths can only be applied to the <type>jsonb</type> type, so it
+ might be necessary to cast the <replaceable>context_item</replaceable>
+ argument of these functions to <type>jsonb</type>.
+ </para>
+ </note>
+
<table id="functions-sqljson-querying">
<title>SQL/JSON Query Functions</title>
<tgroup cols="1">
diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index c002f37202..cc9bf67a46 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -547,14 +547,14 @@ T811 Basic SQL/JSON constructor functions YES
T812 SQL/JSON: JSON_OBJECTAGG YES
T813 SQL/JSON: JSON_ARRAYAGG with ORDER BY YES
T814 Colon in JSON_OBJECT or JSON_OBJECTAGG YES
-T821 Basic SQL/JSON query operators YES
+T821 Basic SQL/JSON query operators YES only jsonb arguments supported
T822 SQL/JSON: IS JSON WITH UNIQUE KEYS predicate YES
T823 SQL/JSON: PASSING clause YES
T824 JSON_TABLE: specific PLAN clause NO
T825 SQL/JSON: ON EMPTY and ON ERROR clauses YES
T826 General value expression in ON ERROR or ON EMPTY clauses YES
T827 JSON_TABLE: sibling NESTED COLUMNS clauses YES
-T828 JSON_QUERY YES
+T828 JSON_QUERY YES only jsonb arguments supported
T829 JSON_QUERY: array wrapper options YES
T830 Enforcing unique keys in SQL/JSON constructor functions YES
T831 SQL/JSON path language: strict mode YES
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index a82621fa67..67e3730e5f 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -4265,29 +4265,23 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
JsonExpr *jsexpr;
Node *path_spec;
const char *func_name = NULL;
- JsonFormatType default_format;
switch (func->op)
{
case JSON_EXISTS_OP:
func_name = "JSON_EXISTS";
- default_format = JS_FORMAT_DEFAULT;
break;
case JSON_QUERY_OP:
func_name = "JSON_QUERY";
- default_format = JS_FORMAT_JSONB;
break;
case JSON_VALUE_OP:
func_name = "JSON_VALUE";
- default_format = JS_FORMAT_DEFAULT;
break;
case JSON_TABLE_OP:
func_name = "JSON_TABLE";
- default_format = JS_FORMAT_JSONB;
break;
default:
elog(ERROR, "invalid JsonFuncExpr op %d", (int) func->op);
- default_format = JS_FORMAT_DEFAULT; /* keep compiler quiet */
break;
}
@@ -4330,9 +4324,15 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
*/
jsexpr->formatted_expr = transformJsonValueExpr(pstate, func_name,
func->context_item,
- default_format,
- JSONBOID,
+ JS_FORMAT_DEFAULT,
+ InvalidOid,
false);
+ if (exprType(jsexpr->formatted_expr) != JSONBOID)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("%s() only accepts arguments of jsonb type", func_name),
+ errhint("Try casting the argument to jsonb"),
+ parser_errposition(pstate, func->location)));
jsexpr->format = func->context_item->format;
path_spec = transformExprRecurse(pstate, func->pathspec);
diff --git a/src/test/regress/expected/sqljson_jsontable.out b/src/test/regress/expected/sqljson_jsontable.out
index 3b70ed9548..c3fc5c0ba9 100644
--- a/src/test/regress/expected/sqljson_jsontable.out
+++ b/src/test/regress/expected/sqljson_jsontable.out
@@ -1,26 +1,26 @@
-- JSON_TABLE
-- Should fail (JSON_TABLE can be used only in FROM clause)
-SELECT JSON_TABLE('[]', '$');
+SELECT JSON_TABLE(jsonb '[]', '$');
ERROR: syntax error at or near "("
-LINE 1: SELECT JSON_TABLE('[]', '$');
+LINE 1: SELECT JSON_TABLE(jsonb '[]', '$');
^
-- Only allow EMPTY and ERROR for ON ERROR
-SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') DEFAULT 1 ON ERROR);
+SELECT * FROM JSON_TABLE(jsonb '[]', 'strict $.a' COLUMNS (js2 int PATH '$') DEFAULT 1 ON ERROR);
ERROR: invalid ON ERROR behavior
-LINE 1: ...BLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') DEFAULT 1 ...
+LINE 1: ...onb '[]', 'strict $.a' COLUMNS (js2 int PATH '$') DEFAULT 1 ...
^
DETAIL: Only EMPTY or ERROR is allowed in the top-level ON ERROR clause.
-SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') NULL ON ERROR);
+SELECT * FROM JSON_TABLE(jsonb '[]', 'strict $.a' COLUMNS (js2 int PATH '$') NULL ON ERROR);
ERROR: invalid ON ERROR behavior
-LINE 1: ...BLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') NULL ON ER...
+LINE 1: ...onb '[]', 'strict $.a' COLUMNS (js2 int PATH '$') NULL ON ER...
^
DETAIL: Only EMPTY or ERROR is allowed in the top-level ON ERROR clause.
-SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') EMPTY ON ERROR);
+SELECT * FROM JSON_TABLE(jsonb '[]', 'strict $.a' COLUMNS (js2 int PATH '$') EMPTY ON ERROR);
js2
-----
(0 rows)
-SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') ERROR ON ERROR);
+SELECT * FROM JSON_TABLE(jsonb '[]', 'strict $.a' COLUMNS (js2 int PATH '$') ERROR ON ERROR);
ERROR: jsonpath member accessor can only be applied to an object
-- Column and path names must be distinct
SELECT * FROM JSON_TABLE(jsonb'"1.23"', '$.a' as js2 COLUMNS (js2 int path '$'));
@@ -28,10 +28,10 @@ ERROR: duplicate JSON_TABLE column or path name: js2
LINE 1: ...M JSON_TABLE(jsonb'"1.23"', '$.a' as js2 COLUMNS (js2 int pa...
^
-- Should fail (no columns)
-SELECT * FROM JSON_TABLE(NULL, '$' COLUMNS ());
+SELECT * FROM JSON_TABLE(NULL::jsonb, '$' COLUMNS ());
ERROR: syntax error at or near ")"
-LINE 1: SELECT * FROM JSON_TABLE(NULL, '$' COLUMNS ());
- ^
+LINE 1: SELECT * FROM JSON_TABLE(NULL::jsonb, '$' COLUMNS ());
+ ^
SELECT * FROM JSON_TABLE (NULL::jsonb, '$' COLUMNS (v1 timestamp)) AS f (v1, v2);
ERROR: JSON_TABLE function has 1 columns available but 2 columns specified
--duplicated column name
@@ -670,7 +670,7 @@ ERROR: only string constants are supported in JSON_TABLE path specification
LINE 1: SELECT * FROM JSON_TABLE(jsonb '{"a": 123}', '$' || '.' || '...
^
-- JsonPathQuery() error message mentioning column name
-SELECT * FROM JSON_TABLE('{"a": [{"b": "1"}, {"b": "2"}]}', '$' COLUMNS (b json path '$.a[*].b' ERROR ON ERROR));
+SELECT * FROM JSON_TABLE(jsonb '{"a": [{"b": "1"}, {"b": "2"}]}', '$' COLUMNS (b json path '$.a[*].b' ERROR ON ERROR));
ERROR: JSON path expression for column "b" should return single item without wrapper
HINT: Use WITH WRAPPER clause to wrap SQL/JSON items into array.
-- JSON_TABLE: nested paths
@@ -1067,3 +1067,9 @@ CREATE OR REPLACE VIEW public.jsonb_table_view7 AS
) sub
DROP VIEW jsonb_table_view7;
DROP TABLE s;
+-- Only jsonb arguments are supported currently
+SELECT * FROM JSON_TABLE('[]', '$' COLUMNS(a int));
+ERROR: JSON_TABLE() only accepts arguments of jsonb type
+LINE 1: SELECT * FROM JSON_TABLE('[]', '$' COLUMNS(a int));
+ ^
+HINT: Try casting the argument to jsonb
diff --git a/src/test/regress/expected/sqljson_queryfuncs.out b/src/test/regress/expected/sqljson_queryfuncs.out
index 6ca5674cef..df02aa08ef 100644
--- a/src/test/regress/expected/sqljson_queryfuncs.out
+++ b/src/test/regress/expected/sqljson_queryfuncs.out
@@ -532,9 +532,9 @@ SELECT JSON_VALUE(jsonb 'null', '$ts' PASSING timestamptz '2018-02-21 12:34:56 +
(1 row)
-- Test that numeric JSON values are coerced uniformly
-select json_value('{"a": 1.234}', '$.a' returning int error on error);
+select json_value(jsonb '{"a": 1.234}', '$.a' returning int error on error);
ERROR: invalid input syntax for type integer: "1.234"
-select json_value('{"a": "1.234"}', '$.a' returning int error on error);
+select json_value(jsonb '{"a": "1.234"}', '$.a' returning int error on error);
ERROR: invalid input syntax for type integer: "1.234"
-- JSON_QUERY
SELECT JSON_VALUE(NULL::jsonb, '$');
@@ -1258,36 +1258,36 @@ SELECT JSON_VALUE(jsonb '{"d1": "foo"}', '$.a2' RETURNING queryfuncs_test_domain
-- Check the cases where a coercion-related expression is masking an
-- unsupported expressions
-- CoerceViaIO
-SELECT JSON_QUERY('"a"', '$.a' RETURNING int DEFAULT (SELECT '"1"')::jsonb ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING int DEFAULT (SELECT '"1"')::jsonb ON ERROR);
ERROR: can only specify a constant, non-aggregate function, or operator expression for DEFAULT
-LINE 1: ...CT JSON_QUERY('"a"', '$.a' RETURNING int DEFAULT (SELECT '"...
+LINE 1: ...N_QUERY(jsonb '"a"', '$.a' RETURNING int DEFAULT (SELECT '"...
^
-- CoerceToDomain
-SELECT JSON_QUERY('"a"', '$.a' RETURNING queryfuncs_test_domain DEFAULT (select '"1"')::queryfuncs_test_domain ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING queryfuncs_test_domain DEFAULT (select '"1"')::queryfuncs_test_domain ON ERROR);
ERROR: can only specify a constant, non-aggregate function, or operator expression for DEFAULT
LINE 1: ...', '$.a' RETURNING queryfuncs_test_domain DEFAULT (select '"...
^
-- RelabelType
-SELECT JSON_QUERY('"a"', '$.a' RETURNING int DEFAULT (SELECT 1)::oid::int ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING int DEFAULT (SELECT 1)::oid::int ON ERROR);
ERROR: can only specify a constant, non-aggregate function, or operator expression for DEFAULT
-LINE 1: ...CT JSON_QUERY('"a"', '$.a' RETURNING int DEFAULT (SELECT 1)...
+LINE 1: ...N_QUERY(jsonb '"a"', '$.a' RETURNING int DEFAULT (SELECT 1)...
^
-- ArrayCoerceExpr
-SELECT JSON_QUERY('"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{1}')::oid[]::int[] ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{1}')::oid[]::int[] ON ERROR);
ERROR: can only specify a constant, non-aggregate function, or operator expression for DEFAULT
-LINE 1: ... JSON_QUERY('"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{...
+LINE 1: ...QUERY(jsonb '"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{...
^
-- CollateExpr
-SELECT JSON_QUERY('"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{1}')::text COLLATE "C" ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{1}')::text COLLATE "C" ON ERROR);
ERROR: can only specify a constant, non-aggregate function, or operator expression for DEFAULT
-LINE 1: ... JSON_QUERY('"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{...
+LINE 1: ...QUERY(jsonb '"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{...
^
-- ConvertRowtypeExpr
CREATE TABLE someparent (a int);
CREATE TABLE somechild () INHERITS (someparent);
-SELECT JSON_QUERY('"a"', '$.a' RETURNING someparent DEFAULT (SELECT '(1)')::somechild::someparent ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING someparent DEFAULT (SELECT '(1)')::somechild::someparent ON ERROR);
ERROR: can only specify a constant, non-aggregate function, or operator expression for DEFAULT
-LINE 1: ..._QUERY('"a"', '$.a' RETURNING someparent DEFAULT (SELECT '(...
+LINE 1: ...(jsonb '"a"', '$.a' RETURNING someparent DEFAULT (SELECT '(...
^
DROP DOMAIN queryfuncs_test_domain;
DROP TABLE someparent, somechild;
@@ -1325,22 +1325,9 @@ SELECT JSON_QUERY(jsonb '{"a": 123}', '$' || '.' || 'a' WITH WRAPPER);
-- Should fail (invalid path)
SELECT JSON_QUERY(jsonb '{"a": 123}', 'error' || ' ' || 'error');
ERROR: syntax error at or near " " of jsonpath input
--- Non-jsonb inputs automatically coerced to jsonb
-SELECT JSON_EXISTS(json '{"a": 123}', '$' || '.' || 'a');
- json_exists
--------------
- t
-(1 row)
-
-SELECT JSON_QUERY(NULL FORMAT JSON, '$');
- json_query
-------------
-
-(1 row)
-
-- Test non-const jsonpath
CREATE TEMP TABLE jsonpaths (path) AS SELECT '$';
-SELECT json_value('"aaa"', path RETURNING json) FROM jsonpaths;
+SELECT json_value(jsonb '"aaa"', path RETURNING json) FROM jsonpaths;
json_value
------------
"aaa"
@@ -1357,3 +1344,19 @@ SELECT JSON_QUERY(jsonb 'null', '$xyz' PASSING 1 AS xyz);
1
(1 row)
+-- Only jsonb arguments are supported currently
+SELECT JSON_EXISTS(NULL, '$');
+ERROR: JSON_EXISTS() only accepts arguments of jsonb type
+LINE 1: SELECT JSON_EXISTS(NULL, '$');
+ ^
+HINT: Try casting the argument to jsonb
+SELECT JSON_QUERY('1', '$');
+ERROR: JSON_QUERY() only accepts arguments of jsonb type
+LINE 1: SELECT JSON_QUERY('1', '$');
+ ^
+HINT: Try casting the argument to jsonb
+SELECT JSON_VALUE('aaa', '$');
+ERROR: JSON_VALUE() only accepts arguments of jsonb type
+LINE 1: SELECT JSON_VALUE('aaa', '$');
+ ^
+HINT: Try casting the argument to jsonb
diff --git a/src/test/regress/sql/sqljson_jsontable.sql b/src/test/regress/sql/sqljson_jsontable.sql
index a1f924146e..c956f90c5e 100644
--- a/src/test/regress/sql/sqljson_jsontable.sql
+++ b/src/test/regress/sql/sqljson_jsontable.sql
@@ -1,19 +1,19 @@
-- JSON_TABLE
-- Should fail (JSON_TABLE can be used only in FROM clause)
-SELECT JSON_TABLE('[]', '$');
+SELECT JSON_TABLE(jsonb '[]', '$');
-- Only allow EMPTY and ERROR for ON ERROR
-SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') DEFAULT 1 ON ERROR);
-SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') NULL ON ERROR);
-SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') EMPTY ON ERROR);
-SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') ERROR ON ERROR);
+SELECT * FROM JSON_TABLE(jsonb '[]', 'strict $.a' COLUMNS (js2 int PATH '$') DEFAULT 1 ON ERROR);
+SELECT * FROM JSON_TABLE(jsonb '[]', 'strict $.a' COLUMNS (js2 int PATH '$') NULL ON ERROR);
+SELECT * FROM JSON_TABLE(jsonb '[]', 'strict $.a' COLUMNS (js2 int PATH '$') EMPTY ON ERROR);
+SELECT * FROM JSON_TABLE(jsonb '[]', 'strict $.a' COLUMNS (js2 int PATH '$') ERROR ON ERROR);
-- Column and path names must be distinct
SELECT * FROM JSON_TABLE(jsonb'"1.23"', '$.a' as js2 COLUMNS (js2 int path '$'));
-- Should fail (no columns)
-SELECT * FROM JSON_TABLE(NULL, '$' COLUMNS ());
+SELECT * FROM JSON_TABLE(NULL::jsonb, '$' COLUMNS ());
SELECT * FROM JSON_TABLE (NULL::jsonb, '$' COLUMNS (v1 timestamp)) AS f (v1, v2);
@@ -307,7 +307,7 @@ FROM JSON_TABLE(
SELECT * FROM JSON_TABLE(jsonb '{"a": 123}', '$' || '.' || 'a' COLUMNS (foo int));
-- JsonPathQuery() error message mentioning column name
-SELECT * FROM JSON_TABLE('{"a": [{"b": "1"}, {"b": "2"}]}', '$' COLUMNS (b json path '$.a[*].b' ERROR ON ERROR));
+SELECT * FROM JSON_TABLE(jsonb '{"a": [{"b": "1"}, {"b": "2"}]}', '$' COLUMNS (b json path '$.a[*].b' ERROR ON ERROR));
-- JSON_TABLE: nested paths
@@ -518,3 +518,6 @@ SELECT sub.* FROM s,
\sv jsonb_table_view7
DROP VIEW jsonb_table_view7;
DROP TABLE s;
+
+-- Only jsonb arguments are supported currently
+SELECT * FROM JSON_TABLE('[]', '$' COLUMNS(a int));
diff --git a/src/test/regress/sql/sqljson_queryfuncs.sql b/src/test/regress/sql/sqljson_queryfuncs.sql
index 919ab7e651..fdd6903f7f 100644
--- a/src/test/regress/sql/sqljson_queryfuncs.sql
+++ b/src/test/regress/sql/sqljson_queryfuncs.sql
@@ -138,8 +138,8 @@ SELECT JSON_VALUE(jsonb 'null', '$ts' PASSING timestamptz '2018-02-21 12:34:56 +
SELECT JSON_VALUE(jsonb 'null', '$ts' PASSING timestamptz '2018-02-21 12:34:56 +10' AS ts RETURNING jsonb);
-- Test that numeric JSON values are coerced uniformly
-select json_value('{"a": 1.234}', '$.a' returning int error on error);
-select json_value('{"a": "1.234"}', '$.a' returning int error on error);
+select json_value(jsonb '{"a": 1.234}', '$.a' returning int error on error);
+select json_value(jsonb '{"a": "1.234"}', '$.a' returning int error on error);
-- JSON_QUERY
@@ -423,19 +423,19 @@ SELECT JSON_VALUE(jsonb '{"d1": "foo"}', '$.a2' RETURNING queryfuncs_test_domain
-- unsupported expressions
-- CoerceViaIO
-SELECT JSON_QUERY('"a"', '$.a' RETURNING int DEFAULT (SELECT '"1"')::jsonb ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING int DEFAULT (SELECT '"1"')::jsonb ON ERROR);
-- CoerceToDomain
-SELECT JSON_QUERY('"a"', '$.a' RETURNING queryfuncs_test_domain DEFAULT (select '"1"')::queryfuncs_test_domain ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING queryfuncs_test_domain DEFAULT (select '"1"')::queryfuncs_test_domain ON ERROR);
-- RelabelType
-SELECT JSON_QUERY('"a"', '$.a' RETURNING int DEFAULT (SELECT 1)::oid::int ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING int DEFAULT (SELECT 1)::oid::int ON ERROR);
-- ArrayCoerceExpr
-SELECT JSON_QUERY('"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{1}')::oid[]::int[] ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{1}')::oid[]::int[] ON ERROR);
-- CollateExpr
-SELECT JSON_QUERY('"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{1}')::text COLLATE "C" ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING int[] DEFAULT (SELECT '{1}')::text COLLATE "C" ON ERROR);
-- ConvertRowtypeExpr
CREATE TABLE someparent (a int);
CREATE TABLE somechild () INHERITS (someparent);
-SELECT JSON_QUERY('"a"', '$.a' RETURNING someparent DEFAULT (SELECT '(1)')::somechild::someparent ON ERROR);
+SELECT JSON_QUERY(jsonb '"a"', '$.a' RETURNING someparent DEFAULT (SELECT '(1)')::somechild::someparent ON ERROR);
DROP DOMAIN queryfuncs_test_domain;
DROP TABLE someparent, somechild;
@@ -449,15 +449,16 @@ SELECT JSON_QUERY(jsonb '{"a": 123}', '$' || '.' || 'a' WITH WRAPPER);
-- Should fail (invalid path)
SELECT JSON_QUERY(jsonb '{"a": 123}', 'error' || ' ' || 'error');
--- Non-jsonb inputs automatically coerced to jsonb
-SELECT JSON_EXISTS(json '{"a": 123}', '$' || '.' || 'a');
-SELECT JSON_QUERY(NULL FORMAT JSON, '$');
-
-- Test non-const jsonpath
CREATE TEMP TABLE jsonpaths (path) AS SELECT '$';
-SELECT json_value('"aaa"', path RETURNING json) FROM jsonpaths;
+SELECT json_value(jsonb '"aaa"', path RETURNING json) FROM jsonpaths;
-- Test PASSING argument parsing
SELECT JSON_QUERY(jsonb 'null', '$xyz' PASSING 1 AS xy);
SELECT JSON_QUERY(jsonb 'null', '$xy' PASSING 1 AS xyz);
SELECT JSON_QUERY(jsonb 'null', '$xyz' PASSING 1 AS xyz);
+
+-- Only jsonb arguments are supported currently
+SELECT JSON_EXISTS(NULL, '$');
+SELECT JSON_QUERY('1', '$');
+SELECT JSON_VALUE('aaa', '$');
--
2.43.0
On Thu, Jun 20, 2024 at 2:19 AM Amit Langote <amitlangote09@gmail.com>
wrote:
Hi,
On Mon, Jun 17, 2024 at 10:07 PM Markus Winand <markus.winand@winand.at>
wrote:On 17.06.2024, at 08:20, Amit Langote <amitlangote09@gmail.com> wrote:
Agree that the documentation needs to be clear about this. I'll update
my patch at [1] to add a note next to table 9.16.3. SQL/JSON Query
Functions.Considering another branch of this thread [1] I think the
"Supported Features” appendix of the docs should mention that as well.The way I see it is that the standards defines two overloaded
JSON_QUERY functions, of which PostgreSQL will support only one.
In case of valid JSON, the implied CAST makes it look as though
the second variant of these function was supported as well but that
illusion totally falls apart once the JSON is not valid anymore.I think it affects the following feature IDs:
- T821, Basic SQL/JSON query operators
For JSON_VALUE, JSON_TABLE and JSON_EXISTS
- T828, JSON_QUERYAlso, how hard would it be to add the functions that accept
character strings? Is there, besides the effort, any thing else
against it? I’m asking because I believe once released it might
never be changed — for backward compatibility.Hmm, I'm starting to think that adding the implied cast to json wasn't
such a great idea after all, because it might mislead the users to
think that JSON parsing is transparent (respects ON ERROR), which is
what you are saying, IIUC.
Actually, the implied cast is exactly the correct thing to do here - the
issue is that we aren't using the newly added soft errors infrastructure
[1]: d9f7f5d32f201bec61fef8104aafcb77cecb4dcb
error tells us to produce. This seems to be in the realm of doability so
we should try in the interest of being standard conforming. I'd even argue
to make this an open item unless and until the attempt is agreed upon to
have failed (or it succeeds of course).
David J.
[1]: d9f7f5d32f201bec61fef8104aafcb77cecb4dcb
On Fri, Jun 21, 2024 at 1:11 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Thu, Jun 20, 2024 at 2:19 AM Amit Langote <amitlangote09@gmail.com> wrote:
On Mon, Jun 17, 2024 at 10:07 PM Markus Winand <markus.winand@winand.at> wrote:
On 17.06.2024, at 08:20, Amit Langote <amitlangote09@gmail.com> wrote:
Agree that the documentation needs to be clear about this. I'll update
my patch at [1] to add a note next to table 9.16.3. SQL/JSON Query
Functions.Considering another branch of this thread [1] I think the
"Supported Features” appendix of the docs should mention that as well.The way I see it is that the standards defines two overloaded
JSON_QUERY functions, of which PostgreSQL will support only one.
In case of valid JSON, the implied CAST makes it look as though
the second variant of these function was supported as well but that
illusion totally falls apart once the JSON is not valid anymore.I think it affects the following feature IDs:
- T821, Basic SQL/JSON query operators
For JSON_VALUE, JSON_TABLE and JSON_EXISTS
- T828, JSON_QUERYAlso, how hard would it be to add the functions that accept
character strings? Is there, besides the effort, any thing else
against it? I’m asking because I believe once released it might
never be changed — for backward compatibility.Hmm, I'm starting to think that adding the implied cast to json wasn't
such a great idea after all, because it might mislead the users to
think that JSON parsing is transparent (respects ON ERROR), which is
what you are saying, IIUC.Actually, the implied cast is exactly the correct thing to do here - the issue is that we aren't using the newly added soft errors infrastructure [1] to catch the result of that cast and instead produce whatever output on error tells us to produce. This seems to be in the realm of doability so we should try in the interest of being standard conforming.
Soft error handling *was* used for catching cast errors in the very
early versions of this patch (long before I got involved and the
infrastructure you mention got added). It was taken out after Pavel
said [1]/messages/by-id/CAFj8pRCnzO2cnHi5ebXciV=tuGVvAQOW9uPU+DQV1GkL31R=-g@mail.gmail.com that he didn't like producing NULL instead of throwing an
error. Not sure if Pavel's around but it would be good to know why he
didn't like it at the time.
I can look into making that work again, but that is not going to make beta2.
I'd even argue to make this an open item unless and until the attempt is agreed upon to have failed (or it succeeds of course).
OK, adding an open item.
--
Thanks, Amit Langote
[1]: /messages/by-id/CAFj8pRCnzO2cnHi5ebXciV=tuGVvAQOW9uPU+DQV1GkL31R=-g@mail.gmail.com
On Thu, Jun 20, 2024 at 5:22 PM Amit Langote <amitlangote09@gmail.com>
wrote:
Soft error handling *was* used for catching cast errors in the very
early versions of this patch (long before I got involved and the
infrastructure you mention got added). It was taken out after Pavel
said [1] that he didn't like producing NULL instead of throwing an
error. Not sure if Pavel's around but it would be good to know why he
didn't like it at the time.
I'm personally in the "make it error" camp but "make it conform to the
standard" is a stronger membership (in general).
I see this note in your linked thread:
By the standard, it is implementation-defined whether JSON parsing errors
should be caught by ON ERROR clause.
Absent someone contradicting that claim I retract my position here and am
fine with failing if these "functions" are supplied with something that
cannot be cast to json. I'd document them like functions that accept json
with the implications that any casting to json happens before the function
is called and thus its arguments do not apply to that step.
Given that we have "expression IS JSON" the ability to hack together a case
expression to get non-erroring behavior exists.
David J.
On Fri, Jun 21, 2024 at 10:01 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Thu, Jun 20, 2024 at 5:22 PM Amit Langote <amitlangote09@gmail.com> wrote:
Soft error handling *was* used for catching cast errors in the very
early versions of this patch (long before I got involved and the
infrastructure you mention got added). It was taken out after Pavel
said [1] that he didn't like producing NULL instead of throwing an
error. Not sure if Pavel's around but it would be good to know why he
didn't like it at the time.I'm personally in the "make it error" camp but "make it conform to the standard" is a stronger membership (in general).
I see this note in your linked thread:
By the standard, it is implementation-defined whether JSON parsing errors
should be caught by ON ERROR clause.Absent someone contradicting that claim I retract my position here and am fine with failing if these "functions" are supplied with something that cannot be cast to json. I'd document them like functions that accept json with the implications that any casting to json happens before the function is called and thus its arguments do not apply to that step.
Thanks for that clarification.
So, there are the following options:
1. Disallow anything but jsonb for context_item (the patch I posted yesterday)
2. Continue allowing context_item to be non-json character or utf-8
encoded bytea strings, but document that any parsing errors do not
respect the ON ERROR clause.
3. Go ahead and fix implicit casts to jsonb so that any parsing errors
respect ON ERROR (no patch written yet).
David's vote seems to be 2, which is my inclination too. Markus' vote
seems to be either 1 or 3. Anyone else?
--
Thanks, Amit Langote
pá 21. 6. 2024 v 2:22 odesílatel Amit Langote <amitlangote09@gmail.com>
napsal:
On Fri, Jun 21, 2024 at 1:11 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:On Thu, Jun 20, 2024 at 2:19 AM Amit Langote <amitlangote09@gmail.com>
wrote:
On Mon, Jun 17, 2024 at 10:07 PM Markus Winand <markus.winand@winand.at>
wrote:
On 17.06.2024, at 08:20, Amit Langote <amitlangote09@gmail.com>
wrote:
Agree that the documentation needs to be clear about this. I'll
update
my patch at [1] to add a note next to table 9.16.3. SQL/JSON Query
Functions.Considering another branch of this thread [1] I think the
"Supported Features” appendix of the docs should mention that as well.The way I see it is that the standards defines two overloaded
JSON_QUERY functions, of which PostgreSQL will support only one.
In case of valid JSON, the implied CAST makes it look as though
the second variant of these function was supported as well but that
illusion totally falls apart once the JSON is not valid anymore.I think it affects the following feature IDs:
- T821, Basic SQL/JSON query operators
For JSON_VALUE, JSON_TABLE and JSON_EXISTS
- T828, JSON_QUERYAlso, how hard would it be to add the functions that accept
character strings? Is there, besides the effort, any thing else
against it? I’m asking because I believe once released it might
never be changed — for backward compatibility.Hmm, I'm starting to think that adding the implied cast to json wasn't
such a great idea after all, because it might mislead the users to
think that JSON parsing is transparent (respects ON ERROR), which is
what you are saying, IIUC.Actually, the implied cast is exactly the correct thing to do here - the
issue is that we aren't using the newly added soft errors infrastructure
[1] to catch the result of that cast and instead produce whatever output on
error tells us to produce. This seems to be in the realm of doability so
we should try in the interest of being standard conforming.Soft error handling *was* used for catching cast errors in the very
early versions of this patch (long before I got involved and the
infrastructure you mention got added). It was taken out after Pavel
said [1] that he didn't like producing NULL instead of throwing an
error. Not sure if Pavel's around but it would be good to know why he
didn't like it at the time.I can look into making that work again, but that is not going to make
beta2.I'd even argue to make this an open item unless and until the attempt
is agreed upon to have failed (or it succeeds of course).
OK, adding an open item.
At this time, when I wrote this mail, I didn't exactly notice the standard,
so broken format should be handled there too. In this time, there was no
support for soft errors ever in Postgres, so handling broken formats was
inconsistent.
Standard describes format errors, but exactly doesn't describe if this is
error like missing key or broken json format. Maybe wrongly, but
intuitively for me, these errors are of different kinds and broken input
data is a different case than missing key (but fully valid json). I didn't
find the exact sentence in standard when I searched it (but it was four
years ago).
My position in this case is not extra strong. The original patch was
written and tested to be compatible with Oracle (what is a strong argument
and feature). On second hand, some things required subtransactioning what
was wrong (soft errors were introduced later). The compatibility with
Oracle is a strong argument, but Oracle by itself is not fully compatible
with standard, and some cases are special (in Oracle) because empty string
in Oracle is NULL, and then it is handled differently. In this time I had
motivation to reduce the patch to "safe" minimum to be possible to accept
it by committers. The patch was written in 2017 (I think). Handling broken
format (input format) was one issue that I thought could be solved later.
The main reason for my mail is fact, so Postgres and Oracle have DIFFERENT
correct format of JSON!
'{a:10}' is correct on Oracle, but not correct on Postgres. And with
default ON ERROR NULL (what is default), then the Oracle returns 10, and
Postgres NULL. I thought this can be very messy and better to just raise an
exception.
Regards
Pavel
Show quoted text
--
Thanks, Amit Langote
[1]
/messages/by-id/CAFj8pRCnzO2cnHi5ebXciV=tuGVvAQOW9uPU+DQV1GkL31R=-g@mail.gmail.com
pá 21. 6. 2024 v 6:01 odesílatel Amit Langote <amitlangote09@gmail.com>
napsal:
On Fri, Jun 21, 2024 at 10:01 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:On Thu, Jun 20, 2024 at 5:22 PM Amit Langote <amitlangote09@gmail.com>
wrote:
Soft error handling *was* used for catching cast errors in the very
early versions of this patch (long before I got involved and the
infrastructure you mention got added). It was taken out after Pavel
said [1] that he didn't like producing NULL instead of throwing an
error. Not sure if Pavel's around but it would be good to know why he
didn't like it at the time.I'm personally in the "make it error" camp but "make it conform to the
standard" is a stronger membership (in general).
I see this note in your linked thread:
By the standard, it is implementation-defined whether JSON parsing
errors
should be caught by ON ERROR clause.
Absent someone contradicting that claim I retract my position here and
am fine with failing if these "functions" are supplied with something that
cannot be cast to json. I'd document them like functions that accept json
with the implications that any casting to json happens before the function
is called and thus its arguments do not apply to that step.Thanks for that clarification.
So, there are the following options:
1. Disallow anything but jsonb for context_item (the patch I posted
yesterday)2. Continue allowing context_item to be non-json character or utf-8
encoded bytea strings, but document that any parsing errors do not
respect the ON ERROR clause.3. Go ahead and fix implicit casts to jsonb so that any parsing errors
respect ON ERROR (no patch written yet).David's vote seems to be 2, which is my inclination too. Markus' vote
seems to be either 1 or 3. Anyone else?
@3 can be possibly messy (although be near Oracle or standard). I don't
think it is safe - one example '{a:10}' is valid for Oracle, but not for
Postgres, and using @3 impacts different results (better to raise an
exception).
The effect of @1 and @2 is similar - @1 is better so the user needs to
explicitly cast, so maybe it is cleaner, so the cast should not be handled,
@2 is more user friendly, because it accepts unknown string literal. From a
developer perspective I prefer @1, from a user perspective I prefer @2.
Maybe @2 is a good compromise.
Regards
Pavel
Show quoted text
--
Thanks, Amit Langote
On Thursday, June 20, 2024, Pavel Stehule <pavel.stehule@gmail.com> wrote:
pá 21. 6. 2024 v 6:01 odesílatel Amit Langote <amitlangote09@gmail.com>
napsal:On Fri, Jun 21, 2024 at 10:01 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:By the standard, it is implementation-defined whether JSON parsing
errors
should be caught by ON ERROR clause.
Absent someone contradicting that claim I retract my position here and
am fine with failing if these "functions" are supplied with something that
cannot be cast to json. I'd document them like functions that accept json
with the implications that any casting to json happens before the function
is called and thus its arguments do not apply to that step.Thanks for that clarification.
So, there are the following options:
1. Disallow anything but jsonb for context_item (the patch I posted
yesterday)2. Continue allowing context_item to be non-json character or utf-8
encoded bytea strings, but document that any parsing errors do not
respect the ON ERROR clause.3. Go ahead and fix implicit casts to jsonb so that any parsing errors
respect ON ERROR (no patch written yet).David's vote seems to be 2, which is my inclination too. Markus' vote
seems to be either 1 or 3. Anyone else?@3 can be possibly messy (although be near Oracle or standard). I don't
think it is safe - one example '{a:10}' is valid for Oracle, but not for
Postgres, and using @3 impacts different results (better to raise an
exception).The effect of @1 and @2 is similar - @1 is better so the user needs to
explicitly cast, so maybe it is cleaner, so the cast should not be handled,
@2 is more user friendly, because it accepts unknown string literal. From a
developer perspective I prefer @1, from a user perspective I prefer @2.
Maybe @2 is a good compromise.
2 also has the benefit of being standard conforming while 1 does not.
3 is also conforming and I wouldn’t object to it had we already done it
that way.
But since 2 is conforming too and implemented, and we are in beta, I'm
thinking we need to go with this option.
David J.
On 21.06.2024, at 03:00, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Thu, Jun 20, 2024 at 5:22 PM Amit Langote <amitlangote09@gmail.com> wrote:
Soft error handling *was* used for catching cast errors in the very
early versions of this patch (long before I got involved and the
infrastructure you mention got added). It was taken out after Pavel
said [1] that he didn't like producing NULL instead of throwing an
error. Not sure if Pavel's around but it would be good to know why he
didn't like it at the time.I'm personally in the "make it error" camp but "make it conform to the standard" is a stronger membership (in general).
I see this note in your linked thread:
By the standard, it is implementation-defined whether JSON parsing errors
should be caught by ON ERROR clause.Absent someone contradicting that claim I retract my position here and am fine with failing if these "functions" are supplied with something that cannot be cast to json. I'd document them like functions that accept json with the implications that any casting to json happens before the function is called and thus its arguments do not apply to that step.
That claim was also made in 2020, before the current (2023)
SQL standard was released — yet it might have been the same.
My understanding of the 2023 standard is that ON ERROR
covers invalid JSON because the conversion from a character
string to JSON is done deeply nested inside the JSON_QUERY &
Co functions.
9.47 Processing <JSON API common syntax> Function GR 3
triggers
9.46, “SQL/JSON path language: syntax and semantics”
Where GR 11 says:
————
GR 11) The result of evaluating a <JSON path wff> is a completion condition, and, if that completion condition is successful completion (00000), then an SQL/JSON sequence. For conciseness, the result will be stated either as an exception condition or as an SQL/JSON sequence (in the latter case, the completion condition successful completion (00000) is implicit). Unsuccessful completion conditions are not automatically raised and do not terminate application of the General Rules in this Subclause.
a) If <JSON path context variable> JPCV is specified, then
Case:
• i) If PARSED is True, then the result of evaluating JPCV is JT.
• ii) If the declared type of JT is JSON, then the result of evaluating JPCV is JT.
• iii) Otherwise:
• 1) The General Rules of Subclause 9.42, “Parsing JSON text”, are applied with JT as JSON TEXT, an implementation-defined (IV185) <JSON key uniqueness constraint> as UNIQUENESS CONSTRAINT, and FO as FORMAT OPTION; let ST be the STATUS and let CISJI be the SQL/JSON ITEM returned from the application of those General Rules.
• 2) Case:
• A) If ST is not successful completion (00000), then the result of evaluating JPCV is ST.
• B) Otherwise, the result of evaluating JPCV is CISJI.
————
In case of an exception, it is passed along to clause 9.44 Converting an SQL/JSON sequence to an SQL/JSON item where GR 5b ultimately says (the exception is in TEMPST in the meanwhile):
——
• b) If TEMPST is an exception condition, then Case:
i) If ONERROR is ERROR, then let OUTST be TEMPST.
ii) Otherwise, let OUTST be successful completion (00000). Case:
• 1) If ONERROR is NULL, then let JV be the null value.
• 2) If ONERROR is EMPTY ARRAY, then let JV be an SQL/JSON array that has no SQL/JSON elements.
• 3) If ONERROR is EMPTY OBJECT, then let JV be an SQL/JSON object that has no SQL/JSON members.
——
Let me know if I’m missing something here.
The whole idea that a cast is implied outside of JSON_QUERY & co
might be covered by a clause that generally allows implementations
to cast as they like (don’t have the ref at hand, but I think
such a clause is somewhere). On the other hand, the 2023 standard
doesn’t even cover an **explicit** cast from character strings to
JSON as per 6.13 SR 7 (that’ where the matrix of source- and
destination types is given for cast).
So my bet is this:
* I’m pretty sure JSON parsing errors being subject to ON ERROR
is conforming.
That’s also “backed” by the Oracle and Db2 (LUW) implementations.
* Implying a CAST might be ok, but I have doubts.
* I don’t see how failing without being subject to ON ERRROR
(as it is now in 17beta1) could possibly covered by the standard.
But as we all know: the standard is confusing. If somebody thinks
differently, references would be greatly appreciated.
-markus
On 21.06.2024, at 06:46, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Thursday, June 20, 2024, Pavel Stehule <pavel.stehule@gmail.com> wrote:
pá 21. 6. 2024 v 6:01 odesílatel Amit Langote <amitlangote09@gmail.com> napsal:
On Fri, Jun 21, 2024 at 10:01 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:By the standard, it is implementation-defined whether JSON parsing errors
should be caught by ON ERROR clause.Absent someone contradicting that claim I retract my position here and am fine with failing if these "functions" are supplied with something that cannot be cast to json. I'd document them like functions that accept json with the implications that any casting to json happens before the function is called and thus its arguments do not apply to that step.
Thanks for that clarification.
So, there are the following options:
1. Disallow anything but jsonb for context_item (the patch I posted yesterday)
2. Continue allowing context_item to be non-json character or utf-8
encoded bytea strings, but document that any parsing errors do not
respect the ON ERROR clause.3. Go ahead and fix implicit casts to jsonb so that any parsing errors
respect ON ERROR (no patch written yet).David's vote seems to be 2, which is my inclination too. Markus' vote
seems to be either 1 or 3. Anyone else?
With a very strong preference of 3.
@3 can be possibly messy (although be near Oracle or standard). I don't think it is safe - one example '{a:10}' is valid for Oracle, but not for Postgres, and using @3 impacts different results (better to raise an exception).
The question of what is valid JSON is a different question, I guess. My original report is about something that is invalid everywhere. Having that in line would be a start. Also I believe Oracle’s habit to accept unquoted object keys is not covered by the standard (unless defined as a JSON format and also explicitly using the corresponding FORMAT clause).
The effect of @1 and @2 is similar - @1 is better so the user needs to explicitly cast, so maybe it is cleaner, so the cast should not be handled, @2 is more user friendly, because it accepts unknown string literal. From a developer perspective I prefer @1, from a user perspective I prefer @2. Maybe @2 is a good compromise.
2 also has the benefit of being standard conforming while 1 does not.
Why do you think so? Do you have any references or is this just based on previous statements in this discussion?
-markus
On Thursday, June 20, 2024, Markus Winand <markus.winand@winand.at> wrote:
On 21.06.2024, at 06:46, David G. Johnston <david.g.johnston@gmail.com>
wrote:
2 also has the benefit of being standard conforming while 1 does not.
Why do you think so? Do you have any references or is this just based on
previous statements in this discussion?
Hearsay.
/messages/by-id/CAFj8pRCnzO2cnHi5ebXciV=tuGVvAQOW9uPU+DQV1GkL31R=-g@mail.gmail.com
4) If ALREADY PARSED is False, then it is implementation-defined whether
the
following rules are applied:
a) The General Rules of Subclause 9.36, "Parsing JSON text", are applied
with
JT as JSON TEXT, an implementation-defined <JSON key uniqueness
constraint>
as UNIQUENESS CONSTRAINT, and FO as FORMAT OPTION; let ST be the STATUS
and
let CISJI be the SQL/JSON ITEM returned from the application of those
General Rules.
b) If ST is not successful completion, then ST is returned as the STATUS
of
this application of these General Rules, and no further General Rules of
this Subclause are applied.
But maybe I’m mis-interpreting that snippet and Nikita’s related commentary
regarding have chosen between options for this implementation-defined
feature.
David j.
On 21.06.2024, at 07:38, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Thursday, June 20, 2024, Markus Winand <markus.winand@winand.at> wrote:
On 21.06.2024, at 06:46, David G. Johnston <david.g.johnston@gmail.com> wrote:
2 also has the benefit of being standard conforming while 1 does not.
Why do you think so? Do you have any references or is this just based on previous statements in this discussion?
Hearsay.
/messages/by-id/CAFj8pRCnzO2cnHi5ebXciV=tuGVvAQOW9uPU+DQV1GkL31R=-g@mail.gmail.com
4) If ALREADY PARSED is False, then it is implementation-defined whether the
following rules are applied:
a) The General Rules of Subclause 9.36, "Parsing JSON text", are applied with
JT as JSON TEXT, an implementation-defined <JSON key uniqueness constraint>
as UNIQUENESS CONSTRAINT, and FO as FORMAT OPTION; let ST be the STATUS and
let CISJI be the SQL/JSON ITEM returned from the application of those
General Rules.
b) If ST is not successful completion, then ST is returned as the STATUS of
this application of these General Rules, and no further General Rules of
this Subclause are applied.But maybe I’m mis-interpreting that snippet and Nikita’s related commentary regarding have chosen between options for this implementation-defined feature.
Ah, here we go. Nowadays this is called IA050, “Whether a JSON context item that is not of the JSON data type is parsed.” (Likewise IA054 “Whether a JSON parameter is parsed.”)
So updating the three options:
1. Disallow anything but jsonb for context_item (the patch I posted yesterday)
* Non-conforming
* patch available
2. Continue allowing context_item to be non-json character or utf-8
encoded bytea strings, but document that any parsing errors do not
respect the ON ERROR clause.
* Conforming by choosing IA050 to implement GR4: raise errors independent of the ON ERROR clause.
* currently committed.
3. Go ahead and fix implicit casts to jsonb so that any parsing errors
respect ON ERROR (no patch written yet).
* Conforming by choosing IA050 not to implement GR4: Parsing happens later, considering the ON ERROR clause.
* no patch available, not trivial
I guess I’m the only one in favour of 3 ;) My remaining arguments are that Oracle and Db2 (LUW) do it that way and also that it is IMHO what users would expect. However, as 2 is also conforming (how could I miss that?), proper documentation is a very tempting option.
-markus
ps: Does anyone know a dialect that implements GR4?
Hi,
Thanks all for chiming in.
On Fri, Jun 21, 2024 at 8:00 PM Markus Winand <markus.winand@winand.at> wrote:
So updating the three options:
1. Disallow anything but jsonb for context_item (the patch I posted yesterday)
* Non-conforming
* patch available2. Continue allowing context_item to be non-json character or utf-8
encoded bytea strings, but document that any parsing errors do not
respect the ON ERROR clause.* Conforming by choosing IA050 to implement GR4: raise errors independent of the ON ERROR clause.
* currently committed.3. Go ahead and fix implicit casts to jsonb so that any parsing errors
respect ON ERROR (no patch written yet).* Conforming by choosing IA050 not to implement GR4: Parsing happens later, considering the ON ERROR clause.
* no patch available, not trivialI guess I’m the only one in favour of 3 ;) My remaining arguments are that Oracle and Db2 (LUW) do it that way and also that it is IMHO what users would expect. However, as 2 is also conforming (how could I miss that?), proper documentation is a very tempting option.
So, we should go with 2 for v17, because while 3 may be very
appealing, there's a risk that it might not get done in the time
remaining for v17.
I'll post the documentation patch on Monday.
--
Thanks, Amit Langote
Hi,
On Sat, Jun 22, 2024 at 5:43 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Fri, Jun 21, 2024 at 8:00 PM Markus Winand <markus.winand@winand.at> wrote:
So updating the three options:
1. Disallow anything but jsonb for context_item (the patch I posted yesterday)
* Non-conforming
* patch available2. Continue allowing context_item to be non-json character or utf-8
encoded bytea strings, but document that any parsing errors do not
respect the ON ERROR clause.* Conforming by choosing IA050 to implement GR4: raise errors independent of the ON ERROR clause.
* currently committed.3. Go ahead and fix implicit casts to jsonb so that any parsing errors
respect ON ERROR (no patch written yet).* Conforming by choosing IA050 not to implement GR4: Parsing happens later, considering the ON ERROR clause.
* no patch available, not trivialI guess I’m the only one in favour of 3 ;) My remaining arguments are that Oracle and Db2 (LUW) do it that way and also that it is IMHO what users would expect. However, as 2 is also conforming (how could I miss that?), proper documentation is a very tempting option.
So, we should go with 2 for v17, because while 3 may be very
appealing, there's a risk that it might not get done in the time
remaining for v17.I'll post the documentation patch on Monday.
Here's that patch, which adds this note after table 9.16.3. SQL/JSON
Query Functions:
+ <note>
+ <para>
+ The <replaceable>context_item</replaceable> expression is converted to
+ <type>jsonb</type> by an implicit cast if the expression is not already of
+ type <type>jsonb</type>. Note, however, that any parsing errors that occur
+ during that conversion are thrown unconditionally, that is, are not
+ handled according to the (specified or implicit) <literal>ON
ERROR</literal>
+ clause.
+ </para>
+ </note>
Peter, sorry about the last-minute ask, but do you have any
thoughts/advice on conformance as discussed above?
--
Thanks, Amit Langote
Attachments:
v1-0001-SQL-JSON-Document-behavior-when-context_item-is-n.patchapplication/octet-stream; name=v1-0001-SQL-JSON-Document-behavior-when-context_item-is-n.patchDownload
From aca0c308013dbf434d54da7e68d20dd45601ac99 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlan@postgresql.org>
Date: Wed, 26 Jun 2024 17:46:45 +0900
Subject: [PATCH v1] SQL/JSON: Document behavior when context_item is not jsonb
The input document to functions JSON_EXISTS(), JSON_QUERY(),
JSON_VALUE(), and JSON_TABLE() can be specified as character or
UTF8-encoded bytea strings. These are automatically converted to jsonb
with an implicit cast before being passed to the jsonpath machinery.
In the current implementation, errors that occur when parsing the specified
string into a valid JSON document are thrown unconditionally. This means they
are not subject to the explicit or implicit ON ERROR clause of those functions.
This behavior is standard-conforming.
Reported-by: Markus Winand
Discussion: https://postgr.es/m/F7DD1442-265C-4220-A603-CB0DEB77E91D%40winand.at
---
doc/src/sgml/func.sgml | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 85ebd65892..532c1cc28c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18870,6 +18870,16 @@ DETAIL: Missing "]" after array dimensions.
</tbody>
</tgroup>
</table>
+ <note>
+ <para>
+ The <replaceable>context_item</replaceable> expression is converted to
+ <type>jsonb</type> by an implicit cast if the expression is not already of
+ type <type>jsonb</type>. Note, however, that any parsing errors that occur
+ during that conversion are thrown unconditionally, that is, are not
+ handled according to the (specified or implicit) <literal>ON ERROR</literal>
+ clause.
+ </para>
+ </note>
</sect2>
<sect2 id="functions-sqljson-table">
--
2.43.0
On Wed, Jun 26, 2024 at 9:10 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Sat, Jun 22, 2024 at 5:43 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Fri, Jun 21, 2024 at 8:00 PM Markus Winand <markus.winand@winand.at> wrote:
So updating the three options:
1. Disallow anything but jsonb for context_item (the patch I posted yesterday)
* Non-conforming
* patch available2. Continue allowing context_item to be non-json character or utf-8
encoded bytea strings, but document that any parsing errors do not
respect the ON ERROR clause.* Conforming by choosing IA050 to implement GR4: raise errors independent of the ON ERROR clause.
* currently committed.3. Go ahead and fix implicit casts to jsonb so that any parsing errors
respect ON ERROR (no patch written yet).* Conforming by choosing IA050 not to implement GR4: Parsing happens later, considering the ON ERROR clause.
* no patch available, not trivialI guess I’m the only one in favour of 3 ;) My remaining arguments are that Oracle and Db2 (LUW) do it that way and also that it is IMHO what users would expect. However, as 2 is also conforming (how could I miss that?), proper documentation is a very tempting option.
So, we should go with 2 for v17, because while 3 may be very
appealing, there's a risk that it might not get done in the time
remaining for v17.I'll post the documentation patch on Monday.
Here's that patch, which adds this note after table 9.16.3. SQL/JSON
Query Functions:+ <note> + <para> + The <replaceable>context_item</replaceable> expression is converted to + <type>jsonb</type> by an implicit cast if the expression is not already of + type <type>jsonb</type>. Note, however, that any parsing errors that occur + during that conversion are thrown unconditionally, that is, are not + handled according to the (specified or implicit) <literal>ON ERROR</literal> + clause. + </para> + </note>
I have pushed this.
--
Thanks, Amit Langote