SQL/JSON query functions context_item doc entry and type requirement

Started by jian heover 1 year ago15 messages
#1jian he
jian.universality@gmail.com

hi
based on gram.y and function transformJsonValueExpr.

gram.y:
| JSON_QUERY '('
json_value_expr ',' a_expr json_passing_clause_opt
json_returning_clause_opt
json_wrapper_behavior
json_quotes_clause_opt
json_behavior_clause_opt
')'

| JSON_EXISTS '('
json_value_expr ',' a_expr json_passing_clause_opt
json_on_error_clause_opt
')'

| JSON_VALUE '('
json_value_expr ',' a_expr json_passing_clause_opt
json_returning_clause_opt
json_behavior_clause_opt
')'

json_format_clause_opt contains:
| FORMAT_LA JSON
{
$$ = (Node *) makeJsonFormat(JS_FORMAT_JSON, JS_ENC_DEFAULT, @1);
}

That means, all the context_item can specify "FORMAT JSON" options,
in the meantime, do we need to update these functions
synopsis/signature in the doc?

some examples:
create table a(b jsonb);
create table a1(b int4range);
select json_value(b format json, 'strict $[*]' DEFAULT 9 ON ERROR) from a;
select json_value(b format json, 'strict $[*]' DEFAULT 9 ON ERROR) from a1;
select json_value(text '"1"' format json, 'strict $[*]' DEFAULT 9 ON ERROR);

------------------------------------------------
transformJsonValueExpr

/* Try to coerce to the target type. */
coerced = coerce_to_target_type(pstate, expr, exprtype,
targettype, -1,
COERCION_EXPLICIT,
COERCE_EXPLICIT_CAST,
location);

based on the function transformJsonValueExpr and subfunction
coerce_to_target_type,
for SQL/JSON query functions (JSON_EXISTS, JSON_QUERY, and JSON_VALUE)
the context_item requirement is any data type that not error out while
explicitly casting to jsonb in coerce_to_target_type.

I played around with it, I think these types can be used in context_item.
{char,text,bpchar,character varying } and these types of associated domains.
bytea data type too, but need specify "ENCODING UTF8".
e.g.
select json_value(bytea '"1"' format json ENCODING UTF8, 'strict $[*]'
DEFAULT 9 ON ERROR);

Maybe we can add some brief explanation in this para to explain more
about "context_item"
{
SQL/JSON functions JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE()
described in Table 9.52 can be used to query JSON documents. Each of
these functions apply a path_expression (the query) to a context_item
(the document); see Section 9.16.2 for more details on what
path_expression can contain.
}

#2Amit Langote
amitlangote09@gmail.com
In reply to: jian he (#1)
Re: SQL/JSON query functions context_item doc entry and type requirement

Hi,

On Tue, Jun 4, 2024 at 12:11 AM jian he <jian.universality@gmail.com> wrote:

hi
based on gram.y and function transformJsonValueExpr.

gram.y:
| JSON_QUERY '('
json_value_expr ',' a_expr json_passing_clause_opt
json_returning_clause_opt
json_wrapper_behavior
json_quotes_clause_opt
json_behavior_clause_opt
')'

| JSON_EXISTS '('
json_value_expr ',' a_expr json_passing_clause_opt
json_on_error_clause_opt
')'

| JSON_VALUE '('
json_value_expr ',' a_expr json_passing_clause_opt
json_returning_clause_opt
json_behavior_clause_opt
')'

json_format_clause_opt contains:
| FORMAT_LA JSON
{
$$ = (Node *) makeJsonFormat(JS_FORMAT_JSON, JS_ENC_DEFAULT, @1);
}

That means, all the context_item can specify "FORMAT JSON" options,
in the meantime, do we need to update these functions
synopsis/signature in the doc?

some examples:
create table a(b jsonb);
create table a1(b int4range);
select json_value(b format json, 'strict $[*]' DEFAULT 9 ON ERROR) from a;
select json_value(b format json, 'strict $[*]' DEFAULT 9 ON ERROR) from a1;
select json_value(text '"1"' format json, 'strict $[*]' DEFAULT 9 ON ERROR);

------------------------------------------------
transformJsonValueExpr

/* Try to coerce to the target type. */
coerced = coerce_to_target_type(pstate, expr, exprtype,
targettype, -1,
COERCION_EXPLICIT,
COERCE_EXPLICIT_CAST,
location);

based on the function transformJsonValueExpr and subfunction
coerce_to_target_type,
for SQL/JSON query functions (JSON_EXISTS, JSON_QUERY, and JSON_VALUE)
the context_item requirement is any data type that not error out while
explicitly casting to jsonb in coerce_to_target_type.

I played around with it, I think these types can be used in context_item.
{char,text,bpchar,character varying } and these types of associated domains.
bytea data type too, but need specify "ENCODING UTF8".
e.g.
select json_value(bytea '"1"' format json ENCODING UTF8, 'strict $[*]'
DEFAULT 9 ON ERROR);

Maybe we can add some brief explanation in this para to explain more
about "context_item"
{
SQL/JSON functions JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE()
described in Table 9.52 can be used to query JSON documents. Each of
these functions apply a path_expression (the query) to a context_item
(the document); see Section 9.16.2 for more details on what
path_expression can contain.
}

If I understand correctly, you're suggesting that we add a line to the
above paragraph to mention which types are appropriate for
context_item. How about we add the following:

<replaceable>context_item</replaceable> expression can be a value of
any type that can be cast to <type>jsonb</type>. This includes types
such as <type>char</type>, <type>text</type>, <type>bpchar</type>,
<type>character varying</type>, and <type>bytea</type> (with
<code>ENCODING UTF8</code>), as well as any domains over these types.

--
Thanks, Amit Langote

#3jian he
jian.universality@gmail.com
In reply to: Amit Langote (#2)
1 attachment(s)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Mon, Jun 17, 2024 at 2:43 PM Amit Langote <amitlangote09@gmail.com> wrote:

Hi,

On Tue, Jun 4, 2024 at 12:11 AM jian he <jian.universality@gmail.com> wrote:

hi
based on gram.y and function transformJsonValueExpr.

gram.y:
| JSON_QUERY '('
json_value_expr ',' a_expr json_passing_clause_opt
json_returning_clause_opt
json_wrapper_behavior
json_quotes_clause_opt
json_behavior_clause_opt
')'

| JSON_EXISTS '('
json_value_expr ',' a_expr json_passing_clause_opt
json_on_error_clause_opt
')'

| JSON_VALUE '('
json_value_expr ',' a_expr json_passing_clause_opt
json_returning_clause_opt
json_behavior_clause_opt
')'

json_format_clause_opt contains:
| FORMAT_LA JSON
{
$$ = (Node *) makeJsonFormat(JS_FORMAT_JSON, JS_ENC_DEFAULT, @1);
}

That means, all the context_item can specify "FORMAT JSON" options,
in the meantime, do we need to update these functions
synopsis/signature in the doc?

If I understand correctly, you're suggesting that we add a line to the
above paragraph to mention which types are appropriate for
context_item. How about we add the following:

<replaceable>context_item</replaceable> expression can be a value of
any type that can be cast to <type>jsonb</type>. This includes types
such as <type>char</type>, <type>text</type>, <type>bpchar</type>,
<type>character varying</type>, and <type>bytea</type> (with
<code>ENCODING UTF8</code>), as well as any domains over these types.

your wording looks ok to me. I want to add two sentences. so it becomes:

+   The <replaceable>context_item</replaceable> expression can be a value of
+    any type that can be cast to <type>jsonb</type>. This includes types
+   such as <type>char</type>,  <type>text</type>, <type>bpchar</type>,
+    <type>character varying</type>, and <type>bytea</type> (with
+    <code>ENCODING UTF8</code>), as well as any domains over these types.
+    The <replaceable>context_item</replaceable> expression can also
be followed with
+    <literal>FORMAT JSON</literal>, <literal>ENCODING UTF8</literal>.
+    These two options currently don't have actual meaning.
+    <literal>ENCODING UTF8</literal> can only be specified when
<replaceable>context_item</replaceable> type is <type>bytea</type>.

imho, "These two options currently don't have actual meaning." is accurate,
but still does not explain why we allow "FORMAT JSON ENCODING UTF8".
I think we may need an explanation for "FORMAT JSON ENCODING UTF8".
because json_array, json_object, json_serialize, json all didn't
mention the meaning of "[ FORMAT JSON [ ENCODING UTF8 ] ] ".

I added "[ FORMAT JSON [ ENCODING UTF8 ] ] " to the function
signature/synopsis of json_exists, json_query, json_value.

Attachments:

v1-0001-minor-SQL-JSON-Query-Functions-doc-fix.patchtext/x-patch; charset=US-ASCII; name=v1-0001-minor-SQL-JSON-Query-Functions-doc-fix.patchDownload
From c5e2b51d351c3987e96290363288499f1a369e49 Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Mon, 17 Jun 2024 17:35:01 +0800
Subject: [PATCH v1 1/1] minor SQL/JSON Query Functions doc fix

main fix doc context_item description.
---
 doc/src/sgml/func.sgml | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 17c44bc3..18db4a5b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18663,7 +18663,17 @@ $.* ? (@ like_regex "^\\d+$")
    described in <xref linkend="functions-sqljson-querying"/> can be used
    to query JSON documents.  Each of these functions apply a
    <replaceable>path_expression</replaceable> (the query) to a
-   <replaceable>context_item</replaceable> (the document); see
+   <replaceable>context_item</replaceable> (the document).
+   The <replaceable>context_item</replaceable> expression can be a value of
+    any type that can be cast to <type>jsonb</type>. This includes types
+   such as <type>char</type>,  <type>text</type>, <type>bpchar</type>,
+    <type>character varying</type>, and <type>bytea</type> (with
+    <code>ENCODING UTF8</code>), as well as any domains over these types.
+    The <replaceable>context_item</replaceable> expression can also be followed with
+    <literal>FORMAT JSON</literal>, <literal>ENCODING UTF8</literal>.
+    These two options currently don't have actual meaning.
+    <literal>ENCODING UTF8</literal> can only be specified when <replaceable>context_item</replaceable> type is <type>bytea</type>.
+   See
    <xref linkend="functions-sqljson-path"/> for more details on what
    <replaceable>path_expression</replaceable> can contain.
   </para>
@@ -18689,7 +18699,8 @@ $.* ? (@ like_regex "^\\d+$")
       <entry role="func_table_entry"><para role="func_signature">
         <indexterm><primary>json_exists</primary></indexterm>
         <function>json_exists</function> (
-        <replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable> <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
+        <replaceable>context_item</replaceable> <optional> <literal>FORMAT JSON</literal> <optional> <literal>ENCODING UTF8</literal> </optional> </optional>,
+        <replaceable>path_expression</replaceable> <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
         <optional> { <literal>TRUE</literal> | <literal>FALSE</literal> |<literal> UNKNOWN</literal> | <literal>ERROR</literal> } <literal>ON ERROR</literal> </optional>)
        </para>
        <para>
@@ -18729,7 +18740,8 @@ ERROR:  jsonpath array subscript is out of bounds
       <entry role="func_table_entry"><para role="func_signature">
         <indexterm><primary>json_query</primary></indexterm>
         <function>json_query</function> (
-        <replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable> <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
+        <replaceable>context_item</replaceable> <optional> <literal>FORMAT JSON</literal> <optional> <literal>ENCODING UTF8</literal> </optional> </optional>,
+        <replaceable>path_expression</replaceable> <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
         <optional> <literal>RETURNING</literal> <replaceable>data_type</replaceable> <optional> <literal>FORMAT JSON</literal> <optional> <literal>ENCODING UTF8</literal> </optional> </optional> </optional>
         <optional> { <literal>WITHOUT</literal> | <literal>WITH</literal> { <literal>CONDITIONAL</literal> | <optional><literal>UNCONDITIONAL</literal></optional> } } <optional> <literal>ARRAY</literal> </optional> <literal>WRAPPER</literal> </optional>
         <optional> { <literal>KEEP</literal> | <literal>OMIT</literal> } <literal>QUOTES</literal> <optional> <literal>ON SCALAR STRING</literal> </optional> </optional>
@@ -18806,7 +18818,8 @@ DETAIL:  Missing "]" after array dimensions.
       <entry role="func_table_entry"><para role="func_signature">
         <indexterm><primary>json_value</primary></indexterm>
         <function>json_value</function> (
-        <replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable>
+        <replaceable>context_item</replaceable> <optional> <literal>FORMAT JSON</literal> <optional> <literal>ENCODING UTF8</literal> </optional> </optional>,
+        <replaceable>path_expression</replaceable>
         <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
         <optional> <literal>RETURNING</literal> <replaceable>data_type</replaceable> </optional>
         <optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON EMPTY</literal> </optional>
-- 
2.34.1

#4Chapman Flack
jcflack@acm.org
In reply to: Amit Langote (#2)
Re: SQL/JSON query functions context_item doc entry and type requirement

Hi,

On 06/17/24 02:43, Amit Langote wrote:

<replaceable>context_item</replaceable> expression can be a value of
any type that can be cast to <type>jsonb</type>. This includes types
such as <type>char</type>, <type>text</type>, <type>bpchar</type>,
<type>character varying</type>, and <type>bytea</type> (with
<code>ENCODING UTF8</code>), as well as any domains over these types.

Reading this message in conjunction with [0]/messages/by-id/CA+HiwqGuqLfAEP-FwW3QHByfQOoUpyj6YZG6R6bScpQswvNYDA@mail.gmail.com makes me think that we are
really talking about a function that takes a first parameter of type jsonb,
and behaves exactly that way (so any cast required is applied by the system
ahead of the call). Under those conditions, this seems like an unusual
sentence to add in the docs, at least until we have also documented that
tan's argument can be of any type that can be cast to double precision.

On the other hand, if the behavior of the functions were to be changed
(perhaps using prosupport rewriting as suggested in [1]/messages/by-id/66703054.6040109@acm.org?) so that it was
not purely describable as a function accepting exactly jsonb with a
possible system-applied cast in front, then in that case such an added
explanation in the docs might be very fitting.

Regards,
-Chap

[0]: /messages/by-id/CA+HiwqGuqLfAEP-FwW3QHByfQOoUpyj6YZG6R6bScpQswvNYDA@mail.gmail.com
/messages/by-id/CA+HiwqGuqLfAEP-FwW3QHByfQOoUpyj6YZG6R6bScpQswvNYDA@mail.gmail.com
[1]: /messages/by-id/66703054.6040109@acm.org

#5jian he
jian.universality@gmail.com
In reply to: Chapman Flack (#4)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Mon, Jun 17, 2024 at 9:05 PM Chapman Flack <jcflack@acm.org> wrote:

Hi,

On 06/17/24 02:43, Amit Langote wrote:

<replaceable>context_item</replaceable> expression can be a value of
any type that can be cast to <type>jsonb</type>. This includes types
such as <type>char</type>, <type>text</type>, <type>bpchar</type>,
<type>character varying</type>, and <type>bytea</type> (with
<code>ENCODING UTF8</code>), as well as any domains over these types.

Reading this message in conjunction with [0] makes me think that we are
really talking about a function that takes a first parameter of type jsonb,
and behaves exactly that way (so any cast required is applied by the system
ahead of the call). Under those conditions, this seems like an unusual
sentence to add in the docs, at least until we have also documented that
tan's argument can be of any type that can be cast to double precision.

I guess it would be fine to add an unusual sentence to the docs.

imagine a function: array_avg(anyarray) returns anyelement.
array_avg calculate an array's elements's avg. like
array('{1,2,3}'::int[]) returns 2.
but array_avg won't make sense if the input argument is a date array.
so mentioning in the doc: array_avg can accept anyarray, but anyarray
cannot date array.
seems ok.

On the other hand, if the behavior of the functions were to be changed
(perhaps using prosupport rewriting as suggested in [1]?) so that it was
not purely describable as a function accepting exactly jsonb with a
possible system-applied cast in front, then in that case such an added
explanation in the docs might be very fitting.

prosupport won't work, I think.
because json_exists, json_value, json_query, json_table don't have
pg_proc entries.
These are more like expressions.

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: jian he (#5)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Wed, Jun 19, 2024 at 8:29 AM jian he <jian.universality@gmail.com> wrote:

On Mon, Jun 17, 2024 at 9:05 PM Chapman Flack <jcflack@acm.org> wrote:

Hi,

On 06/17/24 02:43, Amit Langote wrote:

<replaceable>context_item</replaceable> expression can be a value of
any type that can be cast to <type>jsonb</type>. This includes types
such as <type>char</type>, <type>text</type>, <type>bpchar</type>,
<type>character varying</type>, and <type>bytea</type> (with
<code>ENCODING UTF8</code>), as well as any domains over these types.

Reading this message in conjunction with [0] makes me think that we are
really talking about a function that takes a first parameter of type

jsonb,

and behaves exactly that way (so any cast required is applied by the

system

ahead of the call). Under those conditions, this seems like an unusual
sentence to add in the docs, at least until we have also documented that
tan's argument can be of any type that can be cast to double precision.

I guess it would be fine to add an unusual sentence to the docs.

imagine a function: array_avg(anyarray) returns anyelement.
array_avg calculate an array's elements's avg. like
array('{1,2,3}'::int[]) returns 2.
but array_avg won't make sense if the input argument is a date array.
so mentioning in the doc: array_avg can accept anyarray, but anyarray
cannot date array.
seems ok.

There is existing wording for this:

"The expression can be of any JSON type, any character string type, or
bytea in UTF8 encoding."

If you add this sentence to the paragraph the link that already exists,
which simply points the reader to this sentence, becomes redundant and
should be removed.

As for table 9.16.3 - it is unwieldy already. Lets try and make the core
syntax shorter, not longer. We already have precedence in the subsequent
json_table section - give each major clause item a name then below the
table define the syntax and meaning for those names. Unlike in that
section - which probably should be modified too - context_item should have
its own description line.

David J.

#7Amit Langote
amitlangote09@gmail.com
In reply to: David G. Johnston (#6)
1 attachment(s)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Thu, Jun 20, 2024 at 1:03 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Wed, Jun 19, 2024 at 8:29 AM jian he <jian.universality@gmail.com> wrote:

On Mon, Jun 17, 2024 at 9:05 PM Chapman Flack <jcflack@acm.org> wrote:

Hi,

On 06/17/24 02:43, Amit Langote wrote:

<replaceable>context_item</replaceable> expression can be a value of
any type that can be cast to <type>jsonb</type>. This includes types
such as <type>char</type>, <type>text</type>, <type>bpchar</type>,
<type>character varying</type>, and <type>bytea</type> (with
<code>ENCODING UTF8</code>), as well as any domains over these types.

Reading this message in conjunction with [0] makes me think that we are
really talking about a function that takes a first parameter of type jsonb,
and behaves exactly that way (so any cast required is applied by the system
ahead of the call). Under those conditions, this seems like an unusual
sentence to add in the docs, at least until we have also documented that
tan's argument can be of any type that can be cast to double precision.

I guess it would be fine to add an unusual sentence to the docs.

imagine a function: array_avg(anyarray) returns anyelement.
array_avg calculate an array's elements's avg. like
array('{1,2,3}'::int[]) returns 2.
but array_avg won't make sense if the input argument is a date array.
so mentioning in the doc: array_avg can accept anyarray, but anyarray
cannot date array.
seems ok.

There is existing wording for this:

"The expression can be of any JSON type, any character string type, or bytea in UTF8 encoding."

If you add this sentence to the paragraph the link that already exists, which simply points the reader to this sentence, becomes redundant and should be removed.

I've just posted a patch in the other thread [1]/messages/by-id/CA+HiwqF2Z6FATWQV6bG9NeKYf=++fOgmdbYc9gWSNJ81jfqCuA@mail.gmail.com to restrict
context_item to be of jsonb type, which users would need to ensure by
adding an explicit cast if needed. I think that makes this
clarification unnecessary.

As for table 9.16.3 - it is unwieldy already. Lets try and make the core syntax shorter, not longer. We already have precedence in the subsequent json_table section - give each major clause item a name then below the table define the syntax and meaning for those names. Unlike in that section - which probably should be modified too - context_item should have its own description line.

I had posted a patch a little while ago at [1]/messages/by-id/CA+HiwqF2Z6FATWQV6bG9NeKYf=++fOgmdbYc9gWSNJ81jfqCuA@mail.gmail.com to render the syntax a
bit differently with each function getting its own syntax synopsis.
Resending it here; have addressed Jian He's comments.

--
Thanks, Amit Langote
[1]: /messages/by-id/CA+HiwqF2Z6FATWQV6bG9NeKYf=++fOgmdbYc9gWSNJ81jfqCuA@mail.gmail.com

Attachments:

v1-0001-SQL-JSON-Various-fixes-to-SQL-JSON-query-function.patchapplication/octet-stream; name=v1-0001-SQL-JSON-Various-fixes-to-SQL-JSON-query-function.patchDownload
From 4387c338bfe446b9a894f38b79fb82e816960c39 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlan@postgresql.org>
Date: Mon, 20 May 2024 20:04:49 +0900
Subject: [PATCH v1] SQL/JSON: Various fixes to SQL/JSON query function docs

1. Remove the keyword SELECT from the examples to be consistent
with the examples of other JSON-related functions listed on the
same page.

2. Add <synopsis> tags around the function's syntax definition.

3. Capitalize function names in the syntax synopsis and the examples.

4. Remove unnecessary <para> tags from the functions' description
text to combine it into one paragraph.

While at it, rewrite the sentence about the RETURNING clause in
JSON_QUERY() and JSON_VALUE()'s description to clarify that coercion
may occur if needed.

Suggested-by: Thom Brown <thom@linux.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAA-aLv7Dfy9BMrhUZ1skcg=OdqysWKzObS7XiDXdotJNF0E44Q@mail.gmail.com
---
 doc/src/sgml/func.sgml | 81 +++++++++++++++++++++++-------------------
 1 file changed, 45 insertions(+), 36 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2609269610..f9456ee880 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18691,9 +18691,12 @@ $.* ? (@ like_regex "^\\d+$")
      <row>
       <entry role="func_table_entry"><para role="func_signature">
         <indexterm><primary>json_exists</primary></indexterm>
-        <function>json_exists</function> (
-        <replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable> <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
-        <optional> { <literal>TRUE</literal> | <literal>FALSE</literal> |<literal> UNKNOWN</literal> | <literal>ERROR</literal> } <literal>ON ERROR</literal> </optional>)
+<synopsis>
+<function>JSON_EXISTS</function> (
+<replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable>
+<optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
+<optional>{ <literal>TRUE</literal> | <literal>FALSE</literal> |<literal> UNKNOWN</literal> | <literal>ERROR</literal> } <literal>ON ERROR</literal> </optional>) <returnvalue>boolean</returnvalue>
+</synopsis>
        </para>
        <para>
         Returns true if the SQL/JSON <replaceable>path_expression</replaceable>
@@ -18713,15 +18716,15 @@ $.* ? (@ like_regex "^\\d+$")
         Examples:
        </para>
        <para>
-        <literal>select json_exists(jsonb '{"key1": [1,2,3]}', 'strict $.key1[*] ? (@ > 2)')</literal>
+        <literal>JSON_EXISTS(jsonb '{"key1": [1,2,3]}', 'strict $.key1[*] ? (@ > 2)')</literal>
         <returnvalue>t</returnvalue>
        </para>
        <para>
-        <literal>select json_exists(jsonb '{"a": [1,2,3]}', 'lax $.a[5]' ERROR ON ERROR)</literal>
+        <literal>JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'lax $.a[5]' ERROR ON ERROR)</literal>
         <returnvalue>f</returnvalue>
        </para>
        <para>
-        <literal>select json_exists(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' ERROR ON ERROR)</literal>
+        <literal>JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' ERROR ON ERROR)</literal>
         <returnvalue></returnvalue>
 <programlisting>
 ERROR:  jsonpath array subscript is out of bounds
@@ -18731,13 +18734,16 @@ ERROR:  jsonpath array subscript is out of bounds
      <row>
       <entry role="func_table_entry"><para role="func_signature">
         <indexterm><primary>json_query</primary></indexterm>
-        <function>json_query</function> (
-        <replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable> <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
-        <optional> <literal>RETURNING</literal> <replaceable>data_type</replaceable> <optional> <literal>FORMAT JSON</literal> <optional> <literal>ENCODING UTF8</literal> </optional> </optional> </optional>
-        <optional> { <literal>WITHOUT</literal> | <literal>WITH</literal> { <literal>CONDITIONAL</literal> | <optional><literal>UNCONDITIONAL</literal></optional> } } <optional> <literal>ARRAY</literal> </optional> <literal>WRAPPER</literal> </optional>
-        <optional> { <literal>KEEP</literal> | <literal>OMIT</literal> } <literal>QUOTES</literal> <optional> <literal>ON SCALAR STRING</literal> </optional> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>EMPTY</literal> { <optional> <literal>ARRAY</literal> </optional> | <literal>OBJECT</literal> } | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON EMPTY</literal> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>EMPTY</literal> { <optional> <literal>ARRAY</literal> </optional> | <literal>OBJECT</literal> } | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON ERROR</literal> </optional>)
+<synopsis>
+<function>JSON_QUERY</function> (
+<replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable>
+<optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
+<optional> <literal>RETURNING</literal> <replaceable>data_type</replaceable> <optional> <literal>FORMAT JSON</literal> <optional> <literal>ENCODING UTF8</literal> </optional> </optional> </optional>
+<optional> { <literal>WITHOUT</literal> | <literal>WITH</literal> { <literal>CONDITIONAL</literal> | <optional><literal>UNCONDITIONAL</literal></optional> } } <optional> <literal>ARRAY</literal> </optional> <literal>WRAPPER</literal> </optional>
+<optional> { <literal>KEEP</literal> | <literal>OMIT</literal> } <literal>QUOTES</literal> <optional> <literal>ON SCALAR STRING</literal> </optional> </optional>
+<optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>EMPTY</literal> { <optional> <literal>ARRAY</literal> </optional> | <literal>OBJECT</literal> } | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON EMPTY</literal> </optional>
+<optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>EMPTY</literal> { <optional> <literal>ARRAY</literal> </optional> | <literal>OBJECT</literal> } | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON ERROR</literal> </optional>) <returnvalue>jsonb</returnvalue>
+</synopsis>
       </para>
        <para>
         Returns the result of applying the SQL/JSON
@@ -18746,6 +18752,7 @@ ERROR:  jsonpath array subscript is out of bounds
         <literal>PASSING</literal> <replaceable>value</replaceable>s.
        </para>
        <para>
+        Returns the result of applying the SQL/JSON
         If the path expression returns multiple SQL/JSON items, it might be
         necessary to wrap the result using the <literal>WITH WRAPPER</literal>
         clause to make it a valid JSON string.  If the wrapper is
@@ -18765,8 +18772,8 @@ ERROR:  jsonpath array subscript is out of bounds
        </para>
        <para>
         The <literal>RETURNING</literal> clause can be used to specify the
-        <replaceable>data_type</replaceable> of the result value.  By default,
-        the returned value will be of type <type>jsonb</type>.
+        <replaceable>data_type</replaceable> to coerce the result value to,
+        which is of type <type>jsonb</type> by default.
        </para>
        <para>
         The <literal>ON EMPTY</literal> clause specifies the behavior if
@@ -18777,9 +18784,9 @@ ERROR:  jsonpath array subscript is out of bounds
        <para>
         The <literal>ON ERROR</literal> clause specifies the
         behavior if an error occurs when evaluating
-        <replaceable>path_expression</replaceable>, including the operation to
-        coerce the result value to the output type, or during the execution of
-        <literal>ON EMPTY</literal> behavior (that is caused by empty result
+        <replaceable>path_expression</replaceable>, when coercing the result
+        value to the <literal>RETURNING</literal> type, or during the execution
+        of <literal>ON EMPTY</literal> behavior (that is caused by empty result
         of <replaceable>path_expression</replaceable> evaluation).  The default
         when <literal>ON ERROR</literal> is not specified is to return a null
         value.
@@ -18788,15 +18795,15 @@ ERROR:  jsonpath array subscript is out of bounds
         Examples:
        </para>
        <para>
-        <literal>select json_query(jsonb '[1,[2,3],null]', 'lax $[*][1]' WITH CONDITIONAL WRAPPER)</literal>
+        <literal>JSON_QUERY(jsonb '[1,[2,3],null]', 'lax $[*][1]' WITH CONDITIONAL WRAPPER)</literal>
         <returnvalue>[3]</returnvalue>
        </para>
        <para>
-        <literal>select json_query(jsonb '{"a": "[1, 2]"}', 'lax $.a' OMIT QUOTES);</literal>
+        <literal>JSON_QUERY(jsonb '{"a": "[1, 2]"}', 'lax $.a' OMIT QUOTES);</literal>
         <returnvalue>[1, 2]</returnvalue>
        </para>
        <para>
-        <literal>select json_query(jsonb '{"a": "[1, 2]"}', 'lax $.a' RETURNING int[] OMIT QUOTES ERROR ON ERROR);</literal>
+        <literal>JSON_QUERY(jsonb '{"a": "[1, 2]"}', 'lax $.a' RETURNING int[] OMIT QUOTES ERROR ON ERROR);</literal>
         <returnvalue></returnvalue>
 <programlisting>
 ERROR:  malformed array literal: "[1, 2]"
@@ -18808,12 +18815,14 @@ DETAIL:  Missing "]" after array dimensions.
      <row>
       <entry role="func_table_entry"><para role="func_signature">
         <indexterm><primary>json_value</primary></indexterm>
-        <function>json_value</function> (
-        <replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable>
-        <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
-        <optional> <literal>RETURNING</literal> <replaceable>data_type</replaceable> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON EMPTY</literal> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON ERROR</literal> </optional>)
+<synopsis>
+<function>JSON_VALUE</function> (
+<replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable>
+<optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
+<optional> <literal>RETURNING</literal> <replaceable>data_type</replaceable> </optional>
+<optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON EMPTY</literal> </optional>
+<optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON ERROR</literal> </optional>) <returnvalue>text</returnvalue>
+</synopsis>
        </para>
        <para>
         Returns the result of applying the SQL/JSON
@@ -18825,36 +18834,36 @@ DETAIL:  Missing "]" after array dimensions.
         The extracted value must be a single <acronym>SQL/JSON</acronym>
         scalar item; an error is thrown if that's not the case.  If you expect
         that extracted value might be an object or an array, use the
-        <function>json_query</function> function instead.
+        <function>JSON_QUERY</function> function instead.
        </para>
        <para>
         The <literal>RETURNING</literal> clause can be used to specify the
-        <replaceable>data_type</replaceable> of the result value. By default,
-        the returned value will be of type <type>text</type>.
+        <replaceable>data_type</replaceable> to coerce the result value to,
+        which is of type <type>text</type> by default.
        </para>
        <para>
         The <literal>ON ERROR</literal> and <literal>ON EMPTY</literal>
         clauses have similar semantics as mentioned in the description of
-        <function>json_query</function>.
+        <function>JSON_QUERY</function>.
        </para>
        <para>
-        Note that scalar strings returned by <function>json_value</function>
+        Note that scalar strings returned by <function>JSON_VALUE</function>
         always have their quotes removed, equivalent to specifying
-        <literal>OMIT QUOTES</literal> in <function>json_query</function>.
+        <literal>OMIT QUOTES</literal> in <function>JSON_QUERY</function>.
        </para>
        <para>
         Examples:
        </para>
        <para>
-        <literal>select json_value(jsonb '"123.45"', '$' RETURNING float)</literal>
+        <literal>JSON_VALUE(jsonb '"123.45"', '$' RETURNING float)</literal>
         <returnvalue>123.45</returnvalue>
        </para>
        <para>
-        <literal>select json_value(jsonb '"03:04 2015-02-01"', '$.datetime("HH24:MI&nbsp;YYYY-MM-DD")' RETURNING date)</literal>
+        <literal>JSON_VALUE(jsonb '"03:04 2015-02-01"', '$.datetime("HH24:MI&nbsp;YYYY-MM-DD")' RETURNING date)</literal>
         <returnvalue>2015-02-01</returnvalue>
        </para>
        <para>
-        <literal>select json_value(jsonb '[1,2]', 'strict $[*]' DEFAULT 9 ON ERROR)</literal>
+        <literal>JSON_VALUE(jsonb '[1,2]', 'strict $[*]' DEFAULT 9 ON ERROR)</literal>
         <returnvalue>9</returnvalue>
       </para></entry>
      </row>
-- 
2.43.0

#8jian he
jian.universality@gmail.com
In reply to: Amit Langote (#7)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Thu, Jun 20, 2024 at 5:46 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Thu, Jun 20, 2024 at 1:03 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Wed, Jun 19, 2024 at 8:29 AM jian he <jian.universality@gmail.com> wrote:

On Mon, Jun 17, 2024 at 9:05 PM Chapman Flack <jcflack@acm.org> wrote:

Hi,

On 06/17/24 02:43, Amit Langote wrote:

<replaceable>context_item</replaceable> expression can be a value of
any type that can be cast to <type>jsonb</type>. This includes types
such as <type>char</type>, <type>text</type>, <type>bpchar</type>,
<type>character varying</type>, and <type>bytea</type> (with
<code>ENCODING UTF8</code>), as well as any domains over these types.

Reading this message in conjunction with [0] makes me think that we are
really talking about a function that takes a first parameter of type jsonb,
and behaves exactly that way (so any cast required is applied by the system
ahead of the call). Under those conditions, this seems like an unusual
sentence to add in the docs, at least until we have also documented that
tan's argument can be of any type that can be cast to double precision.

I guess it would be fine to add an unusual sentence to the docs.

imagine a function: array_avg(anyarray) returns anyelement.
array_avg calculate an array's elements's avg. like
array('{1,2,3}'::int[]) returns 2.
but array_avg won't make sense if the input argument is a date array.
so mentioning in the doc: array_avg can accept anyarray, but anyarray
cannot date array.
seems ok.

There is existing wording for this:

"The expression can be of any JSON type, any character string type, or bytea in UTF8 encoding."

If you add this sentence to the paragraph the link that already exists, which simply points the reader to this sentence, becomes redundant and should be removed.

I've just posted a patch in the other thread [1] to restrict
context_item to be of jsonb type, which users would need to ensure by
adding an explicit cast if needed. I think that makes this
clarification unnecessary.

As for table 9.16.3 - it is unwieldy already. Lets try and make the core syntax shorter, not longer. We already have precedence in the subsequent json_table section - give each major clause item a name then below the table define the syntax and meaning for those names. Unlike in that section - which probably should be modified too - context_item should have its own description line.

I had posted a patch a little while ago at [1] to render the syntax a
bit differently with each function getting its own syntax synopsis.
Resending it here; have addressed Jian He's comments.

--

@@ -18746,6 +18752,7 @@ ERROR:  jsonpath array subscript is out of bounds
         <literal>PASSING</literal> <replaceable>value</replaceable>s.
        </para>
        <para>
+        Returns the result of applying the SQL/JSON
         If the path expression returns multiple SQL/JSON items, it might be
         necessary to wrap the result using the <literal>WITH WRAPPER</literal>
         clause to make it a valid JSON string.  If the wrapper is

+ Returns the result of applying the SQL/JSON
is redundant?

playing around with it.
found some minor issues:

json_exists allow: DEFAULT expression ON ERROR, which is not
mentioned in the doc.
for example:
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default
true ON ERROR);
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 0 ON ERROR);
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 11 ON ERROR);

JSON_VALUE on error, on empty semantics should be the same as json_query.
like:
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON ERROR ])

examples:
select JSON_value(jsonb '[]' , '$' empty array on error);
select JSON_value(jsonb '[]' , '$' empty object on error);

#9David G. Johnston
david.g.johnston@gmail.com
In reply to: jian he (#8)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Thu, Jun 20, 2024 at 9:01 AM jian he <jian.universality@gmail.com> wrote:

On Thu, Jun 20, 2024 at 5:46 PM Amit Langote <amitlangote09@gmail.com>
wrote:

On Thu, Jun 20, 2024 at 1:03 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Wed, Jun 19, 2024 at 8:29 AM jian he <jian.universality@gmail.com>

wrote:

On Mon, Jun 17, 2024 at 9:05 PM Chapman Flack <jcflack@acm.org>

wrote:

Hi,

On 06/17/24 02:43, Amit Langote wrote:

<replaceable>context_item</replaceable> expression can be a value

of

any type that can be cast to <type>jsonb</type>. This includes

types

such as <type>char</type>, <type>text</type>,

<type>bpchar</type>,

<type>character varying</type>, and <type>bytea</type> (with
<code>ENCODING UTF8</code>), as well as any domains over these

types.

Reading this message in conjunction with [0] makes me think that we

are

really talking about a function that takes a first parameter of

type jsonb,

and behaves exactly that way (so any cast required is applied by

the system

ahead of the call). Under those conditions, this seems like an

unusual

sentence to add in the docs, at least until we have also documented

that

tan's argument can be of any type that can be cast to double

precision.

I guess it would be fine to add an unusual sentence to the docs.

imagine a function: array_avg(anyarray) returns anyelement.
array_avg calculate an array's elements's avg. like
array('{1,2,3}'::int[]) returns 2.
but array_avg won't make sense if the input argument is a date array.
so mentioning in the doc: array_avg can accept anyarray, but anyarray
cannot date array.
seems ok.

There is existing wording for this:

"The expression can be of any JSON type, any character string type, or

bytea in UTF8 encoding."

If you add this sentence to the paragraph the link that already

exists, which simply points the reader to this sentence, becomes redundant
and should be removed.

I've just posted a patch in the other thread [1] to restrict
context_item to be of jsonb type, which users would need to ensure by
adding an explicit cast if needed. I think that makes this
clarification unnecessary.

As for table 9.16.3 - it is unwieldy already. Lets try and make the

core syntax shorter, not longer. We already have precedence in the
subsequent json_table section - give each major clause item a name then
below the table define the syntax and meaning for those names. Unlike in
that section - which probably should be modified too - context_item should
have its own description line.

I had posted a patch a little while ago at [1] to render the syntax a
bit differently with each function getting its own syntax synopsis.
Resending it here; have addressed Jian He's comments.

--

I was thinking more like:

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c324906b22..b9d157663a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18692,8 +18692,10 @@ $.* ? (@ like_regex "^\\d+$")
       <entry role="func_table_entry"><para role="func_signature">
         <indexterm><primary>json_exists</primary></indexterm>
         <function>json_exists</function> (
-        <replaceable>context_item</replaceable>,
<replaceable>path_expression</replaceable> <optional>
<literal>PASSING</literal> { <replaceable>value</replaceable>
<literal>AS</literal> <replaceable>varname</replaceable> } <optional>,
...</optional></optional>
-        <optional> { <literal>TRUE</literal> | <literal>FALSE</literal>
|<literal> UNKNOWN</literal> | <literal>ERROR</literal> } <literal>ON
ERROR</literal> </optional>)
+        <replaceable>context_item</replaceable>,
+        <replaceable>path_expression</replaceable>
+        <optional>variable_definitions</optional>
+        <optional>on_error_boolean</optional>)
        </para>
        <para>
         Returns true if the SQL/JSON
<replaceable>path_expression</replaceable>
@@ -18732,12 +18734,14 @@ ERROR:  jsonpath array subscript is out of bounds
       <entry role="func_table_entry"><para role="func_signature">
         <indexterm><primary>json_query</primary></indexterm>
         <function>json_query</function> (
-        <replaceable>context_item</replaceable>,
<replaceable>path_expression</replaceable> <optional>
<literal>PASSING</literal> { <replaceable>value</replaceable>
<literal>AS</literal> <replaceable>varname</replaceable> } <optional>,
...</optional></optional>
-        <optional> <literal>RETURNING</literal>
<replaceable>data_type</replaceable> <optional> <literal>FORMAT
JSON</literal> <optional> <literal>ENCODING UTF8</literal> </optional>
</optional> </optional>
-        <optional> { <literal>WITHOUT</literal> | <literal>WITH</literal>
{ <literal>CONDITIONAL</literal> |
<optional><literal>UNCONDITIONAL</literal></optional> } } <optional>
<literal>ARRAY</literal> </optional> <literal>WRAPPER</literal> </optional>
-        <optional> { <literal>KEEP</literal> | <literal>OMIT</literal> }
<literal>QUOTES</literal> <optional> <literal>ON SCALAR STRING</literal>
</optional> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> |
<literal>EMPTY</literal> { <optional> <literal>ARRAY</literal> </optional>
| <literal>OBJECT</literal> } | <literal>DEFAULT</literal>
<replaceable>expression</replaceable> } <literal>ON EMPTY</literal>
</optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> |
<literal>EMPTY</literal> { <optional> <literal>ARRAY</literal> </optional>
| <literal>OBJECT</literal> } | <literal>DEFAULT</literal>
<replaceable>expression</replaceable> } <literal>ON ERROR</literal>
</optional>)
+        <replaceable>context_item</replaceable>,
+        <replaceable>path_expression</replaceable>
+        <optional>variable_definitions</optional>
+        <optional>return_clause</optional>
+        <optional>wrapping_clause</optional>
+        <optional>quoting_clause</optional>
+        <optional>on_empty_set</optional>
+        <optional>on_error_set</optional>)
       </para>
        <para>
         Returns the result of applying the SQL/JSON
@@ -18809,11 +18813,12 @@ DETAIL:  Missing "]" after array dimensions.
       <entry role="func_table_entry"><para role="func_signature">
         <indexterm><primary>json_value</primary></indexterm>
         <function>json_value</function> (
-        <replaceable>context_item</replaceable>,
<replaceable>path_expression</replaceable>
-        <optional> <literal>PASSING</literal> {
<replaceable>value</replaceable> <literal>AS</literal>
<replaceable>varname</replaceable> } <optional>, ...</optional></optional>
-        <optional> <literal>RETURNING</literal>
<replaceable>data_type</replaceable> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> |
<literal>DEFAULT</literal> <replaceable>expression</replaceable> }
<literal>ON EMPTY</literal> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> |
<literal>DEFAULT</literal> <replaceable>expression</replaceable> }
<literal>ON ERROR</literal> </optional>)
+        <replaceable>context_item</replaceable>,
+        <replaceable>path_expression</replaceable>
+        <optional>variable_definitions</optional>
+        <optional>return_type</optional>
+        <optional>on_empty_value</optional>
+        <optional>on_error_value</optional>)
        </para>
        <para>
         Returns the result of applying the SQL/JSON

Then defining each of those below the table - keeping the on_error variants
together.

playing around with it.
found some minor issues:

json_exists allow: DEFAULT expression ON ERROR, which is not
mentioned in the doc.
for example:
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default
true ON ERROR);
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 0 ON
ERROR);
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 11 ON
ERROR);

Yeah, surprised it works, the documented behavior seems logical. Being
able to return a non-boolean here seems odd. Especially since it is cast
to boolean on output.

JSON_VALUE on error, on empty semantics should be the same as json_query.
like:
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON ERROR ])

examples:
select JSON_value(jsonb '[]' , '$' empty array on error);
select JSON_value(jsonb '[]' , '$' empty object on error);

Again the documented behavior seems to make sense though and the ability to
specify empty in the value function seems like a bug. If you really want
an empty array or object you do have access to default. The reason
json_query provides for an empty array/object is that it is already
expecting to produce an array (object seems a little odd).

I agree our docs and code do not match which needs to be fixed, ideally in
the direction of the standard which I'm guessing our documentation is based
off of. But let's not go off of my guess.

David J.

#10Amit Langote
amitlangote09@gmail.com
In reply to: David G. Johnston (#9)
1 attachment(s)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Fri, Jun 21, 2024 at 9:47 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Thu, Jun 20, 2024 at 9:01 AM jian he <jian.universality@gmail.com> wrote:

On Thu, Jun 20, 2024 at 5:46 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Thu, Jun 20, 2024 at 1:03 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Wed, Jun 19, 2024 at 8:29 AM jian he <jian.universality@gmail.com> wrote:

On Mon, Jun 17, 2024 at 9:05 PM Chapman Flack <jcflack@acm.org> wrote:

Hi,

On 06/17/24 02:43, Amit Langote wrote:

<replaceable>context_item</replaceable> expression can be a value of
any type that can be cast to <type>jsonb</type>. This includes types
such as <type>char</type>, <type>text</type>, <type>bpchar</type>,
<type>character varying</type>, and <type>bytea</type> (with
<code>ENCODING UTF8</code>), as well as any domains over these types.

Reading this message in conjunction with [0] makes me think that we are
really talking about a function that takes a first parameter of type jsonb,
and behaves exactly that way (so any cast required is applied by the system
ahead of the call). Under those conditions, this seems like an unusual
sentence to add in the docs, at least until we have also documented that
tan's argument can be of any type that can be cast to double precision.

I guess it would be fine to add an unusual sentence to the docs.

imagine a function: array_avg(anyarray) returns anyelement.
array_avg calculate an array's elements's avg. like
array('{1,2,3}'::int[]) returns 2.
but array_avg won't make sense if the input argument is a date array.
so mentioning in the doc: array_avg can accept anyarray, but anyarray
cannot date array.
seems ok.

There is existing wording for this:

"The expression can be of any JSON type, any character string type, or bytea in UTF8 encoding."

If you add this sentence to the paragraph the link that already exists, which simply points the reader to this sentence, becomes redundant and should be removed.

I've just posted a patch in the other thread [1] to restrict
context_item to be of jsonb type, which users would need to ensure by
adding an explicit cast if needed. I think that makes this
clarification unnecessary.

As for table 9.16.3 - it is unwieldy already. Lets try and make the core syntax shorter, not longer. We already have precedence in the subsequent json_table section - give each major clause item a name then below the table define the syntax and meaning for those names. Unlike in that section - which probably should be modified too - context_item should have its own description line.

I had posted a patch a little while ago at [1] to render the syntax a
bit differently with each function getting its own syntax synopsis.
Resending it here; have addressed Jian He's comments.

--

I was thinking more like:

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c324906b22..b9d157663a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18692,8 +18692,10 @@ $.* ? (@ like_regex "^\\d+$")
<entry role="func_table_entry"><para role="func_signature">
<indexterm><primary>json_exists</primary></indexterm>
<function>json_exists</function> (
-        <replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable> <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
-        <optional> { <literal>TRUE</literal> | <literal>FALSE</literal> |<literal> UNKNOWN</literal> | <literal>ERROR</literal> } <literal>ON ERROR</literal> </optional>)
+        <replaceable>context_item</replaceable>,
+        <replaceable>path_expression</replaceable>
+        <optional>variable_definitions</optional>
+        <optional>on_error_boolean</optional>)
</para>
<para>
Returns true if the SQL/JSON <replaceable>path_expression</replaceable>
@@ -18732,12 +18734,14 @@ ERROR:  jsonpath array subscript is out of bounds
<entry role="func_table_entry"><para role="func_signature">
<indexterm><primary>json_query</primary></indexterm>
<function>json_query</function> (
-        <replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable> <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
-        <optional> <literal>RETURNING</literal> <replaceable>data_type</replaceable> <optional> <literal>FORMAT JSON</literal> <optional> <literal>ENCODING UTF8</literal> </optional> </optional> </optional>
-        <optional> { <literal>WITHOUT</literal> | <literal>WITH</literal> { <literal>CONDITIONAL</literal> | <optional><literal>UNCONDITIONAL</literal></optional> } } <optional> <literal>ARRAY</literal> </optional> <literal>WRAPPER</literal> </optional>
-        <optional> { <literal>KEEP</literal> | <literal>OMIT</literal> } <literal>QUOTES</literal> <optional> <literal>ON SCALAR STRING</literal> </optional> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>EMPTY</literal> { <optional> <literal>ARRAY</literal> </optional> | <literal>OBJECT</literal> } | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON EMPTY</literal> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>EMPTY</literal> { <optional> <literal>ARRAY</literal> </optional> | <literal>OBJECT</literal> } | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON ERROR</literal> </optional>)
+        <replaceable>context_item</replaceable>,
+        <replaceable>path_expression</replaceable>
+        <optional>variable_definitions</optional>
+        <optional>return_clause</optional>
+        <optional>wrapping_clause</optional>
+        <optional>quoting_clause</optional>
+        <optional>on_empty_set</optional>
+        <optional>on_error_set</optional>)
</para>
<para>
Returns the result of applying the SQL/JSON
@@ -18809,11 +18813,12 @@ DETAIL:  Missing "]" after array dimensions.
<entry role="func_table_entry"><para role="func_signature">
<indexterm><primary>json_value</primary></indexterm>
<function>json_value</function> (
-        <replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable>
-        <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional>
-        <optional> <literal>RETURNING</literal> <replaceable>data_type</replaceable> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON EMPTY</literal> </optional>
-        <optional> { <literal>ERROR</literal> | <literal>NULL</literal> | <literal>DEFAULT</literal> <replaceable>expression</replaceable> } <literal>ON ERROR</literal> </optional>)
+        <replaceable>context_item</replaceable>,
+        <replaceable>path_expression</replaceable>
+        <optional>variable_definitions</optional>
+        <optional>return_type</optional>
+        <optional>on_empty_value</optional>
+        <optional>on_error_value</optional>)
</para>
<para>
Returns the result of applying the SQL/JSON

Then defining each of those below the table - keeping the on_error variants together.

That sounds appealing. I'll try to come up with a patch unless you or
anyone else wants to take a stab at it.

playing around with it.
found some minor issues:

json_exists allow: DEFAULT expression ON ERROR, which is not
mentioned in the doc.
for example:
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default
true ON ERROR);
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 0 ON ERROR);
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 11 ON ERROR);

Yeah, surprised it works, the documented behavior seems logical. Being able to return a non-boolean here seems odd. Especially since it is cast to boolean on output.

JSON_VALUE on error, on empty semantics should be the same as json_query.
like:
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON ERROR ])

examples:
select JSON_value(jsonb '[]' , '$' empty array on error);
select JSON_value(jsonb '[]' , '$' empty object on error);

Again the documented behavior seems to make sense though and the ability to specify empty in the value function seems like a bug. If you really want an empty array or object you do have access to default. The reason json_query provides for an empty array/object is that it is already expecting to produce an array (object seems a little odd).

I agree our docs and code do not match which needs to be fixed, ideally in the direction of the standard which I'm guessing our documentation is based off of. But let's not go off of my guess.

Oops, that is indeed not great and, yes, the problem is code not
matching the documentation, the latter of which is "correct".

Basically, the grammar allows specifying any of the all possible ON
ERROR/EMPTY behavior values irrespective of the function, so parse
analysis should be catching and flagging an attempt to specify
incompatible value for a given function, which it does not.

I've attached a patch to fix that, which I'd like to push before
anything else we've been discussing.

--
Thanks, Amit Langote

Attachments:

v1-0001-SQL-JSON-Disallow-incompatible-values-in-ON-ERROR.patchapplication/octet-stream; name=v1-0001-SQL-JSON-Disallow-incompatible-values-in-ON-ERROR.patchDownload
From 081ffeddc12b2ae0fb3bb11907022326346e3805 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlan@postgresql.org>
Date: Fri, 21 Jun 2024 20:55:31 +0900
Subject: [PATCH v1] SQL/JSON: Disallow incompatible values in ON ERROR/EMPTY

Currently, the grammar allows specifying any of the all supported
values in the ON ERROR and ON EMPTY clause for each SQL/JSON function
that supports those clauses.  But the semantics of each of those
function allows only a subset of the values for any given function.
So, check during parse analysis that the provided value is valid for
the given function and throw a syntax error if not.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxFgWGqpESSYzyJ6tSurr3vFYBSNEmCfkGyB_dMdptFnZQ%40mail.gmail.com
---
 src/backend/parser/parse_expr.c               | 72 +++++++++++++++++--
 .../regress/expected/sqljson_jsontable.out    |  6 ++
 .../regress/expected/sqljson_queryfuncs.out   | 16 +++++
 src/test/regress/sql/sqljson_jsontable.sql    |  3 +
 src/test/regress/sql/sqljson_queryfuncs.sql   |  5 ++
 5 files changed, 96 insertions(+), 6 deletions(-)

diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 00cd7358eb..3d79d17125 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -4300,14 +4300,74 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
 	}
 
 	/* OMIT QUOTES is meaningless when strings are wrapped. */
-	if (func->op == JSON_QUERY_OP &&
-		func->quotes == JS_QUOTES_OMIT &&
-		(func->wrapper == JSW_CONDITIONAL ||
-		 func->wrapper == JSW_UNCONDITIONAL))
+	if (func->op == JSON_QUERY_OP)
+	{
+		if (func->quotes == JS_QUOTES_OMIT &&
+			(func->wrapper == JSW_CONDITIONAL ||
+			 func->wrapper == JSW_UNCONDITIONAL))
+			ereport(ERROR,
+					errcode(ERRCODE_SYNTAX_ERROR),
+					errmsg("SQL/JSON QUOTES behavior must not be specified when WITH WRAPPER is used"),
+					parser_errposition(pstate, func->location));
+		if (func->on_empty != NULL &&
+			func->on_empty->btype != JSON_BEHAVIOR_ERROR &&
+			func->on_empty->btype != JSON_BEHAVIOR_NULL &&
+			func->on_empty->btype != JSON_BEHAVIOR_EMPTY &&
+			func->on_empty->btype != JSON_BEHAVIOR_EMPTY_ARRAY &&
+			func->on_empty->btype != JSON_BEHAVIOR_EMPTY_OBJECT &&
+			func->on_empty->btype != JSON_BEHAVIOR_DEFAULT)
+			ereport(ERROR,
+					errcode(ERRCODE_SYNTAX_ERROR),
+					errmsg("invalid ON EMPTY behavior"),
+					errdetail("Only ERROR, NULL, EMPTY [ ARRAY | OBJECT }, or DEFAULT <value> is allowed in ON EMPTY for JSON_QUERY()."),
+					parser_errposition(pstate, func->on_empty->location));
+		if (func->on_error != NULL &&
+			func->on_error->btype != JSON_BEHAVIOR_ERROR &&
+			func->on_error->btype != JSON_BEHAVIOR_NULL &&
+			func->on_error->btype != JSON_BEHAVIOR_EMPTY &&
+			func->on_error->btype != JSON_BEHAVIOR_EMPTY_ARRAY &&
+			func->on_error->btype != JSON_BEHAVIOR_EMPTY_OBJECT &&
+			func->on_error->btype != JSON_BEHAVIOR_DEFAULT)
+			ereport(ERROR,
+					errcode(ERRCODE_SYNTAX_ERROR),
+					errmsg("invalid ON ERROR behavior"),
+					errdetail("Only ERROR, NULL, EMPTY [ ARRAY | OBJECT }, or DEFAULT <value> is allowed in ON ERROR for JSON_QUERY()."),
+					parser_errposition(pstate, func->on_error->location));
+	}
+
+	/* Check that ON ERROR/EMPTY behavior values are valid for the function. */
+	if (func->op == JSON_EXISTS_OP &&
+		func->on_error != NULL &&
+		func->on_error->btype != JSON_BEHAVIOR_ERROR &&
+		func->on_error->btype != JSON_BEHAVIOR_TRUE &&
+		func->on_error->btype != JSON_BEHAVIOR_FALSE &&
+		func->on_error->btype != JSON_BEHAVIOR_UNKNOWN)
 		ereport(ERROR,
 				errcode(ERRCODE_SYNTAX_ERROR),
-				errmsg("SQL/JSON QUOTES behavior must not be specified when WITH WRAPPER is used"),
-				parser_errposition(pstate, func->location));
+				errmsg("invalid ON ERROR behavior"),
+				errdetail("Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for JSON_EXISTS()."),
+				parser_errposition(pstate, func->on_error->location));
+	if (func->op == JSON_VALUE_OP)
+	{
+		if (func->on_empty != NULL &&
+			func->on_empty->btype != JSON_BEHAVIOR_ERROR &&
+			func->on_empty->btype != JSON_BEHAVIOR_NULL &&
+			func->on_empty->btype != JSON_BEHAVIOR_DEFAULT)
+			ereport(ERROR,
+					errcode(ERRCODE_SYNTAX_ERROR),
+					errmsg("invalid ON ERROR behavior"),
+					errdetail("Only ERROR, NULL, or DEFAULT <value> is allowed in ON EMPTY for JSON_VALUE()."),
+					parser_errposition(pstate, func->on_empty->location));
+		if (func->on_error != NULL &&
+			func->on_error->btype != JSON_BEHAVIOR_ERROR &&
+			 func->on_error->btype != JSON_BEHAVIOR_NULL &&
+			 func->on_error->btype != JSON_BEHAVIOR_DEFAULT)
+			ereport(ERROR,
+					errcode(ERRCODE_SYNTAX_ERROR),
+					errmsg("invalid ON EMPTY behavior"),
+					errdetail("Only ERROR, NULL, or DEFAULT <value> is allowed in ON ERROR for JSON_VALUE()."),
+					parser_errposition(pstate, func->on_error->location));
+	}
 
 	jsexpr = makeNode(JsonExpr);
 	jsexpr->location = func->location;
diff --git a/src/test/regress/expected/sqljson_jsontable.out b/src/test/regress/expected/sqljson_jsontable.out
index cee90cead1..eace29efbf 100644
--- a/src/test/regress/expected/sqljson_jsontable.out
+++ b/src/test/regress/expected/sqljson_jsontable.out
@@ -1067,3 +1067,9 @@ CREATE OR REPLACE VIEW public.jsonb_table_view7 AS
         ) sub
 DROP VIEW jsonb_table_view7;
 DROP TABLE s;
+-- Test ON ERROR / EMPTY value validity for the function
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int) NULL ON ERROR);	-- fail
+ERROR:  invalid ON ERROR behavior
+LINE 1: ... * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int) NULL ON ER...
+                                                             ^
+DETAIL:  Only EMPTY or ERROR is allowed in the top-level ON ERROR clause.
diff --git a/src/test/regress/expected/sqljson_queryfuncs.out b/src/test/regress/expected/sqljson_queryfuncs.out
index 9cb250a27a..0c4b62b0bb 100644
--- a/src/test/regress/expected/sqljson_queryfuncs.out
+++ b/src/test/regress/expected/sqljson_queryfuncs.out
@@ -1353,3 +1353,19 @@ SELECT JSON_QUERY(jsonb 'null', '$xyz' PASSING 1 AS xyz);
  1
 (1 row)
 
+-- Test ON ERROR / EMPTY value validity for the function; all fail.
+SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR);
+ERROR:  invalid ON ERROR behavior
+LINE 1: SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR);
+                                          ^
+DETAIL:  Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for JSON_EXISTS().
+SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
+ERROR:  invalid ON EMPTY behavior
+LINE 1: SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
+                                         ^
+DETAIL:  Only ERROR, NULL, or DEFAULT <value> is allowed in ON ERROR for JSON_VALUE().
+SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
+ERROR:  invalid ON ERROR behavior
+LINE 1: SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
+                                         ^
+DETAIL:  Only ERROR, NULL, EMPTY [ ARRAY | OBJECT }, or DEFAULT <value> is allowed in ON ERROR for JSON_QUERY().
diff --git a/src/test/regress/sql/sqljson_jsontable.sql b/src/test/regress/sql/sqljson_jsontable.sql
index a1f924146e..1f81464c5f 100644
--- a/src/test/regress/sql/sqljson_jsontable.sql
+++ b/src/test/regress/sql/sqljson_jsontable.sql
@@ -518,3 +518,6 @@ SELECT sub.* FROM s,
 \sv jsonb_table_view7
 DROP VIEW jsonb_table_view7;
 DROP TABLE s;
+
+-- Test ON ERROR / EMPTY value validity for the function
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int) NULL ON ERROR);	-- fail
diff --git a/src/test/regress/sql/sqljson_queryfuncs.sql b/src/test/regress/sql/sqljson_queryfuncs.sql
index dc6380141b..4586fdb8a4 100644
--- a/src/test/regress/sql/sqljson_queryfuncs.sql
+++ b/src/test/regress/sql/sqljson_queryfuncs.sql
@@ -459,3 +459,8 @@ SELECT json_value('"aaa"', path RETURNING json) FROM jsonpaths;
 SELECT JSON_QUERY(jsonb 'null', '$xyz' PASSING 1 AS xy);
 SELECT JSON_QUERY(jsonb 'null', '$xy' PASSING 1 AS xyz);
 SELECT JSON_QUERY(jsonb 'null', '$xyz' PASSING 1 AS xyz);
+
+-- Test ON ERROR / EMPTY value validity for the function; all fail.
+SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR);
+SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
+SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
-- 
2.43.0

#11Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#10)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Fri, Jun 21, 2024 at 9:18 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Fri, Jun 21, 2024 at 9:47 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Thu, Jun 20, 2024 at 9:01 AM jian he <jian.universality@gmail.com> wrote:

playing around with it.
found some minor issues:

json_exists allow: DEFAULT expression ON ERROR, which is not
mentioned in the doc.
for example:
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default
true ON ERROR);
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 0 ON ERROR);
select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 11 ON ERROR);

Yeah, surprised it works, the documented behavior seems logical. Being able to return a non-boolean here seems odd. Especially since it is cast to boolean on output.

JSON_VALUE on error, on empty semantics should be the same as json_query.
like:
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON ERROR ])

examples:
select JSON_value(jsonb '[]' , '$' empty array on error);
select JSON_value(jsonb '[]' , '$' empty object on error);

Again the documented behavior seems to make sense though and the ability to specify empty in the value function seems like a bug. If you really want an empty array or object you do have access to default. The reason json_query provides for an empty array/object is that it is already expecting to produce an array (object seems a little odd).

I agree our docs and code do not match which needs to be fixed, ideally in the direction of the standard which I'm guessing our documentation is based off of. But let's not go off of my guess.

Oops, that is indeed not great and, yes, the problem is code not
matching the documentation, the latter of which is "correct".

Basically, the grammar allows specifying any of the all possible ON
ERROR/EMPTY behavior values irrespective of the function, so parse
analysis should be catching and flagging an attempt to specify
incompatible value for a given function, which it does not.

I've attached a patch to fix that, which I'd like to push before
anything else we've been discussing.

While there are still a few hours to go before Saturday noon UTC when
beta2 freeze goes into effect, I'm thinking to just push this after
beta2 is stamped. Adding an open item for now.

--
Thanks, Amit Langote

#12jian he
jian.universality@gmail.com
In reply to: Amit Langote (#10)
1 attachment(s)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Fri, Jun 21, 2024 at 8:18 PM Amit Langote <amitlangote09@gmail.com> wrote:

JSON_VALUE on error, on empty semantics should be the same as json_query.
like:
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON ERROR ])

examples:
select JSON_value(jsonb '[]' , '$' empty array on error);
select JSON_value(jsonb '[]' , '$' empty object on error);

Again the documented behavior seems to make sense though and the ability to specify empty in the value function seems like a bug. If you really want an empty array or object you do have access to default. The reason json_query provides for an empty array/object is that it is already expecting to produce an array (object seems a little odd).

I agree our docs and code do not match which needs to be fixed, ideally in the direction of the standard which I'm guessing our documentation is based off of. But let's not go off of my guess.

Oops, that is indeed not great and, yes, the problem is code not
matching the documentation, the latter of which is "correct".

Basically, the grammar allows specifying any of the all possible ON
ERROR/EMPTY behavior values irrespective of the function, so parse
analysis should be catching and flagging an attempt to specify
incompatible value for a given function, which it does not.

I've attached a patch to fix that, which I'd like to push before
anything else we've been discussing.

+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid ON ERROR behavior"),
+ errdetail("Only ERROR, NULL, EMPTY [ ARRAY | OBJECT }, or DEFAULT
<value> is allowed in ON ERROR for JSON_QUERY()."),
+ parser_errposition(pstate, func->on_error->location));

`EMPTY [ ARRAY | OBJECT }` seems not correct,
maybe just EMPTY, EMPTY ARRAY, EMPTY OBJECT.
(apply to other places)

`DEFAULT <value>`
`DEFAULT <expression>` or just `DEFAULT expression` would be more correct?
(apply to other places)

I think we should make json_query, json_value on empty, on error
behave the same way.
otherwise, it will have consistency issues for scalar jsonb.
for example, we should expect the following two queries to return the
same result?
SELECT * FROM JSON_query(jsonb '1', '$.a' returning jsonb empty on empty);
SELECT * FROM JSON_value(jsonb '1', '$.a' returning jsonb empty on empty);

Also the json_table function will call json_value or json_query,
make these two functions on error, on empty behavior the same can
reduce unintended complexity.

So based on your
patch(v1-0001-SQL-JSON-Disallow-incompatible-values-in-ON-ERROR.patch)
and the above points, I have made some changes, attached.
it will make json_value, json_query not allow {true | false | unknown
} on error, {true | false | unknown } on empty.
json_table error message deal the same way as
b4fad46b6bc8a9bf46ff689bcb1bd4edf8f267af

BTW,
i found one JSON_TABLE document deficiency
[ { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT
expression } ON EMPTY ]
[ { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT
expression } ON ERROR ]

it should be

[ { ERROR | NULL | EMPTY { [ARRAY] | OBJECT } | DEFAULT
expression } ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ARRAY] | OBJECT } | DEFAULT
expression } ON ERROR ]

Attachments:

v2-0002-Disallow-incompatible-values-in-ON-ERROR-EMPTY.patchapplication/x-patch; name=v2-0002-Disallow-incompatible-values-in-ON-ERROR-EMPTY.patchDownload
From 0a8b756a56f3750ad07f09d7933c9cae0218ec93 Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Sat, 22 Jun 2024 17:28:53 +0800
Subject: [PATCH v2 2/2] Disallow incompatible values in ON ERROR/EMPTY

make json_exists only allow
     { TRUE | FALSE | UNKNOWN | ERROR } ON ERROR

make json_query, json_value only allow
    { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression } ON EMPTY
    { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression } ON ERROR

make json_value, json_query on empty, on error behave the same way,
because for scalar jsonb, json_value, json_query behave the same.
so sometimes they may need to return the same value.

also json_table function may call json_value or json_query,
make these two function on error, on empty behavior the same can reduce unintended complex.
---
 src/backend/parser/parse_expr.c               | 143 ++++++++++--------
 .../regress/expected/sqljson_jsontable.out    |  30 ++++
 .../regress/expected/sqljson_queryfuncs.out   |  11 +-
 src/test/regress/sql/sqljson_jsontable.sql    |   7 +
 src/test/regress/sql/sqljson_queryfuncs.sql   |   2 +-
 5 files changed, 126 insertions(+), 67 deletions(-)

diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d79d171..f3ffa88f 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -4299,74 +4299,95 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
 					parser_errposition(pstate, format->location));
 	}
 
-	/* OMIT QUOTES is meaningless when strings are wrapped. */
+	/* check json_query, json_value, on error, on empty behavior */
+	if (func->op == JSON_QUERY_OP || func->op == JSON_VALUE_OP)
+	{
+		if (func->on_empty != NULL &&
+		   (func->on_empty->btype == JSON_BEHAVIOR_TRUE ||
+			func->on_empty->btype == JSON_BEHAVIOR_FALSE ||
+			func->on_empty->btype == JSON_BEHAVIOR_UNKNOWN))
+		{
+			if (func->column_name)
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON EMPTY behavior"),
+						errdetail("Only ERROR, NULL, EMPTY, EMPTY ARRAY, EMPTY OBJECT, or DEFAULT expression is allowed in ON EMPTY for column \"%s\".",
+								   func->column_name),
+						parser_errposition(pstate, func->on_empty->location));
+			else
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON EMPTY behavior"),
+						errdetail("Only ERROR, NULL, EMPTY, EMPTY ARRAY, EMPTY OBJECT, or DEFAULT expression is allowed in ON EMPTY for %s().",
+								   func_name),
+						parser_errposition(pstate, func->on_empty->location));
+		}
+
+		if (func->on_error != NULL &&
+		   (func->on_error->btype == JSON_BEHAVIOR_TRUE ||
+			func->on_error->btype == JSON_BEHAVIOR_FALSE ||
+			func->on_error->btype == JSON_BEHAVIOR_UNKNOWN))
+		{
+			if (func->column_name)
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON ERROR behavior"),
+						errdetail("Only ERROR, NULL, EMPTY, EMPTY ARRAY, EMPTY OBJECT, or DEFAULT expression is allowed in ON ERROR for column \"%s\".",
+								   func->column_name),
+						parser_errposition(pstate, func->on_error->location));
+			else
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON ERROR behavior"),
+						errdetail("Only ERROR, NULL, EMPTY, EMPTY ARRAY, EMPTY OBJECT, or DEFAULT expression is allowed in ON ERROR for %s().",
+								   func_name),
+						parser_errposition(pstate, func->on_error->location));
+		}
+	}
+
+	/* Check that ON ERROR/EMPTY behavior values are valid for json_exists. */
+	if (func->op == JSON_EXISTS_OP)
+	{
+		if (func->column_name && func->on_empty != NULL)
+			ereport(ERROR,
+					errcode(ERRCODE_SYNTAX_ERROR),
+					errmsg("invalid ON EMPTY behavior"),
+					errdetail("ON EMPTY is not allowed for column \"%s\".",
+							   func->column_name),
+					parser_errposition(pstate, func->on_empty->location));
+
+  		if (func->on_error != NULL &&
+			func->on_error->btype != JSON_BEHAVIOR_ERROR &&
+			func->on_error->btype != JSON_BEHAVIOR_TRUE &&
+			func->on_error->btype != JSON_BEHAVIOR_FALSE &&
+			func->on_error->btype != JSON_BEHAVIOR_UNKNOWN)
+		{
+			if (!func->column_name)
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON ERROR behavior"),
+						errdetail("Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for JSON_EXISTS()."),
+						parser_errposition(pstate, func->on_error->location));
+			else
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON ERROR behavior"),
+						errdetail("Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for column \"%s\".",
+								   func->column_name),
+						parser_errposition(pstate, func->on_error->location));
+		}
+	}
+
 	if (func->op == JSON_QUERY_OP)
 	{
+		/* OMIT QUOTES is meaningless when strings are wrapped. */
 		if (func->quotes == JS_QUOTES_OMIT &&
-			(func->wrapper == JSW_CONDITIONAL ||
-			 func->wrapper == JSW_UNCONDITIONAL))
+		   (func->wrapper == JSW_CONDITIONAL ||
+		 	func->wrapper == JSW_UNCONDITIONAL))
 			ereport(ERROR,
 					errcode(ERRCODE_SYNTAX_ERROR),
 					errmsg("SQL/JSON QUOTES behavior must not be specified when WITH WRAPPER is used"),
 					parser_errposition(pstate, func->location));
-		if (func->on_empty != NULL &&
-			func->on_empty->btype != JSON_BEHAVIOR_ERROR &&
-			func->on_empty->btype != JSON_BEHAVIOR_NULL &&
-			func->on_empty->btype != JSON_BEHAVIOR_EMPTY &&
-			func->on_empty->btype != JSON_BEHAVIOR_EMPTY_ARRAY &&
-			func->on_empty->btype != JSON_BEHAVIOR_EMPTY_OBJECT &&
-			func->on_empty->btype != JSON_BEHAVIOR_DEFAULT)
-			ereport(ERROR,
-					errcode(ERRCODE_SYNTAX_ERROR),
-					errmsg("invalid ON EMPTY behavior"),
-					errdetail("Only ERROR, NULL, EMPTY [ ARRAY | OBJECT }, or DEFAULT <value> is allowed in ON EMPTY for JSON_QUERY()."),
-					parser_errposition(pstate, func->on_empty->location));
-		if (func->on_error != NULL &&
-			func->on_error->btype != JSON_BEHAVIOR_ERROR &&
-			func->on_error->btype != JSON_BEHAVIOR_NULL &&
-			func->on_error->btype != JSON_BEHAVIOR_EMPTY &&
-			func->on_error->btype != JSON_BEHAVIOR_EMPTY_ARRAY &&
-			func->on_error->btype != JSON_BEHAVIOR_EMPTY_OBJECT &&
-			func->on_error->btype != JSON_BEHAVIOR_DEFAULT)
-			ereport(ERROR,
-					errcode(ERRCODE_SYNTAX_ERROR),
-					errmsg("invalid ON ERROR behavior"),
-					errdetail("Only ERROR, NULL, EMPTY [ ARRAY | OBJECT }, or DEFAULT <value> is allowed in ON ERROR for JSON_QUERY()."),
-					parser_errposition(pstate, func->on_error->location));
-	}
-
-	/* Check that ON ERROR/EMPTY behavior values are valid for the function. */
-	if (func->op == JSON_EXISTS_OP &&
-		func->on_error != NULL &&
-		func->on_error->btype != JSON_BEHAVIOR_ERROR &&
-		func->on_error->btype != JSON_BEHAVIOR_TRUE &&
-		func->on_error->btype != JSON_BEHAVIOR_FALSE &&
-		func->on_error->btype != JSON_BEHAVIOR_UNKNOWN)
-		ereport(ERROR,
-				errcode(ERRCODE_SYNTAX_ERROR),
-				errmsg("invalid ON ERROR behavior"),
-				errdetail("Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for JSON_EXISTS()."),
-				parser_errposition(pstate, func->on_error->location));
-	if (func->op == JSON_VALUE_OP)
-	{
-		if (func->on_empty != NULL &&
-			func->on_empty->btype != JSON_BEHAVIOR_ERROR &&
-			func->on_empty->btype != JSON_BEHAVIOR_NULL &&
-			func->on_empty->btype != JSON_BEHAVIOR_DEFAULT)
-			ereport(ERROR,
-					errcode(ERRCODE_SYNTAX_ERROR),
-					errmsg("invalid ON ERROR behavior"),
-					errdetail("Only ERROR, NULL, or DEFAULT <value> is allowed in ON EMPTY for JSON_VALUE()."),
-					parser_errposition(pstate, func->on_empty->location));
-		if (func->on_error != NULL &&
-			func->on_error->btype != JSON_BEHAVIOR_ERROR &&
-			 func->on_error->btype != JSON_BEHAVIOR_NULL &&
-			 func->on_error->btype != JSON_BEHAVIOR_DEFAULT)
-			ereport(ERROR,
-					errcode(ERRCODE_SYNTAX_ERROR),
-					errmsg("invalid ON EMPTY behavior"),
-					errdetail("Only ERROR, NULL, or DEFAULT <value> is allowed in ON ERROR for JSON_VALUE()."),
-					parser_errposition(pstate, func->on_error->location));
 	}
 
 	jsexpr = makeNode(JsonExpr);
diff --git a/src/test/regress/expected/sqljson_jsontable.out b/src/test/regress/expected/sqljson_jsontable.out
index eace29ef..8d372764 100644
--- a/src/test/regress/expected/sqljson_jsontable.out
+++ b/src/test/regress/expected/sqljson_jsontable.out
@@ -1073,3 +1073,33 @@ ERROR:  invalid ON ERROR behavior
 LINE 1: ... * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int) NULL ON ER...
                                                              ^
 DETAIL:  Only EMPTY or ERROR is allowed in the top-level ON ERROR clause.
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int true on empty));
+ERROR:  invalid ON EMPTY behavior
+LINE 1: ...T * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int true on em...
+                                                             ^
+DETAIL:  Only ERROR, NULL, EMPTY, EMPTY ARRAY, EMPTY OBJECT, or DEFAULT expression is allowed in ON EMPTY for column "a".
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int omit quotes true on error));
+ERROR:  invalid ON ERROR behavior
+LINE 1: ...N_TABLE(jsonb '1', '$' COLUMNS (a int omit quotes true on er...
+                                                             ^
+DETAIL:  Only ERROR, NULL, EMPTY, EMPTY ARRAY, EMPTY OBJECT, or DEFAULT expression is allowed in ON ERROR for column "a".
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int keep quotes empty on error));
+ERROR:  cannot cast jsonb array to type integer
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int keep quotes empty object on error));
+ERROR:  cannot cast jsonb object to type integer
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a jsonb keep quotes empty object on empty)); --ok
+ a  
+----
+ {}
+(1 row)
+
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists empty object on error));
+ERROR:  invalid ON ERROR behavior
+LINE 1: ...M JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists empty obje...
+                                                             ^
+DETAIL:  Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for column "a".
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists true on empty));
+ERROR:  invalid ON EMPTY behavior
+LINE 1: ...M JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists true on em...
+                                                             ^
+DETAIL:  ON EMPTY is not allowed for column "a".
diff --git a/src/test/regress/expected/sqljson_queryfuncs.out b/src/test/regress/expected/sqljson_queryfuncs.out
index 0c4b62b0..37bb5ad7 100644
--- a/src/test/regress/expected/sqljson_queryfuncs.out
+++ b/src/test/regress/expected/sqljson_queryfuncs.out
@@ -1360,12 +1360,13 @@ LINE 1: SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR);
                                           ^
 DETAIL:  Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for JSON_EXISTS().
 SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
-ERROR:  invalid ON EMPTY behavior
-LINE 1: SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
-                                         ^
-DETAIL:  Only ERROR, NULL, or DEFAULT <value> is allowed in ON ERROR for JSON_VALUE().
+ json_value 
+------------
+ 1
+(1 row)
+
 SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
 ERROR:  invalid ON ERROR behavior
 LINE 1: SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
                                          ^
-DETAIL:  Only ERROR, NULL, EMPTY [ ARRAY | OBJECT }, or DEFAULT <value> is allowed in ON ERROR for JSON_QUERY().
+DETAIL:  Only ERROR, NULL, EMPTY, EMPTY ARRAY, EMPTY OBJECT, or DEFAULT expression is allowed in ON ERROR for JSON_QUERY().
diff --git a/src/test/regress/sql/sqljson_jsontable.sql b/src/test/regress/sql/sqljson_jsontable.sql
index 1f81464c..3bb49ba9 100644
--- a/src/test/regress/sql/sqljson_jsontable.sql
+++ b/src/test/regress/sql/sqljson_jsontable.sql
@@ -521,3 +521,10 @@ DROP TABLE s;
 
 -- Test ON ERROR / EMPTY value validity for the function
 SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int) NULL ON ERROR);	-- fail
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int true on empty));
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int omit quotes true on error));
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int keep quotes empty on error));
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int keep quotes empty object on error));
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a jsonb keep quotes empty object on empty)); --ok
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists empty object on error));
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists true on empty));
diff --git a/src/test/regress/sql/sqljson_queryfuncs.sql b/src/test/regress/sql/sqljson_queryfuncs.sql
index 4586fdb8..0f7b6ec7 100644
--- a/src/test/regress/sql/sqljson_queryfuncs.sql
+++ b/src/test/regress/sql/sqljson_queryfuncs.sql
@@ -462,5 +462,5 @@ SELECT JSON_QUERY(jsonb 'null', '$xyz' PASSING 1 AS xyz);
 
 -- Test ON ERROR / EMPTY value validity for the function; all fail.
 SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR);
-SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
+SELECT JSON_VALUE(jsonb '1', '$' UNKNOWN ON ERROR);
 SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
-- 
2.34.1

#13Amit Langote
amitlangote09@gmail.com
In reply to: jian he (#12)
2 attachment(s)
Re: SQL/JSON query functions context_item doc entry and type requirement

Hi,

On Sat, Jun 22, 2024 at 6:39 PM jian he <jian.universality@gmail.com> wrote:

On Fri, Jun 21, 2024 at 8:18 PM Amit Langote <amitlangote09@gmail.com> wrote:

JSON_VALUE on error, on empty semantics should be the same as json_query.
like:
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON ERROR ])

examples:
select JSON_value(jsonb '[]' , '$' empty array on error);
select JSON_value(jsonb '[]' , '$' empty object on error);

Again the documented behavior seems to make sense though and the ability to specify empty in the value function seems like a bug. If you really want an empty array or object you do have access to default. The reason json_query provides for an empty array/object is that it is already expecting to produce an array (object seems a little odd).

I agree our docs and code do not match which needs to be fixed, ideally in the direction of the standard which I'm guessing our documentation is based off of. But let's not go off of my guess.

Oops, that is indeed not great and, yes, the problem is code not
matching the documentation, the latter of which is "correct".

Basically, the grammar allows specifying any of the all possible ON
ERROR/EMPTY behavior values irrespective of the function, so parse
analysis should be catching and flagging an attempt to specify
incompatible value for a given function, which it does not.

I've attached a patch to fix that, which I'd like to push before
anything else we've been discussing.

+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid ON ERROR behavior"),
+ errdetail("Only ERROR, NULL, EMPTY [ ARRAY | OBJECT }, or DEFAULT
<value> is allowed in ON ERROR for JSON_QUERY()."),
+ parser_errposition(pstate, func->on_error->location));

`EMPTY [ ARRAY | OBJECT }` seems not correct,
maybe just EMPTY, EMPTY ARRAY, EMPTY OBJECT.
(apply to other places)

Or EMPTY [ ARRAY ], EMPTY OBJECT

`DEFAULT <value>`
`DEFAULT <expression>` or just `DEFAULT expression` would be more correct?
(apply to other places)

"DEFAULT expression" sounds good.

I think we should make json_query, json_value on empty, on error
behave the same way.
otherwise, it will have consistency issues for scalar jsonb.
for example, we should expect the following two queries to return the
same result?
SELECT * FROM JSON_query(jsonb '1', '$.a' returning jsonb empty on empty);
SELECT * FROM JSON_value(jsonb '1', '$.a' returning jsonb empty on empty);

Also the json_table function will call json_value or json_query,
make these two functions on error, on empty behavior the same can
reduce unintended complexity.

So based on your
patch(v1-0001-SQL-JSON-Disallow-incompatible-values-in-ON-ERROR.patch)
and the above points, I have made some changes, attached.
it will make json_value, json_query not allow {true | false | unknown
} on error, {true | false | unknown } on empty.
json_table error message deal the same way as
b4fad46b6bc8a9bf46ff689bcb1bd4edf8f267af

Here is an updated patch that I think takes care of these points.

BTW,
i found one JSON_TABLE document deficiency
[ { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT
expression } ON EMPTY ]
[ { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT
expression } ON ERROR ]

it should be

[ { ERROR | NULL | EMPTY { [ARRAY] | OBJECT } | DEFAULT
expression } ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ARRAY] | OBJECT } | DEFAULT
expression } ON ERROR ]

You're right. Fixed.

Also, I noticed that the grammar allows ON EMPTY in JSON_TABLE EXISTS
columns which is meaningless because JSON_EXISTS() doesn't have a
corresponding ON EMPTY clause. Fixed grammar to prevent that in the
attached 0002.

--
Thanks, Amit Langote

Attachments:

v2-0002-SQL-JSON-Prevent-ON-EMPTY-for-EXISTS-columns-in-J.patchapplication/octet-stream; name=v2-0002-SQL-JSON-Prevent-ON-EMPTY-for-EXISTS-columns-in-J.patchDownload
From 6cfe8076e63197294780fb1b935a98dd48d91572 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlan@postgresql.org>
Date: Wed, 26 Jun 2024 22:58:20 +0900
Subject: [PATCH v2 2/2] SQL/JSON: Prevent ON EMPTY for EXISTS columns in
 JSON_TABLE()

Fix the grammar production to that end.
---
 src/backend/parser/gram.y                       | 6 +++---
 src/test/regress/expected/sqljson_jsontable.out | 5 +++++
 src/test/regress/sql/sqljson_jsontable.sql      | 3 +++
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 4a4b47ca50..a043fd4c66 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -14265,7 +14265,7 @@ json_table_column_definition:
 				}
 			| ColId Typename
 				EXISTS json_table_column_path_clause_opt
-				json_behavior_clause_opt
+				json_on_error_clause_opt
 				{
 					JsonTableColumn *n = makeNode(JsonTableColumn);
 
@@ -14276,8 +14276,8 @@ json_table_column_definition:
 					n->wrapper = JSW_NONE;
 					n->quotes = JS_QUOTES_UNSPEC;
 					n->pathspec = (JsonTablePathSpec *) $4;
-					n->on_empty = (JsonBehavior *) linitial($5);
-					n->on_error = (JsonBehavior *) lsecond($5);
+					n->on_empty = NULL;
+					n->on_error = (JsonBehavior *) $5;
 					n->location = @1;
 					$$ = (Node *) n;
 				}
diff --git a/src/test/regress/expected/sqljson_jsontable.out b/src/test/regress/expected/sqljson_jsontable.out
index b1da088785..3a5dd51e30 100644
--- a/src/test/regress/expected/sqljson_jsontable.out
+++ b/src/test/regress/expected/sqljson_jsontable.out
@@ -1078,3 +1078,8 @@ ERROR:  invalid ON ERROR behavior for column "a"
 LINE 1: ...M JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists empty obje...
                                                              ^
 DETAIL:  Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for EXISTS columns.
+-- Prevent ON EMPTY specification on EXISTS columns
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists empty object on empty));
+ERROR:  syntax error at or near "empty"
+LINE 1: ...sonb '1', '$' COLUMNS (a int exists empty object on empty));
+                                                               ^
diff --git a/src/test/regress/sql/sqljson_jsontable.sql b/src/test/regress/sql/sqljson_jsontable.sql
index 4562f08aed..36eadd9648 100644
--- a/src/test/regress/sql/sqljson_jsontable.sql
+++ b/src/test/regress/sql/sqljson_jsontable.sql
@@ -524,3 +524,6 @@ SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int) NULL ON ERROR);	-- fail
 SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int true on empty));
 SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int omit quotes true on error));
 SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists empty object on error));
+
+-- Prevent ON EMPTY specification on EXISTS columns
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists empty object on empty));
-- 
2.43.0

v2-0001-SQL-JSON-Disallow-incompatible-values-in-ON-ERROR.patchapplication/octet-stream; name=v2-0001-SQL-JSON-Disallow-incompatible-values-in-ON-ERROR.patchDownload
From e4a7291c41019f8b8dcf49e851c7d7805057ed67 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlan@postgresql.org>
Date: Fri, 21 Jun 2024 20:55:31 +0900
Subject: [PATCH v2 1/2] SQL/JSON: Disallow incompatible values in ON
 ERROR/EMPTY

Currently, the grammar allows specifying any of the all supported
values in the ON ERROR and ON EMPTY clause for each SQL/JSON function
that supports those clauses.  But the semantics of each of those
function allows only a subset of the values for any given function.
So, check during parse analysis that the provided value is valid for
the given function and throw a syntax error if not.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxFgWGqpESSYzyJ6tSurr3vFYBSNEmCfkGyB_dMdptFnZQ%40mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   6 +-
 src/backend/parser/parse_expr.c               | 126 ++++++++++++++++--
 src/backend/parser/parse_jsontable.c          |   2 +-
 .../regress/expected/sqljson_jsontable.out    |  25 +++-
 .../regress/expected/sqljson_queryfuncs.out   |  16 +++
 src/test/regress/sql/sqljson_jsontable.sql    |   6 +
 src/test/regress/sql/sqljson_queryfuncs.sql   |   5 +
 7 files changed, 172 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2609269610..bc54fc2a40 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18929,7 +18929,7 @@ DETAIL:  Missing "]" after array dimensions.
 JSON_TABLE (
     <replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable> <optional> AS <replaceable>json_path_name</replaceable> </optional> <optional> PASSING { <replaceable>value</replaceable> AS <replaceable>varname</replaceable> } <optional>, ...</optional> </optional>
     COLUMNS ( <replaceable class="parameter">json_table_column</replaceable> <optional>, ...</optional> )
-    <optional> { <literal>ERROR</literal> | <literal>EMPTY</literal> } <literal>ON ERROR</literal> </optional>
+    <optional> { <literal>ERROR</literal> | <literal>EMPTY</literal> <optional>ARRAY</optional>} <literal>ON ERROR</literal> </optional>
 )
 
 <phrase>
@@ -18941,8 +18941,8 @@ where <replaceable class="parameter">json_table_column</replaceable> is:
         <optional> PATH <replaceable>path_expression</replaceable> </optional>
         <optional> { WITHOUT | WITH { CONDITIONAL | <optional>UNCONDITIONAL</optional> } } <optional> ARRAY </optional> WRAPPER </optional>
         <optional> { KEEP | OMIT } QUOTES <optional> ON SCALAR STRING </optional> </optional>
-        <optional> { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT <replaceable>expression</replaceable> } ON EMPTY </optional>
-        <optional> { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT <replaceable>expression</replaceable> } ON ERROR </optional>
+        <optional> { ERROR | NULL | EMPTY { <optional>ARRAY</optional> | OBJECT } | DEFAULT <replaceable>expression</replaceable> } ON EMPTY </optional>
+        <optional> { ERROR | NULL | EMPTY { <optional>ARRAY</optional> | OBJECT } | DEFAULT <replaceable>expression</replaceable> } ON ERROR </optional>
   | <replaceable>name</replaceable> <replaceable>type</replaceable> EXISTS <optional> PATH <replaceable>path_expression</replaceable> </optional>
         <optional> { ERROR | TRUE | FALSE | UNKNOWN } ON ERROR </optional>
   | NESTED <optional> PATH </optional> <replaceable>path_expression</replaceable> <optional> AS <replaceable>json_path_name</replaceable> </optional> COLUMNS ( <replaceable>json_table_column</replaceable> <optional>, ...</optional> )
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 699d0be751..5b576be172 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -4321,14 +4321,124 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
 	}
 
 	/* OMIT QUOTES is meaningless when strings are wrapped. */
-	if (func->op == JSON_QUERY_OP &&
-		func->quotes == JS_QUOTES_OMIT &&
-		(func->wrapper == JSW_CONDITIONAL ||
-		 func->wrapper == JSW_UNCONDITIONAL))
-		ereport(ERROR,
-				errcode(ERRCODE_SYNTAX_ERROR),
-				errmsg("SQL/JSON QUOTES behavior must not be specified when WITH WRAPPER is used"),
-				parser_errposition(pstate, func->location));
+	if (func->op == JSON_QUERY_OP)
+	{
+		if (func->quotes == JS_QUOTES_OMIT &&
+			(func->wrapper == JSW_CONDITIONAL ||
+			 func->wrapper == JSW_UNCONDITIONAL))
+			ereport(ERROR,
+					errcode(ERRCODE_SYNTAX_ERROR),
+					errmsg("SQL/JSON QUOTES behavior must not be specified when WITH WRAPPER is used"),
+					parser_errposition(pstate, func->location));
+		if (func->on_empty != NULL &&
+			func->on_empty->btype != JSON_BEHAVIOR_ERROR &&
+			func->on_empty->btype != JSON_BEHAVIOR_NULL &&
+			func->on_empty->btype != JSON_BEHAVIOR_EMPTY &&
+			func->on_empty->btype != JSON_BEHAVIOR_EMPTY_ARRAY &&
+			func->on_empty->btype != JSON_BEHAVIOR_EMPTY_OBJECT &&
+			func->on_empty->btype != JSON_BEHAVIOR_DEFAULT)
+		{
+			if (func->column_name == NULL)
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON EMPTY behavior"),
+						errdetail("Only ERROR, NULL, EMPTY [ ARRAY ], EMPTY OBJECT, or DEFAULT expression is allowed in ON EMPTY for JSON_QUERY()."),
+						parser_errposition(pstate, func->on_empty->location));
+			else
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON EMPTY behavior for column \"%s\"",
+								func->column_name),
+						errdetail("Only ERROR, NULL, EMPTY [ ARRAY ], EMPTY OBJECT, or DEFAULT expression is allowed in ON EMPTY for formatted columns."),
+						parser_errposition(pstate, func->on_empty->location));
+		}
+		if (func->on_error != NULL &&
+			func->on_error->btype != JSON_BEHAVIOR_ERROR &&
+			func->on_error->btype != JSON_BEHAVIOR_NULL &&
+			func->on_error->btype != JSON_BEHAVIOR_EMPTY &&
+			func->on_error->btype != JSON_BEHAVIOR_EMPTY_ARRAY &&
+			func->on_error->btype != JSON_BEHAVIOR_EMPTY_OBJECT &&
+			func->on_error->btype != JSON_BEHAVIOR_DEFAULT)
+		{
+			if (func->column_name == NULL)
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON ERROR behavior"),
+						errdetail("Only ERROR, NULL, EMPTY [ ARRAY ], EMPTY OBJECT, or DEFAULT expression is allowed in ON ERROR for JSON_QUERY()."),
+						parser_errposition(pstate, func->on_error->location));
+			else
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON ERROR behavior for column \"%s\"",
+								func->column_name),
+						errdetail("Only ERROR, NULL, EMPTY [ ARRAY ], EMPTY OBJECT, or DEFAULT expression is allowed in ON ERROR for formatted columns."),
+						parser_errposition(pstate, func->on_error->location));
+		}
+	}
+
+	/* Check that ON ERROR/EMPTY behavior values are valid for the function. */
+	if (func->op == JSON_EXISTS_OP &&
+		func->on_error != NULL &&
+		func->on_error->btype != JSON_BEHAVIOR_ERROR &&
+		func->on_error->btype != JSON_BEHAVIOR_TRUE &&
+		func->on_error->btype != JSON_BEHAVIOR_FALSE &&
+		func->on_error->btype != JSON_BEHAVIOR_UNKNOWN)
+	{
+		if (func->column_name == NULL)
+			ereport(ERROR,
+					errcode(ERRCODE_SYNTAX_ERROR),
+					errmsg("invalid ON ERROR behavior"),
+					errdetail("Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for JSON_EXISTS()."),
+					parser_errposition(pstate, func->on_error->location));
+		else
+			ereport(ERROR,
+					errcode(ERRCODE_SYNTAX_ERROR),
+					errmsg("invalid ON ERROR behavior for column \"%s\"",
+							func->column_name),
+					errdetail("Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for EXISTS columns."),
+					parser_errposition(pstate, func->on_error->location));
+	}
+	if (func->op == JSON_VALUE_OP)
+	{
+		if (func->on_empty != NULL &&
+			func->on_empty->btype != JSON_BEHAVIOR_ERROR &&
+			func->on_empty->btype != JSON_BEHAVIOR_NULL &&
+			func->on_empty->btype != JSON_BEHAVIOR_DEFAULT)
+		{
+			if (func->column_name == NULL)
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON EMPTY behavior"),
+						errdetail("Only ERROR, NULL, or DEFAULT expression is allowed in ON EMPTY for JSON_VALUE()."),
+						parser_errposition(pstate, func->on_empty->location));
+			else
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON EMPTY behavior for column \"%s\"",
+								func->column_name),
+						errdetail("Only ERROR, NULL, or DEFAULT expression is allowed in ON EMPTY for scalar columns."),
+						parser_errposition(pstate, func->on_empty->location));
+		}
+		if (func->on_error != NULL &&
+			func->on_error->btype != JSON_BEHAVIOR_ERROR &&
+			 func->on_error->btype != JSON_BEHAVIOR_NULL &&
+			 func->on_error->btype != JSON_BEHAVIOR_DEFAULT)
+		{
+			if (func->column_name == NULL)
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON ERROR behavior"),
+						errdetail("Only ERROR, NULL, or DEFAULT expression is allowed in ON ERROR for JSON_VALUE()."),
+						parser_errposition(pstate, func->on_error->location));
+			else
+				ereport(ERROR,
+						errcode(ERRCODE_SYNTAX_ERROR),
+						errmsg("invalid ON ERROR behavior for column \"%s\"",
+								func->column_name),
+						errdetail("Only ERROR, NULL, or DEFAULT expression is allowed in ON ERROR for scalar columns."),
+						parser_errposition(pstate, func->on_error->location));
+		}
+	}
 
 	jsexpr = makeNode(JsonExpr);
 	jsexpr->location = func->location;
diff --git a/src/backend/parser/parse_jsontable.c b/src/backend/parser/parse_jsontable.c
index b2519c2f32..8a72e498e8 100644
--- a/src/backend/parser/parse_jsontable.c
+++ b/src/backend/parser/parse_jsontable.c
@@ -92,7 +92,7 @@ transformJsonTable(ParseState *pstate, JsonTable *jt)
 		ereport(ERROR,
 				errcode(ERRCODE_SYNTAX_ERROR),
 				errmsg("invalid ON ERROR behavior"),
-				errdetail("Only EMPTY or ERROR is allowed in the top-level ON ERROR clause."),
+				errdetail("Only EMPTY [ ARRAY ] or ERROR is allowed in the top-level ON ERROR clause."),
 				parser_errposition(pstate, jt->on_error->location));
 
 	cxt.pathNameId = 0;
diff --git a/src/test/regress/expected/sqljson_jsontable.out b/src/test/regress/expected/sqljson_jsontable.out
index cf43442891..b1da088785 100644
--- a/src/test/regress/expected/sqljson_jsontable.out
+++ b/src/test/regress/expected/sqljson_jsontable.out
@@ -9,12 +9,12 @@ SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') DEFAULT 1
 ERROR:  invalid ON ERROR behavior
 LINE 1: ...BLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') DEFAULT 1 ...
                                                              ^
-DETAIL:  Only EMPTY or ERROR is allowed in the top-level ON ERROR clause.
+DETAIL:  Only EMPTY [ ARRAY ] or ERROR is allowed in the top-level ON ERROR clause.
 SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') NULL ON ERROR);
 ERROR:  invalid ON ERROR behavior
 LINE 1: ...BLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') NULL ON ER...
                                                              ^
-DETAIL:  Only EMPTY or ERROR is allowed in the top-level ON ERROR clause.
+DETAIL:  Only EMPTY [ ARRAY ] or ERROR is allowed in the top-level ON ERROR clause.
 SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 int PATH '$') EMPTY ON ERROR);
  js2 
 -----
@@ -1057,3 +1057,24 @@ CREATE OR REPLACE VIEW public.jsonb_table_view7 AS
         ) sub
 DROP VIEW jsonb_table_view7;
 DROP TABLE s;
+-- Test ON ERROR / EMPTY value validity for the function and column types; all fail
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int) NULL ON ERROR);	-- fail
+ERROR:  invalid ON ERROR behavior
+LINE 1: ... * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int) NULL ON ER...
+                                                             ^
+DETAIL:  Only EMPTY [ ARRAY ] or ERROR is allowed in the top-level ON ERROR clause.
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int true on empty));
+ERROR:  invalid ON EMPTY behavior for column "a"
+LINE 1: ...T * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int true on em...
+                                                             ^
+DETAIL:  Only ERROR, NULL, or DEFAULT expression is allowed in ON EMPTY for scalar columns.
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int omit quotes true on error));
+ERROR:  invalid ON ERROR behavior for column "a"
+LINE 1: ...N_TABLE(jsonb '1', '$' COLUMNS (a int omit quotes true on er...
+                                                             ^
+DETAIL:  Only ERROR, NULL, EMPTY [ ARRAY ], EMPTY OBJECT, or DEFAULT expression is allowed in ON ERROR for formatted columns.
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists empty object on error));
+ERROR:  invalid ON ERROR behavior for column "a"
+LINE 1: ...M JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists empty obje...
+                                                             ^
+DETAIL:  Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for EXISTS columns.
diff --git a/src/test/regress/expected/sqljson_queryfuncs.out b/src/test/regress/expected/sqljson_queryfuncs.out
index ea136eefa7..a8a2ea7204 100644
--- a/src/test/regress/expected/sqljson_queryfuncs.out
+++ b/src/test/regress/expected/sqljson_queryfuncs.out
@@ -1389,3 +1389,19 @@ SELECT JSON_VALUE(jsonb '123', '$' RETURNING queryfuncs_char2_chk DEFAULT 1 ON E
 (1 row)
 
 DROP DOMAIN queryfuncs_char2, queryfuncs_char2_chk;
+-- Test ON ERROR / EMPTY value validity for the function; all fail.
+SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR);
+ERROR:  invalid ON ERROR behavior
+LINE 1: SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR);
+                                          ^
+DETAIL:  Only ERROR, TRUE, FALSE, or UNKNOWN is allowed in ON ERROR for JSON_EXISTS().
+SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
+ERROR:  invalid ON ERROR behavior
+LINE 1: SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
+                                         ^
+DETAIL:  Only ERROR, NULL, or DEFAULT expression is allowed in ON ERROR for JSON_VALUE().
+SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
+ERROR:  invalid ON ERROR behavior
+LINE 1: SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
+                                         ^
+DETAIL:  Only ERROR, NULL, EMPTY [ ARRAY ], EMPTY OBJECT, or DEFAULT expression is allowed in ON ERROR for JSON_QUERY().
diff --git a/src/test/regress/sql/sqljson_jsontable.sql b/src/test/regress/sql/sqljson_jsontable.sql
index 70bff37a6d..4562f08aed 100644
--- a/src/test/regress/sql/sqljson_jsontable.sql
+++ b/src/test/regress/sql/sqljson_jsontable.sql
@@ -518,3 +518,9 @@ SELECT sub.* FROM s,
 \sv jsonb_table_view7
 DROP VIEW jsonb_table_view7;
 DROP TABLE s;
+
+-- Test ON ERROR / EMPTY value validity for the function and column types; all fail
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int) NULL ON ERROR);	-- fail
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int true on empty));
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int omit quotes true on error));
+SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int exists empty object on error));
diff --git a/src/test/regress/sql/sqljson_queryfuncs.sql b/src/test/regress/sql/sqljson_queryfuncs.sql
index 0d6482f384..de2c043af0 100644
--- a/src/test/regress/sql/sqljson_queryfuncs.sql
+++ b/src/test/regress/sql/sqljson_queryfuncs.sql
@@ -474,3 +474,8 @@ SELECT JSON_VALUE(jsonb '123', '$' RETURNING queryfuncs_char2 DEFAULT 1 ON ERROR
 SELECT JSON_VALUE(jsonb '123', '$' RETURNING queryfuncs_char2_chk ERROR ON ERROR);
 SELECT JSON_VALUE(jsonb '123', '$' RETURNING queryfuncs_char2_chk DEFAULT 1 ON ERROR);
 DROP DOMAIN queryfuncs_char2, queryfuncs_char2_chk;
+
+-- Test ON ERROR / EMPTY value validity for the function; all fail.
+SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR);
+SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
+SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
-- 
2.43.0

#14Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#13)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Thu, Jun 27, 2024 at 9:01 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Sat, Jun 22, 2024 at 6:39 PM jian he <jian.universality@gmail.com> wrote:

On Fri, Jun 21, 2024 at 8:18 PM Amit Langote <amitlangote09@gmail.com> wrote:

JSON_VALUE on error, on empty semantics should be the same as json_query.
like:
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON ERROR ])

examples:
select JSON_value(jsonb '[]' , '$' empty array on error);
select JSON_value(jsonb '[]' , '$' empty object on error);

Again the documented behavior seems to make sense though and the ability to specify empty in the value function seems like a bug. If you really want an empty array or object you do have access to default. The reason json_query provides for an empty array/object is that it is already expecting to produce an array (object seems a little odd).

I agree our docs and code do not match which needs to be fixed, ideally in the direction of the standard which I'm guessing our documentation is based off of. But let's not go off of my guess.

Oops, that is indeed not great and, yes, the problem is code not
matching the documentation, the latter of which is "correct".

Basically, the grammar allows specifying any of the all possible ON
ERROR/EMPTY behavior values irrespective of the function, so parse
analysis should be catching and flagging an attempt to specify
incompatible value for a given function, which it does not.

I've attached a patch to fix that, which I'd like to push before
anything else we've been discussing.

+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid ON ERROR behavior"),
+ errdetail("Only ERROR, NULL, EMPTY [ ARRAY | OBJECT }, or DEFAULT
<value> is allowed in ON ERROR for JSON_QUERY()."),
+ parser_errposition(pstate, func->on_error->location));

`EMPTY [ ARRAY | OBJECT }` seems not correct,
maybe just EMPTY, EMPTY ARRAY, EMPTY OBJECT.
(apply to other places)

Or EMPTY [ ARRAY ], EMPTY OBJECT

`DEFAULT <value>`
`DEFAULT <expression>` or just `DEFAULT expression` would be more correct?
(apply to other places)

"DEFAULT expression" sounds good.

I think we should make json_query, json_value on empty, on error
behave the same way.
otherwise, it will have consistency issues for scalar jsonb.
for example, we should expect the following two queries to return the
same result?
SELECT * FROM JSON_query(jsonb '1', '$.a' returning jsonb empty on empty);
SELECT * FROM JSON_value(jsonb '1', '$.a' returning jsonb empty on empty);

Also the json_table function will call json_value or json_query,
make these two functions on error, on empty behavior the same can
reduce unintended complexity.

So based on your
patch(v1-0001-SQL-JSON-Disallow-incompatible-values-in-ON-ERROR.patch)
and the above points, I have made some changes, attached.
it will make json_value, json_query not allow {true | false | unknown
} on error, {true | false | unknown } on empty.
json_table error message deal the same way as
b4fad46b6bc8a9bf46ff689bcb1bd4edf8f267af

Here is an updated patch that I think takes care of these points.

BTW,
i found one JSON_TABLE document deficiency
[ { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT
expression } ON EMPTY ]
[ { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT
expression } ON ERROR ]

it should be

[ { ERROR | NULL | EMPTY { [ARRAY] | OBJECT } | DEFAULT
expression } ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ARRAY] | OBJECT } | DEFAULT
expression } ON ERROR ]

You're right. Fixed.

Also, I noticed that the grammar allows ON EMPTY in JSON_TABLE EXISTS
columns which is meaningless because JSON_EXISTS() doesn't have a
corresponding ON EMPTY clause. Fixed grammar to prevent that in the
attached 0002.

I've pushed this for now to close out the open item.

I know there's some documentation improvement work left to do [1]/messages/by-id/CAKFQuwbYBvUZasGj_ZnfXhC2kk4AT=epwGkNd2=RMMVXkfTNMQ@mail.gmail.com,
which I'll try to find some time for next week.

--
Thanks, Amit Langote

[1]: /messages/by-id/CAKFQuwbYBvUZasGj_ZnfXhC2kk4AT=epwGkNd2=RMMVXkfTNMQ@mail.gmail.com

#15David G. Johnston
david.g.johnston@gmail.com
In reply to: David G. Johnston (#9)
Re: SQL/JSON query functions context_item doc entry and type requirement

On Thursday, June 20, 2024, David G. Johnston <david.g.johnston@gmail.com>
wrote:

As for table 9.16.3 - it is unwieldy already. Lets try and make the

core syntax shorter, not longer. We already have precedence in the
subsequent json_table section - give each major clause item a name then
below the table define the syntax and meaning for those names. Unlike in
that section - which probably should be modified too - context_item should
have its own description line.

I had posted a patch a little while ago at [1] to render the syntax a
bit differently with each function getting its own syntax synopsis.
Resending it here; have addressed Jian He's comments.

--

I was thinking more like:

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c324906b22..b9d157663a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18692,8 +18692,10 @@ $.* ? (@ like_regex "^\\d+$")
<entry role="func_table_entry"><para role="func_signature">
<indexterm><primary>json_exists</primary></indexterm>
<function>json_exists</function> (
-        <replaceable>context_item</replaceable>,
<replaceable>path_expression</replaceable> <optional>
<literal>PASSING</literal> { <replaceable>value</replaceable>
<literal>AS</literal> <replaceable>varname</replaceable> } <optional>,
...</optional></optional>
-        <optional> { <literal>TRUE</literal> | <literal>FALSE</literal>
|<literal> UNKNOWN</literal> | <literal>ERROR</literal> } <literal>ON
ERROR</literal> </optional>)
+        <replaceable>context_item</replaceable>,
+        <replaceable>path_expression</replaceable>
+        <optional>variable_definitions</optional>
+        <optional>on_error_boolean</optional>)
</para> empty semantics should be the same as json_query.

The full first draft patch for this is here:

/messages/by-id/CAKFQuwZNxNHuPk44zDF7z8qZec1Aof10aA9tWvBU5CMhEKEd8A@mail.gmail.com

David J.