Patch: Improve Boolean Predicate JSON Path Docs

Started by David E. Wheelerabout 2 years ago37 messages
#1David E. Wheeler
david@justatheory.com
1 attachment(s)

Hackers,

Following up from a suggestion from Tom Lane[1]/messages/by-id/1229727.1680535592@sss.pgh.pa.us to improve the documentation of boolean predicate JSON path expressions, please find enclosed a draft patch to do so. It does three things:

1. Converts all of the example path queries to use jsonb_path_query() and show the results, to make it clearer what the behaviors are.

2. Replaces the list of deviations from the standards with a new subsection, with each deviation in its own sub-subsection. The regex section is unchanged, but I’ve greatly expanded the boolean expression JSON path section with examples comparing standard filter expressions and nonstandard boolean predicates. I’ve also added an exhortation not use boolean expressions with @? or standard path expressions with @@.

3. While converting the modes section to use jsonb_path_query() and show the results, I also added an example of strict mode returning an error.

Follow-ups I’d like to make:

1. Expand the modes section to show how the types of results can vary depending on the mode, thanks to the flattening. Examples:

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a ?(@[*] > 2)');
jsonb_path_query
------------------
3
4
5
(3 rows)

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.a ?(@[*] > 2)');
jsonb_path_query
------------------
[1, 2, 3, 4, 5]

2. Improve the descriptions and examples for @?/jsonb_path_exists() and @@/jsonb_path_match().

Best,

David

[1]: /messages/by-id/1229727.1680535592@sss.pgh.pa.us

Attachments:

jsonpath-pred-docs.patchapplication/octet-stream; name=jsonpath-pred-docs.patch; x-unix-mode=0644Download
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index affd1254bb..295f8ca5c9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17205,7 +17205,7 @@ array w/o UK? | t
    For example, suppose you have some JSON data from a GPS tracker that you
    would like to parse, such as:
 <programlisting>
-{
+ \set json '{
   "track": {
     "segments": [
       {
@@ -17220,7 +17220,7 @@ array w/o UK? | t
       }
     ]
   }
-}
+}'
 </programlisting>
   </para>
 
@@ -17229,7 +17229,10 @@ array w/o UK? | t
    <literal>.<replaceable>key</replaceable></literal> accessor
    operator to descend through surrounding JSON objects:
 <programlisting>
-$.track.segments
+select jsonb_path_query(:'json'::jsonb, '$.track.segments');
+                                                                         jsonb_path_query
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
 </programlisting>
   </para>
 
@@ -17239,7 +17242,11 @@ $.track.segments
    the following path will return the location coordinates for all
    the available track segments:
 <programlisting>
-$.track.segments[*].location
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*].location');
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
 </programlisting>
   </para>
 
@@ -17248,7 +17255,10 @@ $.track.segments[*].location
    specify the corresponding subscript in the <literal>[]</literal>
    accessor operator. Recall that JSON array indexes are 0-relative:
 <programlisting>
-$.track.segments[0].location
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments[0].location');
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
 </programlisting>
   </para>
 
@@ -17259,7 +17269,10 @@ $.track.segments[0].location
    Each method name must be preceded by a dot. For example,
    you can get the size of an array:
 <programlisting>
-$.track.segments.size()
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments.size()');
+ jsonb_path_query
+------------------
+ 2
 </programlisting>
    More examples of using <type>jsonpath</type> operators
    and methods within path expressions appear below in
@@ -17302,7 +17315,10 @@ $.track.segments.size()
    For example, suppose you would like to retrieve all heart rate values higher
    than 130. You can achieve this using the following expression:
 <programlisting>
-$.track.segments[*].HR ? (@ &gt; 130)
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*].HR ? (@ &gt; 130)');
+ jsonb_path_query
+------------------
+ 135
 </programlisting>
   </para>
 
@@ -17312,7 +17328,10 @@ $.track.segments[*].HR ? (@ &gt; 130)
    filter expression is applied to the previous step, and the path used
    in the condition is different:
 <programlisting>
-$.track.segments[*] ? (@.HR &gt; 130)."start time"
+ select jsonb_path_query(:'json'::jsonb, '$.track.segments[*] ? (@.HR &gt; 130)."start time"');
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
 </programlisting>
   </para>
 
@@ -17321,7 +17340,10 @@ $.track.segments[*] ? (@.HR &gt; 130)."start time"
    example, the following expression selects start times of all segments that
    contain locations with relevant coordinates and high heart rate values:
 <programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"');
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
 </programlisting>
   </para>
 
@@ -17330,46 +17352,81 @@ $.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
    The following example first filters all segments by location, and then
    returns high heart rate values for these segments, if available:
 <programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)
+select jsonb_path_query(:'json'::jsonb, $.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)');
+ jsonb_path_query
+------------------
+ 135
 </programlisting>
   </para>
 
   <para>
    You can also nest filter expressions within each other:
 <programlisting>
-$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()
+select jsonb_path_query(:'json'::jsonb, $.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()');
+ jsonb_path_query
+------------------
+ 2
 </programlisting>
    This expression returns the size of the track if it contains any
    segments with high heart rate values, or an empty sequence otherwise.
   </para>
 
-  <para>
-   <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
-   language has the following deviations from the SQL/JSON standard:
-  </para>
+  <sect3 id="devations-from-the-standard">
+  <title>Devaiations from the SQL Standard</title>
+   <para>
+    <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
+    language has the following deviations from the SQL/JSON standard:
+   </para>
 
-  <itemizedlist>
-   <listitem>
+   <sect4 id="boolean-predicate-path-expressions">
+   <title>Boolean Predicate Path Expressions</title>
     <para>
-     A path expression can be a Boolean predicate, although the SQL/JSON
-     standard allows predicates only in filters.  This is necessary for
-     implementation of the <literal>@@</literal> operator. For example,
-     the following <type>jsonpath</type> expression is valid in
-     <productname>PostgreSQL</productname>:
+     As an extension to the SQL standard, a <productname>PostgreSQL</productname>
+     path expression can be a Boolean predicate, whereas the SQL standard allows
+     predicates only in filters. Where SQL standard path expressions return the
+     relevant contents of the queried JSON value, predicate path expressions
+     return the three-value three-valued result of the predicate:
+     <literal>true</literal>, <literal>false</literal>, or
+     <literal>unknown</literal>. Compare this filter <type>jsonpath</type>
+     exression:
 <programlisting>
-$.track.segments[*].HR &lt; 70
+select jsonb_path_query(:'json'::jsonb, '$.track.segments ?(@[*].HR &gt; 130)');
+                                jsonb_path_query
+---------------------------------------------------------------------------------
+ {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}
 </programlisting>
-    </para>
-   </listitem>
+     To a predicate expression, which returns <literal>true</literal>
+<programlisting>
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*].HR &gt; 130');
+ jsonb_path_query
+------------------
+ true
+</programlisting>
+     </para>
 
-   <listitem>
-    <para>
-     There are minor differences in the interpretation of regular
-     expression patterns used in <literal>like_regex</literal> filters, as
-     described in <xref linkend="jsonpath-regular-expressions"/>.
-    </para>
-   </listitem>
-  </itemizedlist>
+     <para>
+      Predicate-only path expressions are necessary for implementation of the
+      <literal>@@</literal> operator (and the
+      <function>jsonb_path_match</function> function), and should not be used
+      with the <literal>@?</literal> operator (or
+      <function>jsonb_path_exists</function> function).
+     </para>
+
+     <para>
+      Conversely, non-predicate <type>jsonpath</type> expressions should not be
+      used with the <literal>@@</literal> operator (or the
+      <function>jsonb_path_match</function> function).
+     </para>
+    </sect4>
+
+    <sect4 id="jsonpath-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>
 
    <sect3 id="strict-and-lax-modes">
    <title>Strict and Lax Modes</title>
@@ -17431,18 +17488,30 @@ $.track.segments[*].HR &lt; 70
     abstract from the fact that it stores an array of segments
     when using the lax mode:
 <programlisting>
-lax $.track.segments.location
+ select jsonb_path_query(:'json'::jsonb, 'lax $.track.segments.location');
+ jsonb_path_query  
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
 </programlisting>
    </para>
 
    <para>
-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of
     the queried JSON document to return an SQL/JSON item, so using this
-    path expression will cause an error. To get the same result as in
-    the lax mode, you have to explicitly unwrap the
+    path expression will cause an error:
+<programlisting>
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments.location');
+ERROR:  jsonpath member accessor can only be applied to an object
+</programlisting>    
+    To get the same result as in the lax mode, you have to explicitly unwrap the
     <literal>segments</literal> array:
 <programlisting>
-strict $.track.segments[*].location
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments[*].location');
+ jsonb_path_query  
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
 </programlisting>
    </para>
 
@@ -17451,7 +17520,13 @@ strict $.track.segments[*].location
     when using the lax mode. For instance, the following query selects every
     <literal>HR</literal> value twice:
 <programlisting>
-lax $.**.HR
+select jsonb_path_query(:'json'::jsonb, 'lax $.**.HR');
+ jsonb_path_query 
+------------------
+ 73
+ 135
+ 73
+ 135
 </programlisting>
     This happens because the <literal>.**</literal> accessor selects both
     the <literal>segments</literal> array and each of its elements, while
@@ -17460,7 +17535,11 @@ lax $.**.HR
     the <literal>.**</literal> accessor only in the strict mode. The
     following query selects each <literal>HR</literal> value just once:
 <programlisting>
-strict $.**.HR
+select jsonb_path_query(:'json'::jsonb, 'strict $.**.HR');
+ jsonb_path_query 
+------------------
+ 73
+ 135
 </programlisting>
    </para>
 
#2David E. Wheeler
david@justatheory.com
In reply to: David E. Wheeler (#1)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 14, 2023, at 16:40, David E. Wheeler <david@justatheory.com> wrote:

Following up from a suggestion from Tom Lane[1] to improve the documentation of boolean predicate JSON path expressions, please find enclosed a draft patch to do so.

And now I see I can’t spell “Deviations”. Will fix along with any other requested revisions. GitHub diff here if you’re into that sort of thing:

https://github.com/postgres/postgres/compare/master...theory:postgres:jsonpath-pred-docs

Best,

David

#3Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#1)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-14 22:40 +0200, David E. Wheeler write:

Following up from a suggestion from Tom Lane[1] to improve the
documentation of boolean predicate JSON path expressions, please find
enclosed a draft patch to do so.

Thanks for putting this together. See my review at the end.

It does three things:

1. Converts all of the example path queries to use jsonb_path_query()
and show the results, to make it clearer what the behaviors are.

Nice. This really does help to make some sense of it. I checked all
queries and they do work out except for two queries where the path
expression string is not properly quoted (but the intended output is
still correct).

2. Replaces the list of deviations from the standards with a new
subsection, with each deviation in its own sub-subsection. The regex
section is unchanged, but I’ve greatly expanded the boolean expression
JSON path section with examples comparing standard filter expressions
and nonstandard boolean predicates. I’ve also added an exhortation not
use boolean expressions with @? or standard path expressions with @@.

LGTM.

3. While converting the modes section to use jsonb_path_query() and
show the results, I also added an example of strict mode returning an
error.

Follow-ups I’d like to make:

1. Expand the modes section to show how the types of results can vary
depending on the mode, thanks to the flattening. Examples:

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a ?(@[*] > 2)');
jsonb_path_query
------------------
3
4
5
(3 rows)

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.a ?(@[*] > 2)');
jsonb_path_query
------------------
[1, 2, 3, 4, 5]

2. Improve the descriptions and examples for @?/jsonb_path_exists()
and @@/jsonb_path_match().

+1

[1] /messages/by-id/1229727.1680535592@sss.pgh.pa.us

My review:

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index affd1254bb..295f8ca5c9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17205,7 +17205,7 @@ array w/o UK? | t
For example, suppose you have some JSON data from a GPS tracker that you
would like to parse, such as:
<programlisting>
-{
+ \set json '{

Perhaps make it explicit that the reader must run this in psql in order
to use \set and :'json' in the ensuing samples? Some of the existing
examples already use psql output but they do not rely on any psql
features.

"track": {
"segments": [
{
@@ -17220,7 +17220,7 @@ array w/o UK? | t
}
]
}
-}
+}'
</programlisting>
</para>

@@ -17229,7 +17229,10 @@ array w/o UK? | t
<literal>.<replaceable>key</replaceable></literal> accessor
operator to descend through surrounding JSON objects:
<programlisting>
-$.track.segments
+select jsonb_path_query(:'json'::jsonb, '$.track.segments');
+                                                                         jsonb_path_query
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
</programlisting>

This should use <screen>, <userinput>, and <computeroutput> if it shows
a psql session, e.g.:

<screen>
<userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
<computeroutput>
jsonb_path_query
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
</computeroutput>
</screen>

Also the cast to jsonb is not necessary and only adds clutter IMO.

</para>

@@ -17239,7 +17242,11 @@ $.track.segments
the following path will return the location coordinates for all
the available track segments:
<programlisting>
-$.track.segments[*].location
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*].location');
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
</programlisting>
</para>
@@ -17248,7 +17255,10 @@ $.track.segments[*].location
specify the corresponding subscript in the <literal>[]</literal>
accessor operator. Recall that JSON array indexes are 0-relative:
<programlisting>
-$.track.segments[0].location
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments[0].location');
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
</programlisting>
</para>
@@ -17259,7 +17269,10 @@ $.track.segments[0].location
Each method name must be preceded by a dot. For example,
you can get the size of an array:
<programlisting>
-$.track.segments.size()
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments.size()');
+ jsonb_path_query
+------------------
+ 2
</programlisting>
More examples of using <type>jsonpath</type> operators
and methods within path expressions appear below in
@@ -17302,7 +17315,10 @@ $.track.segments.size()
For example, suppose you would like to retrieve all heart rate values higher
than 130. You can achieve this using the following expression:
<programlisting>
-$.track.segments[*].HR ? (@ &gt; 130)
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*].HR ? (@ &gt; 130)');
+ jsonb_path_query
+------------------
+ 135
</programlisting>
</para>
@@ -17312,7 +17328,10 @@ $.track.segments[*].HR ? (@ &gt; 130)
filter expression is applied to the previous step, and the path used
in the condition is different:
<programlisting>
-$.track.segments[*] ? (@.HR &gt; 130)."start time"
+ select jsonb_path_query(:'json'::jsonb, '$.track.segments[*] ? (@.HR &gt; 130)."start time"');
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
</programlisting>
</para>
@@ -17321,7 +17340,10 @@ $.track.segments[*] ? (@.HR &gt; 130)."start time"
example, the following expression selects start times of all segments that
contain locations with relevant coordinates and high heart rate values:
<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"');
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
</programlisting>
</para>
@@ -17330,46 +17352,81 @@ $.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
The following example first filters all segments by location, and then
returns high heart rate values for these segments, if available:
<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)
+select jsonb_path_query(:'json'::jsonb, $.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)');

The opening quote is missing from the jsonpath literal.

+ jsonb_path_query
+------------------
+ 135
</programlisting>
</para>
<para>
You can also nest filter expressions within each other:
<programlisting>
-$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()
+select jsonb_path_query(:'json'::jsonb, $.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()');

Missing opening quote here as well.

+ jsonb_path_query
+------------------
+ 2
</programlisting>
This expression returns the size of the track if it contains any
segments with high heart rate values, or an empty sequence otherwise.
</para>

-  <para>
-   <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
-   language has the following deviations from the SQL/JSON standard:
-  </para>
+  <sect3 id="devations-from-the-standard">
+  <title>Devaiations from the SQL Standard</title>

Typo in "deviations" (section ID and title).

+   <para>
+    <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
+    language has the following deviations from the SQL/JSON standard:

The sentence should and in a period when this para is no longer followed
by an item list.

+ </para>

-  <itemizedlist>
-   <listitem>
+   <sect4 id="boolean-predicate-path-expressions">
+   <title>Boolean Predicate Path Expressions</title>
<para>
-     A path expression can be a Boolean predicate, although the SQL/JSON
-     standard allows predicates only in filters.  This is necessary for
-     implementation of the <literal>@@</literal> operator. For example,
-     the following <type>jsonpath</type> expression is valid in
-     <productname>PostgreSQL</productname>:
+     As an extension to the SQL standard, a <productname>PostgreSQL</productname>
+     path expression can be a Boolean predicate, whereas the SQL standard allows
+     predicates only in filters. Where SQL standard path expressions return the
+     relevant contents of the queried JSON value, predicate path expressions
+     return the three-value three-valued result of the predicate:

Redundant "three-value" before "three-valued result".

+     <literal>true</literal>, <literal>false</literal>, or
+     <literal>unknown</literal>. Compare this filter <type>jsonpath</type>
+     exression:
<programlisting>
-$.track.segments[*].HR &lt; 70
+select jsonb_path_query(:'json'::jsonb, '$.track.segments ?(@[*].HR &gt; 130)');
+                                jsonb_path_query
+---------------------------------------------------------------------------------
+ {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}
</programlisting>
-    </para>
-   </listitem>
+     To a predicate expression, which returns <literal>true</literal>
+<programlisting>
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*].HR &gt; 130');
+ jsonb_path_query
+------------------
+ true
+</programlisting>
+     </para>
-   <listitem>
-    <para>
-     There are minor differences in the interpretation of regular
-     expression patterns used in <literal>like_regex</literal> filters, as
-     described in <xref linkend="jsonpath-regular-expressions"/>.
-    </para>
-   </listitem>
-  </itemizedlist>
+     <para>
+      Predicate-only path expressions are necessary for implementation of the
+      <literal>@@</literal> operator (and the
+      <function>jsonb_path_match</function> function), and should not be used
+      with the <literal>@?</literal> operator (or
+      <function>jsonb_path_exists</function> function).
+     </para>
+
+     <para>
+      Conversely, non-predicate <type>jsonpath</type> expressions should not be
+      used with the <literal>@@</literal> operator (or the
+      <function>jsonb_path_match</function> function).
+     </para>
+    </sect4>

Both paras should be wrapped in a single <note> so that they stand out
from the rest of the text. Maybe even <warning>, but <note> is already
used on this page for things that I'd consider warnings.

+    <sect4 id="jsonpath-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>

<sect3 id="devations-from-the-standard"> should be closed here,
otherwise the docs won't build. This can be checked with
`make -C doc/src/sgml check`.

<sect3 id="strict-and-lax-modes">
<title>Strict and Lax Modes</title>
@@ -17431,18 +17488,30 @@ $.track.segments[*].HR &lt; 70
abstract from the fact that it stores an array of segments
when using the lax mode:
<programlisting>
-lax $.track.segments.location
+ select jsonb_path_query(:'json'::jsonb, 'lax $.track.segments.location');
+ jsonb_path_query  

`git diff --check` shows a couple of lines with trailing whitespace
(mostly psql output).

+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
</programlisting>
</para>
<para>
-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of
the queried JSON document to return an SQL/JSON item, so using this
-    path expression will cause an error. To get the same result as in
-    the lax mode, you have to explicitly unwrap the
+    path expression will cause an error:
+<programlisting>
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments.location');
+ERROR:  jsonpath member accessor can only be applied to an object
+</programlisting>    
+    To get the same result as in the lax mode, you have to explicitly unwrap the
<literal>segments</literal> array:
<programlisting>
-strict $.track.segments[*].location
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments[*].location');
+ jsonb_path_query  
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
</programlisting>
</para>
@@ -17451,7 +17520,13 @@ strict $.track.segments[*].location
when using the lax mode. For instance, the following query selects every
<literal>HR</literal> value twice:
<programlisting>
-lax $.**.HR
+select jsonb_path_query(:'json'::jsonb, 'lax $.**.HR');
+ jsonb_path_query 
+------------------
+ 73
+ 135
+ 73
+ 135
</programlisting>
This happens because the <literal>.**</literal> accessor selects both
the <literal>segments</literal> array and each of its elements, while
@@ -17460,7 +17535,11 @@ lax $.**.HR
the <literal>.**</literal> accessor only in the strict mode. The
following query selects each <literal>HR</literal> value just once:
<programlisting>
-strict $.**.HR
+select jsonb_path_query(:'json'::jsonb, 'strict $.**.HR');
+ jsonb_path_query 
+------------------
+ 73
+ 135
</programlisting>
</para>

--
Erik

#4David E. Wheeler
david@justatheory.com
In reply to: Erik Wienhold (#3)
1 attachment(s)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 14, 2023, at 19:51, Erik Wienhold <ewie@ewie.name> wrote:

Thanks for putting this together. See my review at the end.

Appreciate the speedy review!

Nice. This really does help to make some sense of it. I checked all
queries and they do work out except for two queries where the path
expression string is not properly quoted (but the intended output is
still correct).

🤦🏻‍♂️

Follow-ups I’d like to make:

1. Expand the modes section to show how the types of results can vary
depending on the mode, thanks to the flattening. Examples:

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a ?(@[*] > 2)');
jsonb_path_query
------------------
3
4
5
(3 rows)

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.a ?(@[*] > 2)');
jsonb_path_query
------------------
[1, 2, 3, 4, 5]

2. Improve the descriptions and examples for @?/jsonb_path_exists()
and @@/jsonb_path_match().

+1

I planned to submit these changes in a separate patch, based on Tom Lane’s suggestion[1]/messages/by-id/1229727.1680535592@sss.pgh.pa.us. Would it be preferred to add them to this patch?

Perhaps make it explicit that the reader must run this in psql in order
to use \set and :'json' in the ensuing samples? Some of the existing
examples already use psql output but they do not rely on any psql
features.

Good call, done.

This should use <screen>, <userinput>, and <computeroutput> if it shows
a psql session, e.g.:

<screen>
<userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
<computeroutput>
jsonb_path_query
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
</computeroutput>
</screen>

I pokwds around, and it appears the computeroutput bit is used for function output. So I followed the precedent in queries.sgml[2]https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-JOIN and omitted the computeroutput tags but added prompt, e.g.,

<screen>
<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.**.HR');</userinput>
jsonb_path_query
------------------
73
135
</screen>

Also the cast to jsonb is not necessary and only adds clutter IMO.

Right, removed them all in function calls.

+     <para>
+      Predicate-only path expressions are necessary for implementation of the
+      <literal>@@</literal> operator (and the
+      <function>jsonb_path_match</function> function), and should not be used
+      with the <literal>@?</literal> operator (or
+      <function>jsonb_path_exists</function> function).
+     </para>
+
+     <para>
+      Conversely, non-predicate <type>jsonpath</type> expressions should not be
+      used with the <literal>@@</literal> operator (or the
+      <function>jsonb_path_match</function> function).
+     </para>
+    </sect4>

Both paras should be wrapped in a single <note> so that they stand out
from the rest of the text. Maybe even <warning>, but <note> is already
used on this page for things that I'd consider warnings.

Agreed. Would be good if we could teach these functions and operators to reject path expressions they don’t support.

+    <sect4 id="jsonpath-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>

<sect3 id="devations-from-the-standard"> should be closed here,
otherwise the docs won't build. This can be checked with
`make -C doc/src/sgml check`.

Thanks. That produces a bunch of warnings for postgres.sgml and legal.sgml (and a failure to load the docbook DTD), but func.sgml is clean now.

`git diff --check` shows a couple of lines with trailing whitespace
(mostly psql output).

I must’ve cleaned those after I sent the patch, good now. Updated patch attached, this time created by `git format-patch -v2`.

Best,

David

[1]: /messages/by-id/1229727.1680535592@sss.pgh.pa.us
[2]: https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-JOIN

Attachments:

v2-0001-Improve-boolean-predicate-JSON-Path-docs.patchapplication/octet-stream; name=v2-0001-Improve-boolean-predicate-JSON-Path-docs.patch; x-unix-mode=0644Download
From d0ededc16eee7f879eefe4f726921bee8644b51b Mon Sep 17 00:00:00 2001
From: "David E. Wheeler" <david@justatheory.com>
Date: Sat, 14 Oct 2023 16:42:58 -0400
Subject: [PATCH v2] Improve boolean predicate JSON Path docs

Following up from a suggestion from Tom Lane[1] to improve the
documentation of boolean predicate JSON path expressions, please find
enclosed a draft patch to do so. It does three things:

1. Converts all of the example path queries to use `jsonb_path_query()`
   and show the results, to make it clearer what the behaviors are.

2. Replaces the list of deviations from the standards with a new
   subsection, with each deviation in its own sub-subsection. The regex
   section is unchanged, but I've greatly expanded the boolean
   expression JSON path section with examples comparing standard filter
   expressions and nonstandard boolean predicates. I've also added an
   exhortation not use boolean expressions with @? or standard path
   expressions with @@.

3. While converting the modes section to use `jsonb_path_query()` and
   show the results, I also added an example of strict mode returning an
   error.
---
 doc/src/sgml/func.sgml | 224 ++++++++++++++++++++++++++++-------------
 1 file changed, 154 insertions(+), 70 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index affd1254bb..a2bfc12312 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17203,9 +17203,12 @@ array w/o UK? | t
 
   <para>
    For example, suppose you have some JSON data from a GPS tracker that you
-   would like to parse, such as:
+   would like to parse, such as this JSON, set up as a
+   <link linkend="app-psql-meta-command-set"><application>psql</application>
+   <command>\set</command> variable</link> for use as <literal>:'json'</literal>
+   in the examples below:
 <programlisting>
-{
+ \set json '{
   "track": {
     "segments": [
       {
@@ -17220,7 +17223,7 @@ array w/o UK? | t
       }
     ]
   }
-}
+}'
 </programlisting>
   </para>
 
@@ -17228,9 +17231,13 @@ array w/o UK? | t
    To retrieve the available track segments, you need to use the
    <literal>.<replaceable>key</replaceable></literal> accessor
    operator to descend through surrounding JSON objects:
-<programlisting>
-$.track.segments
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
+select jsonb_path_query(:'json', '$.track.segments');
+                                                                         jsonb_path_query
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
+</screen>
   </para>
 
   <para>
@@ -17238,18 +17245,25 @@ $.track.segments
    <literal>[*]</literal> operator. For example,
    the following path will return the location coordinates for all
    the available track segments:
-<programlisting>
-$.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
   </para>
 
   <para>
    To return the coordinates of the first segment only, you can
    specify the corresponding subscript in the <literal>[]</literal>
    accessor operator. Recall that JSON array indexes are 0-relative:
-<programlisting>
-$.track.segments[0].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[0].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+</screen>
   </para>
 
   <para>
@@ -17258,9 +17272,12 @@ $.track.segments[0].location
    listed in <xref linkend="functions-sqljson-path-operators"/>.
    Each method name must be preceded by a dot. For example,
    you can get the size of an array:
-<programlisting>
-$.track.segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    More examples of using <type>jsonpath</type> operators
    and methods within path expressions appear below in
    <xref linkend="functions-sqljson-path-operators"/>.
@@ -17301,9 +17318,12 @@ $.track.segments.size()
   <para>
    For example, suppose you would like to retrieve all heart rate values higher
    than 130. You can achieve this using the following expression:
-<programlisting>
-$.track.segments[*].HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
@@ -17311,65 +17331,108 @@ $.track.segments[*].HR ? (@ &gt; 130)
    filter out irrelevant segments before returning the start times, so the
    filter expression is applied to the previous step, and the path used
    in the condition is different:
-<programlisting>
-$.track.segments[*] ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    You can use several filter expressions in sequence, if required. For
    example, the following expression selects start times of all segments that
    contain locations with relevant coordinates and high heart rate values:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    Using filter expressions at different nesting levels is also allowed.
    The following example first filters all segments by location, and then
    returns high heart rate values for these segments, if available:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
    You can also nest filter expressions within each other:
-<programlisting>
-$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    This expression returns the size of the track if it contains any
    segments with high heart rate values, or an empty sequence otherwise.
   </para>
 
-  <para>
-   <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
-   language has the following deviations from the SQL/JSON standard:
-  </para>
+  <sect3 id="deviations-from-the-standard">
+  <title>Deviations from the SQL Standard</title>
+   <para>
+    <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
+    language has the following deviations from the SQL/JSON standard.
+   </para>
 
-  <itemizedlist>
-   <listitem>
+   <sect4 id="boolean-predicate-path-expressions">
+   <title>Boolean Predicate Path Expressions</title>
     <para>
-     A path expression can be a Boolean predicate, although the SQL/JSON
-     standard allows predicates only in filters.  This is necessary for
-     implementation of the <literal>@@</literal> operator. For example,
-     the following <type>jsonpath</type> expression is valid in
-     <productname>PostgreSQL</productname>:
-<programlisting>
-$.track.segments[*].HR &lt; 70
-</programlisting>
-    </para>
-   </listitem>
+     As an extension to the SQL standard, a <productname>PostgreSQL</productname>
+     path expression can be a Boolean predicate, whereas the SQL standard allows
+     predicates only in filters. Where SQL standard path expressions return the
+     relevant contents of the queried JSON value, predicate path expressions
+     return the three-valued result of the predicate: <literal>true</literal>,
+     <literal>false</literal>, or <literal>unknown</literal>. Compare this
+     filter <type>jsonpath</type> expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments ?(@[*].HR &gt; 130)');</userinput>
+                                jsonb_path_query
+---------------------------------------------------------------------------------
+ {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}
+</screen>
+     To a predicate expression, which returns <literal>true</literal>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR &gt; 130');</userinput>
+ jsonb_path_query
+------------------
+ true
+</screen>
+     </para>
 
-   <listitem>
-    <para>
-     There are minor differences in the interpretation of regular
-     expression patterns used in <literal>like_regex</literal> filters, as
-     described in <xref linkend="jsonpath-regular-expressions"/>.
-    </para>
-   </listitem>
-  </itemizedlist>
+     <note>
+      <para>
+       Predicate-only path expressions are necessary for implementation of the
+       <literal>@@</literal> operator (and the
+       <function>jsonb_path_match</function> function), and should not be used
+       with the <literal>@?</literal> operator (or
+       <function>jsonb_path_exists</function> function).
+      </para>
+
+      <para>
+       Conversely, non-predicate <type>jsonpath</type> expressions should not be
+       used with the <literal>@@</literal> operator (or the
+       <function>jsonb_path_match</function> function).
+      </para>
+     </note>
+    </sect4>
+
+    <sect4 id="jsonpath-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>
+   </sect3>
 
    <sect3 id="strict-and-lax-modes">
    <title>Strict and Lax Modes</title>
@@ -17430,40 +17493,61 @@ $.track.segments[*].HR &lt; 70
     For example, when querying the GPS data listed above, you can
     abstract from the fact that it stores an array of segments
     when using the lax mode:
-<programlisting>
-lax $.track.segments.location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments.location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
    <para>
-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of
     the queried JSON document to return an SQL/JSON item, so using this
-    path expression will cause an error. To get the same result as in
-    the lax mode, you have to explicitly unwrap the
+    path expression will cause an error:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.location');</userinput>
+ERROR:  jsonpath member accessor can only be applied to an object
+</screen>
+    To get the same result as in the lax mode, you have to explicitly unwrap the
     <literal>segments</literal> array:
-<programlisting>
-strict $.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
    <para>
     The <literal>.**</literal> accessor can lead to surprising results
     when using the lax mode. For instance, the following query selects every
     <literal>HR</literal> value twice:
-<programlisting>
-lax $.**.HR
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.**.HR');</userinput>
+ jsonb_path_query
+------------------
+ 73
+ 135
+ 73
+ 135
+</screen>
     This happens because the <literal>.**</literal> accessor selects both
     the <literal>segments</literal> array and each of its elements, while
     the <literal>.HR</literal> accessor automatically unwraps arrays when
     using the lax mode. To avoid surprising results, we recommend using
     the <literal>.**</literal> accessor only in the strict mode. The
     following query selects each <literal>HR</literal> value just once:
-<programlisting>
-strict $.**.HR
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.**.HR');</userinput>
+ jsonb_path_query
+------------------
+ 73
+ 135
+</screen>
    </para>
-
    </sect3>
 
    <sect3 id="functions-sqljson-path-operators">
-- 
2.42.0

#5Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#4)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-16 01:04 +0200, David E. Wheeler write:

On Oct 14, 2023, at 19:51, Erik Wienhold <ewie@ewie.name> wrote:

Thanks for putting this together. See my review at the end.

Appreciate the speedy review!

You're welcome.

Follow-ups I’d like to make:

1. Expand the modes section to show how the types of results can vary
depending on the mode, thanks to the flattening. Examples:

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a ?(@[*] > 2)');
jsonb_path_query
------------------
3
4
5
(3 rows)

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.a ?(@[*] > 2)');
jsonb_path_query
------------------
[1, 2, 3, 4, 5]

2. Improve the descriptions and examples for @?/jsonb_path_exists()
and @@/jsonb_path_match().

+1

I planned to submit these changes in a separate patch, based on Tom
Lane’s suggestion[1]. Would it be preferred to add them to this patch?

Your call but I'm not against including it in this patch because it
already touches the modes section.

I pokwds around, and it appears the computeroutput bit is used for
function output. So I followed the precedent in queries.sgml[2] and
omitted the computeroutput tags but added prompt, e.g.,
<screen>
<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.**.HR');</userinput>
jsonb_path_query
------------------
73
135
</screen>

Okay, Not sure what the preferred style is but I saw <userinput> and
<computeroutput> used together in doc/src/sgml/ref/createuser.sgml.
But it's not applied consistently in the rest of the docs.

+     <para>
+      Predicate-only path expressions are necessary for implementation of the
+      <literal>@@</literal> operator (and the
+      <function>jsonb_path_match</function> function), and should not be used
+      with the <literal>@?</literal> operator (or
+      <function>jsonb_path_exists</function> function).
+     </para>
+
+     <para>
+      Conversely, non-predicate <type>jsonpath</type> expressions should not be
+      used with the <literal>@@</literal> operator (or the
+      <function>jsonb_path_match</function> function).
+     </para>
+    </sect4>

Both paras should be wrapped in a single <note> so that they stand out
from the rest of the text. Maybe even <warning>, but <note> is already
used on this page for things that I'd consider warnings.

Agreed. Would be good if we could teach these functions and operators
to reject path expressions they don’t support.

Right, you mentioned that idea in [1]/messages/by-id/BAF11F2D-5EDD-4DBB-87FA-4F35845029AE@justatheory.com (separate types). Not sure what
the best strategy here is but it's likely to break existing queries.
Maybe deprecating unsupported path expressions in the next major release
and changing that to an error in the major release after that.

This can be checked with `make -C doc/src/sgml check`.

Thanks. That produces a bunch of warnings for postgres.sgml and
legal.sgml (and a failure to load the docbook DTD), but func.sgml is
clean now.

Hmm... I get no warnings on 1f89b73c4e. Did you install all tools as
described in [2]https://www.postgresql.org/docs/current/docguide-toolsets.html? The DTD needs to be installed as well.

[1]: /messages/by-id/BAF11F2D-5EDD-4DBB-87FA-4F35845029AE@justatheory.com
[2]: https://www.postgresql.org/docs/current/docguide-toolsets.html

--
Erik

#6David E. Wheeler
david@justatheory.com
In reply to: Erik Wienhold (#5)
1 attachment(s)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 15, 2023, at 23:03, Erik Wienhold <ewie@ewie.name> wrote:

Your call but I'm not against including it in this patch because it
already touches the modes section.

Okay, added, let’s just put all our cards on the table. :-)

Agreed. Would be good if we could teach these functions and operators
to reject path expressions they don’t support.

Right, you mentioned that idea in [1] (separate types). Not sure what
the best strategy here is but it's likely to break existing queries.
Maybe deprecating unsupported path expressions in the next major release
and changing that to an error in the major release after that.

Well if the functions have a JsonPathItem struct, they can check its type attribute and reject those with a root type that’s a predicate in @? and reject it if it’s not a predicate in @@. Example of checking type here:

https://github.com/postgres/postgres/blob/54b208f90963cb8b48b9794a5392b2fae4b40a98/src/backend/utils/adt/jsonpath_exec.c#L622

This can be checked with `make -C doc/src/sgml check`.

Thanks. That produces a bunch of warnings for postgres.sgml and
legal.sgml (and a failure to load the docbook DTD), but func.sgml is
clean now.

Hmm... I get no warnings on 1f89b73c4e. Did you install all tools as
described in [2]? The DTD needs to be installed as well.

Thanks, got it down to one:

postgres.sgml:112: element sect4: validity error : Element sect4 content does not follow the DTD, expecting (sect4info? , (title , subtitle? , titleabbrev?) , (toc | lot | index | glossary | bibliography)* , (((calloutlist | glosslist | bibliolist | itemizedlist | orderedlist | segmentedlist | simplelist | variablelist | caution | important | note | tip | warning | literallayout | programlisting | programlistingco | screen | screenco | screenshot | synopsis | cmdsynopsis | funcsynopsis | classsynopsis | fieldsynopsis | constructorsynopsis | destructorsynopsis | methodsynopsis | formalpara | para | simpara | address | blockquote | graphic | graphicco | mediaobject | mediaobjectco | informalequation | informalexample | informalfigure | informaltable | equation | example | figure | table | msgset | procedure | sidebar | qandaset | task | anchor | bridgehead | remark | highlights | abstract | authorblurb | epigraph | indexterm | beginpage)+ , (refentry* | sect5* | simplesect*)) | refentry+ | sect5+ | simplesect+) , (toc | lot | index | glossary | bibliography)*), got (para para )
&func;

David

Attachments:

v3-0001-Improve-boolean-predicate-JSON-Path-docs.patchapplication/applefile; name=v3-0001-Improve-boolean-predicate-JSON-Path-docs.patchDownload
	2
<6v3-0001-Improve-boolean-predicate-JSON-Path-docs.patch
#7Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#6)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-16 21:59 +0200, David E. Wheeler write:

On Oct 15, 2023, at 23:03, Erik Wienhold <ewie@ewie.name> wrote:

Your call but I'm not against including it in this patch because it
already touches the modes section.

Okay, added, let’s just put all our cards on the table. :-)

I'll have a look but the attached v3 is not a patch but some applefile.

Thanks, got it down to one:

postgres.sgml:112: element sect4: validity error : Element sect4 content does not follow the DTD, expecting (sect4info? , (title , subtitle? , titleabbrev?) , (toc | lot | index | glossary | bibliography)* , (((calloutlist | glosslist | bibliolist | itemizedlist | orderedlist | segmentedlist | simplelist | variablelist | caution | important | note | tip | warning | literallayout | programlisting | programlistingco | screen | screenco | screenshot | synopsis | cmdsynopsis | funcsynopsis | classsynopsis | fieldsynopsis | constructorsynopsis | destructorsynopsis | methodsynopsis | formalpara | para | simpara | address | blockquote | graphic | graphicco | mediaobject | mediaobjectco | informalequation | informalexample | informalfigure | informaltable | equation | example | figure | table | msgset | procedure | sidebar | qandaset | task | anchor | bridgehead | remark | highlights | abstract | authorblurb | epigraph | indexterm | beginpage)+ , (refentry* | sect5* | simplesect*)) | refentry+ | sect5+ | simplesect+) , (toc | lot | index | glossary | bibliography)*), got (para para )
&func;

One of the added <sect4> is invalid by the looks of it. Maybe <title>
is missing because it says "got (para para )" at the end.

--
Erik

#8David E. Wheeler
david@justatheory.com
In reply to: Erik Wienhold (#7)
1 attachment(s)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 16, 2023, at 18:07, Erik Wienhold <ewie@ewie.name> wrote:

Okay, added, let’s just put all our cards on the table. :-)

I'll have a look but the attached v3 is not a patch but some applefile.

Weird, should be no different from previous attachments. I believe Apple Mail always uses application/octet-stream for attachments it doesn’t recognize, which includes .patch and .diff files, sadly.

One of the added <sect4> is invalid by the looks of it. Maybe <title>
is missing because it says "got (para para )" at the end.

Oh, I thought it would report issues from the files they were found in. You’re right, I forgot a title. Fixed in v4.

David

Attachments:

v4-0001-Improve-boolean-predicate-JSON-Path-docs.patchapplication/octet-stream; name=v4-0001-Improve-boolean-predicate-JSON-Path-docs.patch; x-unix-mode=0644Download
From 9f1a635388ac19859f114539a628eb4e18300734 Mon Sep 17 00:00:00 2001
From: "David E. Wheeler" <david@justatheory.com>
Date: Sat, 14 Oct 2023 16:42:58 -0400
Subject: [PATCH v4] Improve boolean predicate JSON Path docs

Following up from a suggestion from Tom Lane[1] to improve the
documentation of boolean predicate JSON path expressions, please find
enclosed a draft patch to do so. It does three things:

1. Converts all of the example path queries to use `jsonb_path_query()`
   and show the results, to make it clearer what the behaviors are.

2. Replaces the list of deviations from the standards with a new
   subsection, with each deviation in its own sub-subsection. The regex
   section is unchanged, but I've greatly expanded the boolean
   expression JSON path section with examples comparing standard filter
   expressions and nonstandard boolean predicates. I've also added an
   exhortation not use boolean expressions with @? or standard path
   expressions with @@.

3. While converting the modes section to use `jsonb_path_query()` and
   show the results, I also added an example of strict mode returning an
   error and a section demonstrating the difference in query results
   when using lax vs. strict mode.

4. Removes the notes about "only the first value" in the `@@` and
   `jsonb_path_match()` docs, and noted that they support only predicate
   JSON path expressions. Also noted that `json_path_exists()` supports
   only SQL standard path expressions --- in other words, *not*
   predicate path queries.
---
 doc/src/sgml/func.sgml | 321 +++++++++++++++++++++++++++++------------
 1 file changed, 227 insertions(+), 94 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index affd1254bb..2c897f38a5 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15838,7 +15838,10 @@ table2-mapping
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Does JSON path return any item for the specified JSON value?
+        Does JSON path return any item for the specified JSON value? Use only
+        SQL-standard JSON path expressions, not not
+        <link linkend="boolean-predicate-path-expressions">predicate check
+        expressions.</link>
        </para>
        <para>
         <literal>'{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)'</literal>
@@ -15852,10 +15855,11 @@ table2-mapping
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Returns the result of a JSON path predicate check for the
-        specified JSON value.  Only the first item of the result is taken into
-        account.  If the result is not Boolean, then <literal>NULL</literal>
-        is returned.
+        Returns the result of a JSON path
+        <link linkend="boolean-predicate-path-expressions">predicate
+        check</link> for the specified JSON value. If the result is not Boolean,
+        then <literal>NULL</literal> is returned. Do not use with non-predicate
+        JSON path expressions.
        </para>
        <para>
         <literal>'{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2'</literal>
@@ -16926,7 +16930,9 @@ array w/o UK? | t
        </para>
        <para>
         Checks whether the JSON path returns any item for the specified JSON
-        value.
+        value. Use only SQL-standard JSON path expressions, not not
+        <link linkend="boolean-predicate-path-expressions">predicate check
+        expressions.</link>
         If the <parameter>vars</parameter> argument is specified, it must
         be a JSON object, and its fields provide named values to be
         substituted into the <type>jsonpath</type> expression.
@@ -16949,12 +16955,13 @@ array w/o UK? | t
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Returns the result of a JSON path predicate check for the specified
-        JSON value.  Only the first item of the result is taken into account.
-        If the result is not Boolean, then <literal>NULL</literal> is returned.
-        The optional <parameter>vars</parameter>
-        and <parameter>silent</parameter> arguments act the same as
-        for <function>jsonb_path_exists</function>.
+        Returns the result of a JSON path
+        <link linkend="boolean-predicate-path-expressions">predicate
+        check</link> for the specified JSON value. If the result is not Boolean,
+        then <literal>NULL</literal> is returned. Do not use with non-predicate
+        JSON path expressions. The optional <parameter>vars</parameter> and
+        <parameter>silent</parameter> arguments act the same as for
+        <function>jsonb_path_exists</function>.
        </para>
        <para>
         <literal>jsonb_path_match('{"a":[1,2,3,4,5]}', 'exists($.a[*] ? (@ >= $min &amp;&amp; @ &lt;= $max))', '{"min":2, "max":4}')</literal>
@@ -16972,8 +16979,12 @@ array w/o UK? | t
        </para>
        <para>
         Returns all JSON items returned by the JSON path for the specified
-        JSON value.
-        The optional <parameter>vars</parameter>
+        JSON value. For SQL-standard JSON path expressions it returns the JSON
+        values selected from <parameter>target</parameter>. For
+        <link linkend="boolean-predicate-path-expressions">predicate check</link>
+        Path expressions it returns the result of the predicate check:
+        <literal>true</literal>, <literal>false</literal>, or
+        <literal>null</literal>. The optional <parameter>vars</parameter>
         and <parameter>silent</parameter> arguments act the same as
         for <function>jsonb_path_exists</function>.
        </para>
@@ -17203,9 +17214,12 @@ array w/o UK? | t
 
   <para>
    For example, suppose you have some JSON data from a GPS tracker that you
-   would like to parse, such as:
+   would like to parse, such as this JSON, set up as a
+   <link linkend="app-psql-meta-command-set"><application>psql</application>
+   <command>\set</command> variable</link> for use as <literal>:'json'</literal>
+   in the examples below:
 <programlisting>
-{
+ \set json '{
   "track": {
     "segments": [
       {
@@ -17220,7 +17234,7 @@ array w/o UK? | t
       }
     ]
   }
-}
+}'
 </programlisting>
   </para>
 
@@ -17228,9 +17242,13 @@ array w/o UK? | t
    To retrieve the available track segments, you need to use the
    <literal>.<replaceable>key</replaceable></literal> accessor
    operator to descend through surrounding JSON objects:
-<programlisting>
-$.track.segments
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
+select jsonb_path_query(:'json', '$.track.segments');
+                                                                         jsonb_path_query
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
+</screen>
   </para>
 
   <para>
@@ -17238,18 +17256,25 @@ $.track.segments
    <literal>[*]</literal> operator. For example,
    the following path will return the location coordinates for all
    the available track segments:
-<programlisting>
-$.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
   </para>
 
   <para>
    To return the coordinates of the first segment only, you can
    specify the corresponding subscript in the <literal>[]</literal>
    accessor operator. Recall that JSON array indexes are 0-relative:
-<programlisting>
-$.track.segments[0].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[0].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+</screen>
   </para>
 
   <para>
@@ -17258,9 +17283,12 @@ $.track.segments[0].location
    listed in <xref linkend="functions-sqljson-path-operators"/>.
    Each method name must be preceded by a dot. For example,
    you can get the size of an array:
-<programlisting>
-$.track.segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    More examples of using <type>jsonpath</type> operators
    and methods within path expressions appear below in
    <xref linkend="functions-sqljson-path-operators"/>.
@@ -17301,9 +17329,12 @@ $.track.segments.size()
   <para>
    For example, suppose you would like to retrieve all heart rate values higher
    than 130. You can achieve this using the following expression:
-<programlisting>
-$.track.segments[*].HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
@@ -17311,65 +17342,108 @@ $.track.segments[*].HR ? (@ &gt; 130)
    filter out irrelevant segments before returning the start times, so the
    filter expression is applied to the previous step, and the path used
    in the condition is different:
-<programlisting>
-$.track.segments[*] ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    You can use several filter expressions in sequence, if required. For
    example, the following expression selects start times of all segments that
    contain locations with relevant coordinates and high heart rate values:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    Using filter expressions at different nesting levels is also allowed.
    The following example first filters all segments by location, and then
    returns high heart rate values for these segments, if available:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
    You can also nest filter expressions within each other:
-<programlisting>
-$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    This expression returns the size of the track if it contains any
    segments with high heart rate values, or an empty sequence otherwise.
   </para>
 
-  <para>
-   <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
-   language has the following deviations from the SQL/JSON standard:
-  </para>
+  <sect3 id="deviations-from-the-standard">
+  <title>Deviations from the SQL Standard</title>
+   <para>
+    <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
+    language has the following deviations from the SQL/JSON standard.
+   </para>
 
-  <itemizedlist>
-   <listitem>
+   <sect4 id="boolean-predicate-path-expressions">
+   <title>Boolean Predicate Path Expressions</title>
     <para>
-     A path expression can be a Boolean predicate, although the SQL/JSON
-     standard allows predicates only in filters.  This is necessary for
-     implementation of the <literal>@@</literal> operator. For example,
-     the following <type>jsonpath</type> expression is valid in
-     <productname>PostgreSQL</productname>:
-<programlisting>
-$.track.segments[*].HR &lt; 70
-</programlisting>
-    </para>
-   </listitem>
+     As an extension to the SQL standard, a <productname>PostgreSQL</productname>
+     path expression can be a Boolean predicate, whereas the SQL standard allows
+     predicates only in filters. Where SQL standard path expressions return the
+     relevant contents of the queried JSON value, predicate path expressions
+     return the three-valued result of the predicate: <literal>true</literal>,
+     <literal>false</literal>, or <literal>unknown</literal>. Compare this
+     filter <type>jsonpath</type> expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments ?(@[*].HR &gt; 130)');</userinput>
+                                jsonb_path_query
+---------------------------------------------------------------------------------
+ {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}
+</screen>
+     To a predicate expression, which returns <literal>true</literal>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR &gt; 130');</userinput>
+ jsonb_path_query
+------------------
+ true
+</screen>
+     </para>
 
-   <listitem>
-    <para>
-     There are minor differences in the interpretation of regular
-     expression patterns used in <literal>like_regex</literal> filters, as
-     described in <xref linkend="jsonpath-regular-expressions"/>.
-    </para>
-   </listitem>
-  </itemizedlist>
+     <note>
+      <para>
+       Predicate-only path expressions are necessary for implementation of the
+       <literal>@@</literal> operator (and the
+       <function>jsonb_path_match</function> function), and should not be used
+       with the <literal>@?</literal> operator (or
+       <function>jsonb_path_exists</function> function).
+      </para>
+
+      <para>
+       Conversely, non-predicate <type>jsonpath</type> expressions should not be
+       used with the <literal>@@</literal> operator (or the
+       <function>jsonb_path_match</function> function).
+      </para>
+     </note>
+    </sect4>
+
+    <sect4 id="jsonpath-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>
+   </sect3>
 
    <sect3 id="strict-and-lax-modes">
    <title>Strict and Lax Modes</title>
@@ -17430,40 +17504,99 @@ $.track.segments[*].HR &lt; 70
     For example, when querying the GPS data listed above, you can
     abstract from the fact that it stores an array of segments
     when using the lax mode:
-<programlisting>
-lax $.track.segments.location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments.location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
    <para>
-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of
     the queried JSON document to return an SQL/JSON item, so using this
-    path expression will cause an error. To get the same result as in
-    the lax mode, you have to explicitly unwrap the
+    path expression will cause an error:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.location');</userinput>
+ERROR:  jsonpath member accessor can only be applied to an object
+</screen>
+    To get the same result as in the lax mode, you have to explicitly unwrap the
     <literal>segments</literal> array:
-<programlisting>
-strict $.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
-   <para>
-    The <literal>.**</literal> accessor can lead to surprising results
-    when using the lax mode. For instance, the following query selects every
-    <literal>HR</literal> value twice:
-<programlisting>
-lax $.**.HR
-</programlisting>
-    This happens because the <literal>.**</literal> accessor selects both
-    the <literal>segments</literal> array and each of its elements, while
-    the <literal>.HR</literal> accessor automatically unwraps arrays when
-    using the lax mode. To avoid surprising results, we recommend using
-    the <literal>.**</literal> accessor only in the strict mode. The
-    following query selects each <literal>HR</literal> value just once:
-<programlisting>
-strict $.**.HR
-</programlisting>
-   </para>
+   <sect4 id="lax-mode-unwrapping-results">
+   <title>Suprising Results from Unwrapping</title>
+    <para>
+     The unwrapping behavior of lax mode can lead to surprising results. For
+     instance, the following query using the <literal>.**</literal> accessor
+     selects every <literal>HR</literal> value twice:
+ <screen>
+ <prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.**.HR');</userinput>
+  jsonb_path_query
+ ------------------
+  73
+  135
+  73
+  135
+ </screen>
+     This happens because the <literal>.**</literal> accessor selects both
+     the <literal>segments</literal> array and each of its elements, while
+     the <literal>.HR</literal> accessor automatically unwraps arrays when
+     using the lax mode. To avoid surprising results, we recommend using
+     the <literal>.**</literal> accessor only in the strict mode. The
+     following query selects each <literal>HR</literal> value just once:
+ <screen>
+ <prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.**.HR');</userinput>
+  jsonb_path_query
+ ------------------
+  73
+  135
+ </screen>
+    </para>
+
+    <para>
+     The unwrapping of arrays can also lead to unexpected results. Consider this
+     example, which selects all the <literal>location</literal> arrays:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+(2 rows)
+</screen>
+     As expected it returns the full arrays. But applying a filter expression
+     causes the arrays to be unwrapped to evaluate each item, returning only the
+     items that match the expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments[*].location ?(@[*] &gt; 15)');</userinput>
+ jsonb_path_query
+------------------
+ 47.763
+ 47.706
+(2 rows)
+</screen>
+     This despite the fact that the full arrays are selected by the path
+     expression. Use strict mode to restore selecting the arrays:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location ?(@[*] &gt; 15)');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+(2 rows)
+</screen>
+    </para>
 
+   </sect4>
    </sect3>
 
    <sect3 id="functions-sqljson-path-operators">
-- 
2.42.0

#9jian he
jian.universality@gmail.com
In reply to: David E. Wheeler (#8)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Tue, Oct 17, 2023 at 10:56 AM David E. Wheeler <david@justatheory.com> wrote:

Oh, I thought it would report issues from the files they were found in. You’re right, I forgot a title. Fixed in v4.

David

+        Returns the result of a JSON path
+        <link linkend="boolean-predicate-path-expressions">predicate
+        check</link> for the specified JSON value. If the result is
not Boolean,
+        then <literal>NULL</literal> is returned. Do not use with non-predicate
+        JSON path expressions.

"Do not use with non-predicate", double negative is not easy to
comprehend. Maybe we can simplify it.

16933: value. Use only SQL-standard JSON path expressions, not not
there are two "not".

15842: SQL-standard JSON path expressions, not not
there are two "not".

#10David E. Wheeler
david@justatheory.com
In reply to: jian he (#9)
1 attachment(s)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 19, 2023, at 01:22, jian he <jian.universality@gmail.com> wrote:

"Do not use with non-predicate", double negative is not easy to
comprehend. Maybe we can simplify it.

16933: value. Use only SQL-standard JSON path expressions, not not
there are two "not".

15842: SQL-standard JSON path expressions, not not
there are two "not”.

Thank you, jian. Updated patch attached and also on GitHub.

https://github.com/postgres/postgres/compare/master...theory:postgres:jsonpath-pred-docs

Best,

David

Attachments:

v5-0001-Improve-boolean-predicate-JSON-Path-docs.patchapplication/applefile; name=v5-0001-Improve-boolean-predicate-JSON-Path-docs.patchDownload
	2
<6v5-0001-Improve-boolean-predicate-JSON-Path-docs.patch
#11Erik Wienhold
ewie@mailbox.org
In reply to: David E. Wheeler (#10)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-19 15:39 +0200, David E. Wheeler wrote:

On Oct 19, 2023, at 01:22, jian he <jian.universality@gmail.com> wrote:

Updated patch attached and also on GitHub.

https://github.com/postgres/postgres/compare/master...theory:postgres:jsonpath-pred-docs

Just wanted to take a look at v5. But it's an applefile again :P

--
Erik

#12David E. Wheeler
david@justatheory.com
In reply to: Erik Wienhold (#11)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 19, 2023, at 10:49 PM, Erik Wienhold <ewie@mailbox.org> wrote:

Just wanted to take a look at v5. But it's an applefile again :P

I don’t get it. It was the other times too! Are you able to save it with a .patch suffix?

D

#13Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#12)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-20 05:20 +0200, David E. Wheeler wrote:

On Oct 19, 2023, at 10:49 PM, Erik Wienhold <ewie@mailbox.org> wrote:

Just wanted to take a look at v5. But it's an applefile again :P

I don’t get it. It was the other times too! Are you able to save it
with a .patch suffix?

Saving it is not the problem, but the actual file contents:

$ xxd v5-0001-Improve-boolean-predicate-JSON-Path-docs.patch
00000000: 0005 1600 0002 0000 0000 0000 0000 0000 ................
00000010: 0000 0000 0000 0000 0002 0000 0009 0000 ................
00000020: 0032 0000 000a 0000 0003 0000 003c 0000 .2...........<..
00000030: 0036 0000 0000 0000 0000 0000 7635 2d30 .6..........v5-0
00000040: 3030 312d 496d 7072 6f76 652d 626f 6f6c 001-Improve-bool
00000050: 6561 6e2d 7072 6564 6963 6174 652d 4a53 ean-predicate-JS
00000060: 4f4e 2d50 6174 682d 646f 6373 2e70 6174 ON-Path-docs.pat
00000070: 6368 ch

I don't even know what that represents, probably not some fancy file
compression.

--
Erik

#14David Wheeler
david@justatheory.com
In reply to: Erik Wienhold (#13)
1 attachment(s)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 19, 2023, at 23:49, Erik Wienhold <ewie@ewie.name> wrote:

I don't even know what that represents, probably not some fancy file
compression.

Oh, weird. Trying from a webmail client instead.

Best,

David

Attachments:

v5-0001-Improve-boolean-predicate-JSON-Path-docs.patchapplication/octet-stream; name="=?UTF-8?Q?v5-0001-Improve-boolean-predicate-JSON-Path-docs.patch?="Download
From 5cb4b90ab399395399ea9fe1556801000d316e21 Mon Sep 17 00:00:00 2001
From: "David E. Wheeler" <david@justatheory.com>
Date: Sat, 14 Oct 2023 16:42:58 -0400
Subject: [PATCH v5] Improve boolean predicate JSON Path docs

Following up from a suggestion from Tom Lane[1] to improve the
documentation of boolean predicate JSON path expressions, please find
enclosed a draft patch to do so. It does three things:

1. Converts all of the example path queries to use `jsonb_path_query()`
   and show the results, to make it clearer what the behaviors are.

2. Replaces the list of deviations from the standards with a new
   subsection, with each deviation in its own sub-subsection. The regex
   section is unchanged, but I've greatly expanded the boolean
   expression JSON path section with examples comparing standard filter
   expressions and nonstandard boolean predicates. I've also added an
   exhortation not use boolean expressions with @? or standard path
   expressions with @@.

3. While converting the modes section to use `jsonb_path_query()` and
   show the results, I also added an example of strict mode returning an
   error and a section demonstrating the difference in query results
   when using lax vs. strict mode.

4. Removes the notes about "only the first value" in the `@@` and
   `jsonb_path_match()` docs, and noted that they support only predicate
   JSON path expressions. Also noted that `json_path_exists()` supports
   only SQL standard path expressions --- in other words, *not*
   predicate path queries.
---
 doc/src/sgml/func.sgml | 323 +++++++++++++++++++++++++++++------------
 1 file changed, 229 insertions(+), 94 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7c3e940afe..a157b00a5d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15838,7 +15838,10 @@ table2-mapping
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Does JSON path return any item for the specified JSON value?
+        Does JSON path return any item for the specified JSON value? Use only
+        SQL-standard JSON path expressions, not
+        <link linkend="boolean-predicate-path-expressions">predicate check
+        expressions.</link>
        </para>
        <para>
         <literal>'{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)'</literal>
@@ -15852,10 +15855,12 @@ table2-mapping
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Returns the result of a JSON path predicate check for the
-        specified JSON value.  Only the first item of the result is taken into
-        account.  If the result is not Boolean, then <literal>NULL</literal>
-        is returned.
+        Returns the result of a JSON path
+        <link linkend="boolean-predicate-path-expressions">predicate
+        check</link> for the specified JSON value. If the result is not Boolean,
+        then <literal>NULL</literal> is returned. Use only with
+        <link linkend="boolean-predicate-path-expressions">predicate check
+        expressions.</link>
        </para>
        <para>
         <literal>'{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2'</literal>
@@ -16926,7 +16931,9 @@ array w/o UK? | t
        </para>
        <para>
         Checks whether the JSON path returns any item for the specified JSON
-        value.
+        value. Use only SQL-standard JSON path expressions, not
+        <link linkend="boolean-predicate-path-expressions">predicate check
+        expressions.</link>
         If the <parameter>vars</parameter> argument is specified, it must
         be a JSON object, and its fields provide named values to be
         substituted into the <type>jsonpath</type> expression.
@@ -16949,12 +16956,14 @@ array w/o UK? | t
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Returns the result of a JSON path predicate check for the specified
-        JSON value.  Only the first item of the result is taken into account.
-        If the result is not Boolean, then <literal>NULL</literal> is returned.
-        The optional <parameter>vars</parameter>
-        and <parameter>silent</parameter> arguments act the same as
-        for <function>jsonb_path_exists</function>.
+        Returns the result of a JSON path
+        <link linkend="boolean-predicate-path-expressions">predicate
+        check</link> for the specified JSON value. If the result is not Boolean,
+        then <literal>NULL</literal> is returned. Use only with
+        <link linkend="boolean-predicate-path-expressions">predicate check
+        expressions.</link> The optional <parameter>vars</parameter> and
+        <parameter>silent</parameter> arguments act the same as for
+        <function>jsonb_path_exists</function>.
        </para>
        <para>
         <literal>jsonb_path_match('{"a":[1,2,3,4,5]}', 'exists($.a[*] ? (@ >= $min &amp;&amp; @ &lt;= $max))', '{"min":2, "max":4}')</literal>
@@ -16972,8 +16981,12 @@ array w/o UK? | t
        </para>
        <para>
         Returns all JSON items returned by the JSON path for the specified
-        JSON value.
-        The optional <parameter>vars</parameter>
+        JSON value. For SQL-standard JSON path expressions it returns the JSON
+        values selected from <parameter>target</parameter>. For
+        <link linkend="boolean-predicate-path-expressions">predicate check</link>
+        Path expressions it returns the result of the predicate check:
+        <literal>true</literal>, <literal>false</literal>, or
+        <literal>null</literal>. The optional <parameter>vars</parameter>
         and <parameter>silent</parameter> arguments act the same as
         for <function>jsonb_path_exists</function>.
        </para>
@@ -17203,9 +17216,12 @@ array w/o UK? | t
 
   <para>
    For example, suppose you have some JSON data from a GPS tracker that you
-   would like to parse, such as:
+   would like to parse, such as this JSON, set up as a
+   <link linkend="app-psql-meta-command-set"><application>psql</application>
+   <command>\set</command> variable</link> for use as <literal>:'json'</literal>
+   in the examples below:
 <programlisting>
-{
+ \set json '{
   "track": {
     "segments": [
       {
@@ -17220,7 +17236,7 @@ array w/o UK? | t
       }
     ]
   }
-}
+}'
 </programlisting>
   </para>
 
@@ -17228,9 +17244,13 @@ array w/o UK? | t
    To retrieve the available track segments, you need to use the
    <literal>.<replaceable>key</replaceable></literal> accessor
    operator to descend through surrounding JSON objects:
-<programlisting>
-$.track.segments
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
+select jsonb_path_query(:'json', '$.track.segments');
+                                                                         jsonb_path_query
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
+</screen>
   </para>
 
   <para>
@@ -17238,18 +17258,25 @@ $.track.segments
    <literal>[*]</literal> operator. For example,
    the following path will return the location coordinates for all
    the available track segments:
-<programlisting>
-$.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
   </para>
 
   <para>
    To return the coordinates of the first segment only, you can
    specify the corresponding subscript in the <literal>[]</literal>
    accessor operator. Recall that JSON array indexes are 0-relative:
-<programlisting>
-$.track.segments[0].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[0].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+</screen>
   </para>
 
   <para>
@@ -17258,9 +17285,12 @@ $.track.segments[0].location
    listed in <xref linkend="functions-sqljson-path-operators"/>.
    Each method name must be preceded by a dot. For example,
    you can get the size of an array:
-<programlisting>
-$.track.segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    More examples of using <type>jsonpath</type> operators
    and methods within path expressions appear below in
    <xref linkend="functions-sqljson-path-operators"/>.
@@ -17301,9 +17331,12 @@ $.track.segments.size()
   <para>
    For example, suppose you would like to retrieve all heart rate values higher
    than 130. You can achieve this using the following expression:
-<programlisting>
-$.track.segments[*].HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
@@ -17311,65 +17344,108 @@ $.track.segments[*].HR ? (@ &gt; 130)
    filter out irrelevant segments before returning the start times, so the
    filter expression is applied to the previous step, and the path used
    in the condition is different:
-<programlisting>
-$.track.segments[*] ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    You can use several filter expressions in sequence, if required. For
    example, the following expression selects start times of all segments that
    contain locations with relevant coordinates and high heart rate values:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    Using filter expressions at different nesting levels is also allowed.
    The following example first filters all segments by location, and then
    returns high heart rate values for these segments, if available:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
    You can also nest filter expressions within each other:
-<programlisting>
-$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    This expression returns the size of the track if it contains any
    segments with high heart rate values, or an empty sequence otherwise.
   </para>
 
-  <para>
-   <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
-   language has the following deviations from the SQL/JSON standard:
-  </para>
+  <sect3 id="deviations-from-the-standard">
+  <title>Deviations from the SQL Standard</title>
+   <para>
+    <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
+    language has the following deviations from the SQL/JSON standard.
+   </para>
 
-  <itemizedlist>
-   <listitem>
+   <sect4 id="boolean-predicate-path-expressions">
+   <title>Boolean Predicate Path Expressions</title>
     <para>
-     A path expression can be a Boolean predicate, although the SQL/JSON
-     standard allows predicates only in filters.  This is necessary for
-     implementation of the <literal>@@</literal> operator. For example,
-     the following <type>jsonpath</type> expression is valid in
-     <productname>PostgreSQL</productname>:
-<programlisting>
-$.track.segments[*].HR &lt; 70
-</programlisting>
-    </para>
-   </listitem>
+     As an extension to the SQL standard, a <productname>PostgreSQL</productname>
+     path expression can be a Boolean predicate, whereas the SQL standard allows
+     predicates only in filters. Where SQL standard path expressions return the
+     relevant contents of the queried JSON value, predicate path expressions
+     return the three-valued result of the predicate: <literal>true</literal>,
+     <literal>false</literal>, or <literal>unknown</literal>. Compare this
+     filter <type>jsonpath</type> expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments ?(@[*].HR &gt; 130)');</userinput>
+                                jsonb_path_query
+---------------------------------------------------------------------------------
+ {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}
+</screen>
+     To a predicate expression, which returns <literal>true</literal>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR &gt; 130');</userinput>
+ jsonb_path_query
+------------------
+ true
+</screen>
+     </para>
 
-   <listitem>
-    <para>
-     There are minor differences in the interpretation of regular
-     expression patterns used in <literal>like_regex</literal> filters, as
-     described in <xref linkend="jsonpath-regular-expressions"/>.
-    </para>
-   </listitem>
-  </itemizedlist>
+     <note>
+      <para>
+       Predicate-only path expressions are necessary for implementation of the
+       <literal>@@</literal> operator (and the
+       <function>jsonb_path_match</function> function), and should not be used
+       with the <literal>@?</literal> operator (or
+       <function>jsonb_path_exists</function> function).
+      </para>
+
+      <para>
+       Conversely, non-predicate <type>jsonpath</type> expressions should not be
+       used with the <literal>@@</literal> operator (or the
+       <function>jsonb_path_match</function> function).
+      </para>
+     </note>
+    </sect4>
+
+    <sect4 id="jsonpath-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>
+   </sect3>
 
    <sect3 id="strict-and-lax-modes">
    <title>Strict and Lax Modes</title>
@@ -17430,40 +17506,99 @@ $.track.segments[*].HR &lt; 70
     For example, when querying the GPS data listed above, you can
     abstract from the fact that it stores an array of segments
     when using the lax mode:
-<programlisting>
-lax $.track.segments.location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments.location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
    <para>
-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of
     the queried JSON document to return an SQL/JSON item, so using this
-    path expression will cause an error. To get the same result as in
-    the lax mode, you have to explicitly unwrap the
+    path expression will cause an error:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.location');</userinput>
+ERROR:  jsonpath member accessor can only be applied to an object
+</screen>
+    To get the same result as in the lax mode, you have to explicitly unwrap the
     <literal>segments</literal> array:
-<programlisting>
-strict $.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
-   <para>
-    The <literal>.**</literal> accessor can lead to surprising results
-    when using the lax mode. For instance, the following query selects every
-    <literal>HR</literal> value twice:
-<programlisting>
-lax $.**.HR
-</programlisting>
-    This happens because the <literal>.**</literal> accessor selects both
-    the <literal>segments</literal> array and each of its elements, while
-    the <literal>.HR</literal> accessor automatically unwraps arrays when
-    using the lax mode. To avoid surprising results, we recommend using
-    the <literal>.**</literal> accessor only in the strict mode. The
-    following query selects each <literal>HR</literal> value just once:
-<programlisting>
-strict $.**.HR
-</programlisting>
-   </para>
+   <sect4 id="lax-mode-unwrapping-results">
+   <title>Suprising Results from Unwrapping</title>
+    <para>
+     The unwrapping behavior of lax mode can lead to surprising results. For
+     instance, the following query using the <literal>.**</literal> accessor
+     selects every <literal>HR</literal> value twice:
+ <screen>
+ <prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.**.HR');</userinput>
+  jsonb_path_query
+ ------------------
+  73
+  135
+  73
+  135
+ </screen>
+     This happens because the <literal>.**</literal> accessor selects both
+     the <literal>segments</literal> array and each of its elements, while
+     the <literal>.HR</literal> accessor automatically unwraps arrays when
+     using the lax mode. To avoid surprising results, we recommend using
+     the <literal>.**</literal> accessor only in the strict mode. The
+     following query selects each <literal>HR</literal> value just once:
+ <screen>
+ <prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.**.HR');</userinput>
+  jsonb_path_query
+ ------------------
+  73
+  135
+ </screen>
+    </para>
+
+    <para>
+     The unwrapping of arrays can also lead to unexpected results. Consider this
+     example, which selects all the <literal>location</literal> arrays:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+(2 rows)
+</screen>
+     As expected it returns the full arrays. But applying a filter expression
+     causes the arrays to be unwrapped to evaluate each item, returning only the
+     items that match the expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments[*].location ?(@[*] &gt; 15)');</userinput>
+ jsonb_path_query
+------------------
+ 47.763
+ 47.706
+(2 rows)
+</screen>
+     This despite the fact that the full arrays are selected by the path
+     expression. Use strict mode to restore selecting the arrays:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location ?(@[*] &gt; 15)');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+(2 rows)
+</screen>
+    </para>
 
+   </sect4>
    </sect3>
 
    <sect3 id="functions-sqljson-path-operators">
-- 
2.42.0

#15Erik Wienhold
ewie@ewie.name
In reply to: David Wheeler (#14)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-20 15:49 +0200, David Wheeler wrote:

On Oct 19, 2023, at 23:49, Erik Wienhold <ewie@ewie.name> wrote:

I don't even know what that represents, probably not some fancy file
compression.

That's an AppleSingle file according to [1]https://www.rfc-editor.org/rfc/rfc1740.txt[2]https://web.archive.org/web/20180311140826/http://kaiser-edv.de/documents/AppleSingle_AppleDouble.pdf. It only contains the
resource fork and file name but no data fork.

Oh, weird. Trying from a webmail client instead.

Thanks.

+        Does JSON path return any item for the specified JSON value? Use only
+        SQL-standard JSON path expressions, not
+        <link linkend="boolean-predicate-path-expressions">predicate check
+        expressions.</link>

Any reason for calling it "predicate check expressions" (e.g. the link
text) and sometimes "predicate path expressions" (e.g. the linked
section title)? I think it should be named consistently to avoid
confusion and also to simplify searching.

+        Returns the result of a JSON path
+        <link linkend="boolean-predicate-path-expressions">predicate
+        check</link> for the specified JSON value. If the result is not Boolean,
+        then <literal>NULL</literal> is returned. Use only with
+        <link linkend="boolean-predicate-path-expressions">predicate check
+        expressions.</link>

Linking the same section twice in the same paragraph seems excessive.

+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
+select jsonb_path_query(:'json', '$.track.segments');

Please remove the second SELECT.

+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[0].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]

Strict mode is unnecessary to get that result and I'd omit it because
the different modes are not introduced yet at this point.

+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2

Strict mode is unnecessary here as well.

+     using the lax mode. To avoid surprising results, we recommend using
+     the <literal>.**</literal> accessor only in the strict mode. The

Please change to "in strict mode" (without "the").

[1]: https://www.rfc-editor.org/rfc/rfc1740.txt
[2]: https://web.archive.org/web/20180311140826/http://kaiser-edv.de/documents/AppleSingle_AppleDouble.pdf

--
Erik

#16David E. Wheeler
david@justatheory.com
In reply to: Erik Wienhold (#15)
1 attachment(s)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 22, 2023, at 20:36, Erik Wienhold <ewie@ewie.name> wrote:

That's an AppleSingle file according to [1][2]. It only contains the
resource fork and file name but no data fork.

Ah, I had “Send large attachments with Mail Drop” enabled. To me 20K is not big but whatever. Let’s see if turning it off fixes the issue.

Any reason for calling it "predicate check expressions" (e.g. the link
text) and sometimes "predicate path expressions" (e.g. the linked
section title)? I think it should be named consistently to avoid
confusion and also to simplify searching.

I think "predicate path expressions” is more descriptive, but "predicate check expressions” is what was in the docs before, so let’s stick with that.

Linking the same section twice in the same paragraph seems excessive.

Fair. Will link the second one.

+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
+select jsonb_path_query(:'json', '$.track.segments');

Please remove the second SELECT.

Done.

+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[0].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]

Strict mode is unnecessary to get that result and I'd omit it because
the different modes are not introduced yet at this point.

Yep, pasto.

Strict mode is unnecessary here as well.

Fixed.

+     using the lax mode. To avoid surprising results, we recommend using
+     the <literal>.**</literal> accessor only in the strict mode. The

Please change to "in strict mode" (without "the").

Hrm, I prefer it without the article, too, but it is consistently used that way elsewhere, like here:

https://github.com/postgres/postgres/blob/5b36e8f/doc/src/sgml/func.sgml#L17401

I’d be happy to change them all, but was keeping it consistent for now.

Updated patch attached, thank you!

David

Attachments:

v6-0001-Improve-boolean-predicate-JSON-Path-docs.patchapplication/octet-stream; name=v6-0001-Improve-boolean-predicate-JSON-Path-docs.patch; x-unix-mode=0644Download
From a7048d66859d70095f1cd32bb7016857458d78cf Mon Sep 17 00:00:00 2001
From: "David E. Wheeler" <david@justatheory.com>
Date: Sat, 14 Oct 2023 16:42:58 -0400
Subject: [PATCH v6] Improve boolean predicate JSON Path docs

Following up from a suggestion from Tom Lane[1] to improve the
documentation of boolean predicate JSON path expressions, please find
enclosed a draft patch to do so. It does three things:

1. Converts all of the example path queries to use `jsonb_path_query()`
   and show the results, to make it clearer what the behaviors are.

2. Replaces the list of deviations from the standards with a new
   subsection, with each deviation in its own sub-subsection. The regex
   section is unchanged, but I've greatly expanded the boolean
   expression JSON path section with examples comparing standard filter
   expressions and nonstandard boolean predicates. I've also added an
   exhortation not use boolean expressions with @? or standard path
   expressions with @@.

3. While converting the modes section to use `jsonb_path_query()` and
   show the results, I also added an example of strict mode returning an
   error and a section demonstrating the difference in query results
   when using lax vs. strict mode.

4. Removes the notes about "only the first value" in the `@@` and
   `jsonb_path_match()` docs, and noted that they support only predicate
   JSON path expressions. Also noted that `json_path_exists()` supports
   only SQL standard path expressions --- in other words, *not*
   predicate path queries.
---
 doc/src/sgml/func.sgml | 320 +++++++++++++++++++++++++++++------------
 1 file changed, 226 insertions(+), 94 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7c3e940afe..30af2b39ec 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15838,7 +15838,10 @@ table2-mapping
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Does JSON path return any item for the specified JSON value?
+        Does JSON path return any item for the specified JSON value? Use only
+        SQL-standard JSON path expressions, not
+        <link linkend="boolean-predicate-check-expressions">predicate check
+        expressions.</link>
        </para>
        <para>
         <literal>'{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)'</literal>
@@ -15852,10 +15855,11 @@ table2-mapping
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Returns the result of a JSON path predicate check for the
-        specified JSON value.  Only the first item of the result is taken into
-        account.  If the result is not Boolean, then <literal>NULL</literal>
-        is returned.
+        Returns the result of a JSON path predicate check for the specified JSON
+        value. If the result is not Boolean, then <literal>NULL</literal> is
+        returned. Use only with
+        <link linkend="boolean-predicate-check-expressions">predicate check
+        expressions.</link>
        </para>
        <para>
         <literal>'{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2'</literal>
@@ -16926,7 +16930,9 @@ array w/o UK? | t
        </para>
        <para>
         Checks whether the JSON path returns any item for the specified JSON
-        value.
+        value. Use only SQL-standard JSON path expressions, not
+        <link linkend="boolean-predicate-check-expressions">predicate check
+        expressions.</link>
         If the <parameter>vars</parameter> argument is specified, it must
         be a JSON object, and its fields provide named values to be
         substituted into the <type>jsonpath</type> expression.
@@ -16949,12 +16955,13 @@ array w/o UK? | t
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Returns the result of a JSON path predicate check for the specified
-        JSON value.  Only the first item of the result is taken into account.
-        If the result is not Boolean, then <literal>NULL</literal> is returned.
-        The optional <parameter>vars</parameter>
-        and <parameter>silent</parameter> arguments act the same as
-        for <function>jsonb_path_exists</function>.
+        Returns the result of a JSON path predicate check for the specified JSON
+        value. If the result is not Boolean, then <literal>NULL</literal> is
+        returned. Use only with
+        <link linkend="boolean-predicate-check-expressions">predicate check
+        expressions.</link> The optional <parameter>vars</parameter> and
+        <parameter>silent</parameter> arguments act the same as for
+        <function>jsonb_path_exists</function>.
        </para>
        <para>
         <literal>jsonb_path_match('{"a":[1,2,3,4,5]}', 'exists($.a[*] ? (@ >= $min &amp;&amp; @ &lt;= $max))', '{"min":2, "max":4}')</literal>
@@ -16972,8 +16979,12 @@ array w/o UK? | t
        </para>
        <para>
         Returns all JSON items returned by the JSON path for the specified
-        JSON value.
-        The optional <parameter>vars</parameter>
+        JSON value. For SQL-standard JSON path expressions it returns the JSON
+        values selected from <parameter>target</parameter>. For
+        <link linkend="boolean-predicate-check-expressions">predicate check</link>
+        Path expressions it returns the result of the predicate check:
+        <literal>true</literal>, <literal>false</literal>, or
+        <literal>null</literal>. The optional <parameter>vars</parameter>
         and <parameter>silent</parameter> arguments act the same as
         for <function>jsonb_path_exists</function>.
        </para>
@@ -17203,9 +17214,12 @@ array w/o UK? | t
 
   <para>
    For example, suppose you have some JSON data from a GPS tracker that you
-   would like to parse, such as:
+   would like to parse, such as this JSON, set up as a
+   <link linkend="app-psql-meta-command-set"><application>psql</application>
+   <command>\set</command> variable</link> for use as <literal>:'json'</literal>
+   in the examples below:
 <programlisting>
-{
+ \set json '{
   "track": {
     "segments": [
       {
@@ -17220,7 +17234,7 @@ array w/o UK? | t
       }
     ]
   }
-}
+}'
 </programlisting>
   </para>
 
@@ -17228,9 +17242,12 @@ array w/o UK? | t
    To retrieve the available track segments, you need to use the
    <literal>.<replaceable>key</replaceable></literal> accessor
    operator to descend through surrounding JSON objects:
-<programlisting>
-$.track.segments
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
+                                                                         jsonb_path_query
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
+</screen>
   </para>
 
   <para>
@@ -17238,18 +17255,25 @@ $.track.segments
    <literal>[*]</literal> operator. For example,
    the following path will return the location coordinates for all
    the available track segments:
-<programlisting>
-$.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
   </para>
 
   <para>
    To return the coordinates of the first segment only, you can
    specify the corresponding subscript in the <literal>[]</literal>
    accessor operator. Recall that JSON array indexes are 0-relative:
-<programlisting>
-$.track.segments[0].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[0].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+</screen>
   </para>
 
   <para>
@@ -17258,9 +17282,12 @@ $.track.segments[0].location
    listed in <xref linkend="functions-sqljson-path-operators"/>.
    Each method name must be preceded by a dot. For example,
    you can get the size of an array:
-<programlisting>
-$.track.segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    More examples of using <type>jsonpath</type> operators
    and methods within path expressions appear below in
    <xref linkend="functions-sqljson-path-operators"/>.
@@ -17301,9 +17328,12 @@ $.track.segments.size()
   <para>
    For example, suppose you would like to retrieve all heart rate values higher
    than 130. You can achieve this using the following expression:
-<programlisting>
-$.track.segments[*].HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
@@ -17311,65 +17341,108 @@ $.track.segments[*].HR ? (@ &gt; 130)
    filter out irrelevant segments before returning the start times, so the
    filter expression is applied to the previous step, and the path used
    in the condition is different:
-<programlisting>
-$.track.segments[*] ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    You can use several filter expressions in sequence, if required. For
    example, the following expression selects start times of all segments that
    contain locations with relevant coordinates and high heart rate values:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    Using filter expressions at different nesting levels is also allowed.
    The following example first filters all segments by location, and then
    returns high heart rate values for these segments, if available:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
    You can also nest filter expressions within each other:
-<programlisting>
-$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    This expression returns the size of the track if it contains any
    segments with high heart rate values, or an empty sequence otherwise.
   </para>
 
-  <para>
-   <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
-   language has the following deviations from the SQL/JSON standard:
-  </para>
+  <sect3 id="deviations-from-the-standard">
+  <title>Deviations from the SQL Standard</title>
+   <para>
+    <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
+    language has the following deviations from the SQL/JSON standard.
+   </para>
 
-  <itemizedlist>
-   <listitem>
+   <sect4 id="boolean-predicate-check-expressions">
+   <title>Boolean Predicate Check Expressions</title>
     <para>
-     A path expression can be a Boolean predicate, although the SQL/JSON
-     standard allows predicates only in filters.  This is necessary for
-     implementation of the <literal>@@</literal> operator. For example,
-     the following <type>jsonpath</type> expression is valid in
-     <productname>PostgreSQL</productname>:
-<programlisting>
-$.track.segments[*].HR &lt; 70
-</programlisting>
-    </para>
-   </listitem>
+     As an extension to the SQL standard, a <productname>PostgreSQL</productname>
+     path expression can be a Boolean predicate, whereas the SQL standard allows
+     predicates only in filters. Where SQL standard path expressions return the
+     relevant contents of the queried JSON value, predicate check expressions
+     return the three-valued result of the predicate: <literal>true</literal>,
+     <literal>false</literal>, or <literal>unknown</literal>. Compare this
+     filter <type>jsonpath</type> expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments ?(@[*].HR &gt; 130)');</userinput>
+                                jsonb_path_query
+---------------------------------------------------------------------------------
+ {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}
+</screen>
+     To a predicate expression, which returns <literal>true</literal>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR &gt; 130');</userinput>
+ jsonb_path_query
+------------------
+ true
+</screen>
+     </para>
 
-   <listitem>
-    <para>
-     There are minor differences in the interpretation of regular
-     expression patterns used in <literal>like_regex</literal> filters, as
-     described in <xref linkend="jsonpath-regular-expressions"/>.
-    </para>
-   </listitem>
-  </itemizedlist>
+     <note>
+      <para>
+       Predicate-only path expressions are necessary for implementation of the
+       <literal>@@</literal> operator (and the
+       <function>jsonb_path_match</function> function), and should not be used
+       with the <literal>@?</literal> operator (or
+       <function>jsonb_path_exists</function> function).
+      </para>
+
+      <para>
+       Conversely, non-predicate <type>jsonpath</type> expressions should not be
+       used with the <literal>@@</literal> operator (or the
+       <function>jsonb_path_match</function> function).
+      </para>
+     </note>
+    </sect4>
+
+    <sect4 id="jsonpath-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>
+   </sect3>
 
    <sect3 id="strict-and-lax-modes">
    <title>Strict and Lax Modes</title>
@@ -17430,40 +17503,99 @@ $.track.segments[*].HR &lt; 70
     For example, when querying the GPS data listed above, you can
     abstract from the fact that it stores an array of segments
     when using the lax mode:
-<programlisting>
-lax $.track.segments.location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments.location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
    <para>
-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of
     the queried JSON document to return an SQL/JSON item, so using this
-    path expression will cause an error. To get the same result as in
-    the lax mode, you have to explicitly unwrap the
+    path expression will cause an error:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.location');</userinput>
+ERROR:  jsonpath member accessor can only be applied to an object
+</screen>
+    To get the same result as in the lax mode, you have to explicitly unwrap the
     <literal>segments</literal> array:
-<programlisting>
-strict $.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
-   <para>
-    The <literal>.**</literal> accessor can lead to surprising results
-    when using the lax mode. For instance, the following query selects every
-    <literal>HR</literal> value twice:
-<programlisting>
-lax $.**.HR
-</programlisting>
-    This happens because the <literal>.**</literal> accessor selects both
-    the <literal>segments</literal> array and each of its elements, while
-    the <literal>.HR</literal> accessor automatically unwraps arrays when
-    using the lax mode. To avoid surprising results, we recommend using
-    the <literal>.**</literal> accessor only in the strict mode. The
-    following query selects each <literal>HR</literal> value just once:
-<programlisting>
-strict $.**.HR
-</programlisting>
-   </para>
+   <sect4 id="lax-mode-unwrapping-results">
+   <title>Suprising Results from Unwrapping</title>
+    <para>
+     The unwrapping behavior of lax mode can lead to surprising results. For
+     instance, the following query using the <literal>.**</literal> accessor
+     selects every <literal>HR</literal> value twice:
+ <screen>
+ <prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.**.HR');</userinput>
+  jsonb_path_query
+ ------------------
+  73
+  135
+  73
+  135
+ </screen>
+     This happens because the <literal>.**</literal> accessor selects both
+     the <literal>segments</literal> array and each of its elements, while
+     the <literal>.HR</literal> accessor automatically unwraps arrays when
+     using the lax mode. To avoid surprising results, we recommend using
+     the <literal>.**</literal> accessor only in the strict mode. The
+     following query selects each <literal>HR</literal> value just once:
+ <screen>
+ <prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.**.HR');</userinput>
+  jsonb_path_query
+ ------------------
+  73
+  135
+ </screen>
+    </para>
+
+    <para>
+     The unwrapping of arrays can also lead to unexpected results. Consider this
+     example, which selects all the <literal>location</literal> arrays:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+(2 rows)
+</screen>
+     As expected it returns the full arrays. But applying a filter expression
+     causes the arrays to be unwrapped to evaluate each item, returning only the
+     items that match the expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments[*].location ?(@[*] &gt; 15)');</userinput>
+ jsonb_path_query
+------------------
+ 47.763
+ 47.706
+(2 rows)
+</screen>
+     This despite the fact that the full arrays are selected by the path
+     expression. Use strict mode to restore selecting the arrays:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location ?(@[*] &gt; 15)');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+(2 rows)
+</screen>
+    </para>
 
+   </sect4>
    </sect3>
 
    <sect3 id="functions-sqljson-path-operators">
-- 
2.42.0

#17Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#16)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-24 00:58 +0200, David E. Wheeler wrote:

On Oct 22, 2023, at 20:36, Erik Wienhold <ewie@ewie.name> wrote:

That's an AppleSingle file according to [1][2]. It only contains the
resource fork and file name but no data fork.

Ah, I had “Send large attachments with Mail Drop” enabled. To me 20K
is not big but whatever. Let’s see if turning it off fixes the issue.

I suspected it had something to do with iCloud. Glad you solved it!

Please change to "in strict mode" (without "the").

Hrm, I prefer it without the article, too, but it is consistently used
that way elsewhere, like here:

https://github.com/postgres/postgres/blob/5b36e8f/doc/src/sgml/func.sgml#L17401

I’d be happy to change them all, but was keeping it consistent for now.

Right. I haven't really noticed that the article case is more common.
I thought that you may have missed that one because I saw this change
that removes the article:

-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of

Updated patch attached, thank you!

LGTM. Would you create a commitfest entry? I'll set the status to RfC.

--
Erik

#18David E. Wheeler
david@justatheory.com
In reply to: Erik Wienhold (#17)
1 attachment(s)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 23, 2023, at 20:20, Erik Wienhold <ewie@ewie.name> wrote:

I thought that you may have missed that one because I saw this change
that removes the article:

-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of

Oh, didn’t realize. Fixed.

LGTM. Would you create a commitfest entry? I'll set the status to RfC.

Done.

https://commitfest.postgresql.org/45/4624/

Best,

David

Attachments:

v7-0001-Improve-boolean-predicate-JSON-Path-docs.patchapplication/octet-stream; name=v7-0001-Improve-boolean-predicate-JSON-Path-docs.patch; x-unix-mode=0644Download
From 039fd010d631660b1933bcc049ed9d757d297589 Mon Sep 17 00:00:00 2001
From: "David E. Wheeler" <david@justatheory.com>
Date: Sat, 14 Oct 2023 16:42:58 -0400
Subject: [PATCH v7] Improve boolean predicate JSON Path docs

Following up from a suggestion from Tom Lane[1] to improve the
documentation of boolean predicate JSON path expressions, please find
enclosed a draft patch to do so. It does three things:

1. Converts all of the example path queries to use `jsonb_path_query()`
   and show the results, to make it clearer what the behaviors are.

2. Replaces the list of deviations from the standards with a new
   subsection, with each deviation in its own sub-subsection. The regex
   section is unchanged, but I've greatly expanded the boolean
   expression JSON path section with examples comparing standard filter
   expressions and nonstandard boolean predicates. I've also added an
   exhortation not use boolean expressions with @? or standard path
   expressions with @@.

3. While converting the modes section to use `jsonb_path_query()` and
   show the results, I also added an example of strict mode returning an
   error and a section demonstrating the difference in query results
   when using lax vs. strict mode.

4. Removes the notes about "only the first value" in the `@@` and
   `jsonb_path_match()` docs, and noted that they support only predicate
   JSON path expressions. Also noted that `json_path_exists()` supports
   only SQL standard path expressions --- in other words, *not*
   predicate path queries.
---
 doc/src/sgml/func.sgml | 322 +++++++++++++++++++++++++++++------------
 1 file changed, 227 insertions(+), 95 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7c3e940afe..78d4e89754 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15838,7 +15838,10 @@ table2-mapping
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Does JSON path return any item for the specified JSON value?
+        Does JSON path return any item for the specified JSON value? Use only
+        SQL-standard JSON path expressions, not
+        <link linkend="boolean-predicate-check-expressions">predicate check
+        expressions.</link>
        </para>
        <para>
         <literal>'{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)'</literal>
@@ -15852,10 +15855,11 @@ table2-mapping
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Returns the result of a JSON path predicate check for the
-        specified JSON value.  Only the first item of the result is taken into
-        account.  If the result is not Boolean, then <literal>NULL</literal>
-        is returned.
+        Returns the result of a JSON path predicate check for the specified JSON
+        value. If the result is not Boolean, then <literal>NULL</literal> is
+        returned. Use only with
+        <link linkend="boolean-predicate-check-expressions">predicate check
+        expressions.</link>
        </para>
        <para>
         <literal>'{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2'</literal>
@@ -16926,7 +16930,9 @@ array w/o UK? | t
        </para>
        <para>
         Checks whether the JSON path returns any item for the specified JSON
-        value.
+        value. Use only SQL-standard JSON path expressions, not
+        <link linkend="boolean-predicate-check-expressions">predicate check
+        expressions.</link>
         If the <parameter>vars</parameter> argument is specified, it must
         be a JSON object, and its fields provide named values to be
         substituted into the <type>jsonpath</type> expression.
@@ -16949,12 +16955,13 @@ array w/o UK? | t
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Returns the result of a JSON path predicate check for the specified
-        JSON value.  Only the first item of the result is taken into account.
-        If the result is not Boolean, then <literal>NULL</literal> is returned.
-        The optional <parameter>vars</parameter>
-        and <parameter>silent</parameter> arguments act the same as
-        for <function>jsonb_path_exists</function>.
+        Returns the result of a JSON path predicate check for the specified JSON
+        value. If the result is not Boolean, then <literal>NULL</literal> is
+        returned. Use only with
+        <link linkend="boolean-predicate-check-expressions">predicate check
+        expressions.</link> The optional <parameter>vars</parameter> and
+        <parameter>silent</parameter> arguments act the same as for
+        <function>jsonb_path_exists</function>.
        </para>
        <para>
         <literal>jsonb_path_match('{"a":[1,2,3,4,5]}', 'exists($.a[*] ? (@ >= $min &amp;&amp; @ &lt;= $max))', '{"min":2, "max":4}')</literal>
@@ -16972,8 +16979,12 @@ array w/o UK? | t
        </para>
        <para>
         Returns all JSON items returned by the JSON path for the specified
-        JSON value.
-        The optional <parameter>vars</parameter>
+        JSON value. For SQL-standard JSON path expressions it returns the JSON
+        values selected from <parameter>target</parameter>. For
+        <link linkend="boolean-predicate-check-expressions">predicate check</link>
+        Path expressions it returns the result of the predicate check:
+        <literal>true</literal>, <literal>false</literal>, or
+        <literal>null</literal>. The optional <parameter>vars</parameter>
         and <parameter>silent</parameter> arguments act the same as
         for <function>jsonb_path_exists</function>.
        </para>
@@ -17203,9 +17214,12 @@ array w/o UK? | t
 
   <para>
    For example, suppose you have some JSON data from a GPS tracker that you
-   would like to parse, such as:
+   would like to parse, such as this JSON, set up as a
+   <link linkend="app-psql-meta-command-set"><application>psql</application>
+   <command>\set</command> variable</link> for use as <literal>:'json'</literal>
+   in the examples below:
 <programlisting>
-{
+ \set json '{
   "track": {
     "segments": [
       {
@@ -17220,7 +17234,7 @@ array w/o UK? | t
       }
     ]
   }
-}
+}'
 </programlisting>
   </para>
 
@@ -17228,9 +17242,12 @@ array w/o UK? | t
    To retrieve the available track segments, you need to use the
    <literal>.<replaceable>key</replaceable></literal> accessor
    operator to descend through surrounding JSON objects:
-<programlisting>
-$.track.segments
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
+                                                                         jsonb_path_query
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
+</screen>
   </para>
 
   <para>
@@ -17238,18 +17255,25 @@ $.track.segments
    <literal>[*]</literal> operator. For example,
    the following path will return the location coordinates for all
    the available track segments:
-<programlisting>
-$.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
   </para>
 
   <para>
    To return the coordinates of the first segment only, you can
    specify the corresponding subscript in the <literal>[]</literal>
    accessor operator. Recall that JSON array indexes are 0-relative:
-<programlisting>
-$.track.segments[0].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[0].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+</screen>
   </para>
 
   <para>
@@ -17258,9 +17282,12 @@ $.track.segments[0].location
    listed in <xref linkend="functions-sqljson-path-operators"/>.
    Each method name must be preceded by a dot. For example,
    you can get the size of an array:
-<programlisting>
-$.track.segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    More examples of using <type>jsonpath</type> operators
    and methods within path expressions appear below in
    <xref linkend="functions-sqljson-path-operators"/>.
@@ -17301,9 +17328,12 @@ $.track.segments.size()
   <para>
    For example, suppose you would like to retrieve all heart rate values higher
    than 130. You can achieve this using the following expression:
-<programlisting>
-$.track.segments[*].HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
@@ -17311,65 +17341,108 @@ $.track.segments[*].HR ? (@ &gt; 130)
    filter out irrelevant segments before returning the start times, so the
    filter expression is applied to the previous step, and the path used
    in the condition is different:
-<programlisting>
-$.track.segments[*] ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    You can use several filter expressions in sequence, if required. For
    example, the following expression selects start times of all segments that
    contain locations with relevant coordinates and high heart rate values:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    Using filter expressions at different nesting levels is also allowed.
    The following example first filters all segments by location, and then
    returns high heart rate values for these segments, if available:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
    You can also nest filter expressions within each other:
-<programlisting>
-$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    This expression returns the size of the track if it contains any
    segments with high heart rate values, or an empty sequence otherwise.
   </para>
 
-  <para>
-   <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
-   language has the following deviations from the SQL/JSON standard:
-  </para>
+  <sect3 id="deviations-from-the-standard">
+  <title>Deviations from the SQL Standard</title>
+   <para>
+    <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
+    language has the following deviations from the SQL/JSON standard.
+   </para>
 
-  <itemizedlist>
-   <listitem>
+   <sect4 id="boolean-predicate-check-expressions">
+   <title>Boolean Predicate Check Expressions</title>
     <para>
-     A path expression can be a Boolean predicate, although the SQL/JSON
-     standard allows predicates only in filters.  This is necessary for
-     implementation of the <literal>@@</literal> operator. For example,
-     the following <type>jsonpath</type> expression is valid in
-     <productname>PostgreSQL</productname>:
-<programlisting>
-$.track.segments[*].HR &lt; 70
-</programlisting>
-    </para>
-   </listitem>
+     As an extension to the SQL standard, a <productname>PostgreSQL</productname>
+     path expression can be a Boolean predicate, whereas the SQL standard allows
+     predicates only in filters. Where SQL standard path expressions return the
+     relevant contents of the queried JSON value, predicate check expressions
+     return the three-valued result of the predicate: <literal>true</literal>,
+     <literal>false</literal>, or <literal>unknown</literal>. Compare this
+     filter <type>jsonpath</type> expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments ?(@[*].HR &gt; 130)');</userinput>
+                                jsonb_path_query
+---------------------------------------------------------------------------------
+ {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}
+</screen>
+     To a predicate expression, which returns <literal>true</literal>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR &gt; 130');</userinput>
+ jsonb_path_query
+------------------
+ true
+</screen>
+     </para>
 
-   <listitem>
-    <para>
-     There are minor differences in the interpretation of regular
-     expression patterns used in <literal>like_regex</literal> filters, as
-     described in <xref linkend="jsonpath-regular-expressions"/>.
-    </para>
-   </listitem>
-  </itemizedlist>
+     <note>
+      <para>
+       Predicate-only path expressions are necessary for implementation of the
+       <literal>@@</literal> operator (and the
+       <function>jsonb_path_match</function> function), and should not be used
+       with the <literal>@?</literal> operator (or
+       <function>jsonb_path_exists</function> function).
+      </para>
+
+      <para>
+       Conversely, non-predicate <type>jsonpath</type> expressions should not be
+       used with the <literal>@@</literal> operator (or the
+       <function>jsonb_path_match</function> function).
+      </para>
+     </note>
+    </sect4>
+
+    <sect4 id="jsonpath-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>
+   </sect3>
 
    <sect3 id="strict-and-lax-modes">
    <title>Strict and Lax Modes</title>
@@ -17430,40 +17503,99 @@ $.track.segments[*].HR &lt; 70
     For example, when querying the GPS data listed above, you can
     abstract from the fact that it stores an array of segments
     when using the lax mode:
-<programlisting>
-lax $.track.segments.location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments.location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
    <para>
     In the strict mode, the specified path must exactly match the structure of
-    the queried JSON document to return an SQL/JSON item, so using this
-    path expression will cause an error. To get the same result as in
-    the lax mode, you have to explicitly unwrap the
+    the queried JSON document to return an SQL/JSON item, so using this path
+    expression will cause an error:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.location');</userinput>
+ERROR:  jsonpath member accessor can only be applied to an object
+</screen>
+    To get the same result as in the lax mode, you have to explicitly unwrap the
     <literal>segments</literal> array:
-<programlisting>
-strict $.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
-   <para>
-    The <literal>.**</literal> accessor can lead to surprising results
-    when using the lax mode. For instance, the following query selects every
-    <literal>HR</literal> value twice:
-<programlisting>
-lax $.**.HR
-</programlisting>
-    This happens because the <literal>.**</literal> accessor selects both
-    the <literal>segments</literal> array and each of its elements, while
-    the <literal>.HR</literal> accessor automatically unwraps arrays when
-    using the lax mode. To avoid surprising results, we recommend using
-    the <literal>.**</literal> accessor only in the strict mode. The
-    following query selects each <literal>HR</literal> value just once:
-<programlisting>
-strict $.**.HR
-</programlisting>
-   </para>
+   <sect4 id="lax-mode-unwrapping-results">
+   <title>Suprising Results from Unwrapping</title>
+    <para>
+     The unwrapping behavior of lax mode can lead to surprising results. For
+     instance, the following query using the <literal>.**</literal> accessor
+     selects every <literal>HR</literal> value twice:
+ <screen>
+ <prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.**.HR');</userinput>
+  jsonb_path_query
+ ------------------
+  73
+  135
+  73
+  135
+ </screen>
+     This happens because the <literal>.**</literal> accessor selects both
+     the <literal>segments</literal> array and each of its elements, while
+     the <literal>.HR</literal> accessor automatically unwraps arrays when
+     using the lax mode. To avoid surprising results, we recommend using
+     the <literal>.**</literal> accessor only in the strict mode. The
+     following query selects each <literal>HR</literal> value just once:
+ <screen>
+ <prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.**.HR');</userinput>
+  jsonb_path_query
+ ------------------
+  73
+  135
+ </screen>
+    </para>
+
+    <para>
+     The unwrapping of arrays can also lead to unexpected results. Consider this
+     example, which selects all the <literal>location</literal> arrays:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+(2 rows)
+</screen>
+     As expected it returns the full arrays. But applying a filter expression
+     causes the arrays to be unwrapped to evaluate each item, returning only the
+     items that match the expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments[*].location ?(@[*] &gt; 15)');</userinput>
+ jsonb_path_query
+------------------
+ 47.763
+ 47.706
+(2 rows)
+</screen>
+     This despite the fact that the full arrays are selected by the path
+     expression. Use strict mode to restore selecting the arrays:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location ?(@[*] &gt; 15)');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+(2 rows)
+</screen>
+    </para>
 
+   </sect4>
    </sect3>
 
    <sect3 id="functions-sqljson-path-operators">
@@ -18039,7 +18171,7 @@ strict $.**.HR
         Tests whether a path expression matches at least one SQL/JSON item.
         Returns <literal>unknown</literal> if the path expression would result
         in an error; the second example uses this to avoid a no-such-key error
-        in strict mode.
+        in the strict mode.
        </para>
        <para>
         <literal>jsonb_path_query('{"x": [1, 2], "y": [2, 4]}', 'strict $.* ? (exists (@ ? (@[*] &gt; 2)))')</literal>
-- 
2.42.0

#19shihao zhong
zhong950419@gmail.com
In reply to: David E. Wheeler (#18)
Re: Patch: Improve Boolean Predicate JSON Path Docs

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: tested, passed

I took a look for this commit, it looks correct to me

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#18)
Re: Patch: Improve Boolean Predicate JSON Path Docs

"David E. Wheeler" <david@justatheory.com> writes:

[ v7-0001-Improve-boolean-predicate-JSON-Path-docs.patch ]

I started to review this, and got bogged down at

@@ -17203,9 +17214,12 @@ array w/o UK? | t

   <para>
    For example, suppose you have some JSON data from a GPS tracker that you
-   would like to parse, such as:
+   would like to parse, such as this JSON, set up as a
+   <link linkend="app-psql-meta-command-set"><application>psql</application>
+   <command>\set</command> variable</link> for use as <literal>:'json'</literal>
+   in the examples below:
 <programlisting>
-{
+ \set json '{
   "track": {
     "segments": [
       {

I find the textual change rather unwieldy, but the bigger problem is
that this example doesn't actually work. If you try to copy-and-paste
this into psql, you get "unterminated quoted string", because psql
metacommands can't span line boundaries.

Perhaps we could leave the existing display alone, and then add

To follow the examples below, paste this into psql:
<programlisting>
\set json '{ "track": { "segments": [ { "location": [ 47.763, 13.4034 ], "start time": "2018-10-14 10:05:14", "HR": 73 }, { "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 10:39:21", "HR": 135 } ] }}'
</programlisting>
This will allow <literal>:'json'</literal> to be expanded into the
above JSON value, plus suitable quoting.

However, I'm not sure that's a great solution, because it's going to
line-wrap on most displays, making copy-and-paste a bit iffy.

I experimented with

SELECT '
... multiline json value ...
' AS json
\gexec

but that didn't seem to work either. Anybody have a better idea?

regards, tom lane

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#20)
Re: Patch: Improve Boolean Predicate JSON Path Docs

I wrote:

I experimented with

SELECT '
... multiline json value ...
' AS json
\gexec

but that didn't seem to work either. Anybody have a better idea?

Oh, never mind, \gset is what I was reaching for. We can make
it work with that.

regards, tom lane

#22Erik Wienhold
ewie@ewie.name
In reply to: Tom Lane (#20)
Re: Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2024-01-19 22:15 +0100, Tom Lane wrote:

"David E. Wheeler" <david@justatheory.com> writes:

[ v7-0001-Improve-boolean-predicate-JSON-Path-docs.patch ]

+ \set json '{
"track": {
"segments": [
{

I find the textual change rather unwieldy, but the bigger problem is
that this example doesn't actually work. If you try to copy-and-paste
this into psql, you get "unterminated quoted string", because psql
metacommands can't span line boundaries.

Interesting... copy-pasting the entire \set command works for me with
psql 16.1 in gnome-terminal and tmux. Typing it out manually gives me
the "unterminated quoted string" error. Maybe has to do with my stty
settings.

I experimented with

SELECT '
... multiline json value ...
' AS json
\gexec

but that didn't seem to work either. Anybody have a better idea?

Fine with me (the \gset variant).

--
Erik

#23David E. Wheeler
david@justatheory.com
In reply to: Erik Wienhold (#22)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Jan 19, 2024, at 21:46, Erik Wienhold <ewie@ewie.name> wrote:

Interesting... copy-pasting the entire \set command works for me with
psql 16.1 in gnome-terminal and tmux. Typing it out manually gives me
the "unterminated quoted string" error. Maybe has to do with my stty
settings.

Yes, same on macOS Terminal.app and 16.1 compiled with readline. I didn’t realize that \set didn’t support newlines, because it works fine when you paste something with newlines. Curious.

I experimented with

SELECT '
... multiline json value ...
' AS json
\gexec

but that didn't seem to work either. Anybody have a better idea?

Fine with me (the \gset variant).

Much cleaner TBH.

david=# select '{
"track": {
"segments": [
{
"location": [ 47.763, 13.4034 ],
"start time": "2018-10-14 10:05:14",
"HR": 73
},
{
"location": [ 47.706, 13.2635 ],
"start time": "2018-10-14 10:39:21",
"HR": 135
}
]
}
}'::jsonb as json;
json --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
{"track": {"segments": [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]}}
(1 row)

david=# \gset

david=# select :'json'::jsonb;
jsonb --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
{"track": {"segments": [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]}}
(1 row)

So great!

While you’re in there, Tom, would it make sense to fold in something like [this patch][1]/messages/by-id/0ECE6B9C-CDDE-4B65-BE5A-49D7372046AF@justatheory.com I posted last month to clarify which JSONPath comparison operators can take advantage of a index?

--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -513,7 +513,7 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ '$.tags[*] == "qui"';
</programlisting>
    For these operators, a GIN index extracts clauses of the form
    <literal><replaceable>accessors_chain</replaceable>
-    = <replaceable>constant</replaceable></literal> out of
+    == <replaceable>constant</replaceable></literal> out of
    the <type>jsonpath</type> pattern, and does the index search based on
    the keys and values mentioned in these clauses.  The accessors chain
    may include <literal>.<replaceable>key</replaceable></literal>,
@@ -522,6 +522,9 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ '$.tags[*] == "qui"';
    The <literal>jsonb_ops</literal> operator class also
    supports <literal>.*</literal> and <literal>.**</literal> accessors,
    but the <literal>jsonb_path_ops</literal> operator class does not.
+    Only the <literal>==</literal> and <literal>!=</literal> <link
+    linkend="functions-sqljson-path-operators">SQL/JSON Path Operators</link>
+    can use the index.
  </para>

<para>

Best,

David

[1]: /messages/by-id/0ECE6B9C-CDDE-4B65-BE5A-49D7372046AF@justatheory.com

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#23)
Re: Patch: Improve Boolean Predicate JSON Path Docs

"David E. Wheeler" <david@justatheory.com> writes:

While you’re in there, Tom, would it make sense to fold in something like [this patch][1] I posted last month to clarify which JSONPath comparison operators can take advantage of a index?

--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -513,7 +513,7 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ '$.tags[*] == "qui"';
</programlisting>
For these operators, a GIN index extracts clauses of the form
<literal><replaceable>accessors_chain</replaceable>
-    = <replaceable>constant</replaceable></literal> out of
+    == <replaceable>constant</replaceable></literal> out of
the <type>jsonpath</type> pattern, and does the index search based on
the keys and values mentioned in these clauses.  The accessors chain
may include <literal>.<replaceable>key</replaceable></literal>,

Right, clearly a typo.

@@ -522,6 +522,9 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ '$.tags[*] == "qui"';
The <literal>jsonb_ops</literal> operator class also
supports <literal>.*</literal> and <literal>.**</literal> accessors,
but the <literal>jsonb_path_ops</literal> operator class does not.
+    Only the <literal>==</literal> and <literal>!=</literal> <link
+    linkend="functions-sqljson-path-operators">SQL/JSON Path Operators</link>
+    can use the index.
</para>

You sure about that? It would surprise me if we could effectively use
a not-equal condition with an index. If it is only == that works,
then the preceding statement seems sufficient.

regards, tom lane

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#24)
Re: Patch: Improve Boolean Predicate JSON Path Docs

So, overall reaction to this patch: I like the approach of defining
"predicate check expressions" as being a different thing from standard
jsonpath expressions. However, I'm not so thrilled with just saying
"don't use" one type or the other with different jsonpath functions.
According to my tests, some of these functions seem to give sensible
results anyway with the path type you say not to use, while some
give less-sensible results, and others give errors. We ought to try
to document that, and maybe even clean up the less sane behaviors.
(That is, I don't feel that a docs-only patch is necessarily the
thing to do here.)

As an example, @? seems to behave sanely with a standard jsonpath:

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ < 5)' ;
?column?
----------
t
(1 row)
regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 5)' ;
?column?
----------
f
(1 row)

It will take a predicate, but seems to always return true:

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] < 5' ;
?column?
----------
t
(1 row)

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] > 5' ;
?column?
----------
t
(1 row)

Surely we're not helping anybody by leaving that behavior in place.
Making it do something useful, throwing an error, or returning NULL
all seem superior to this. I observe that @@ returns NULL for the
path type it doesn't like, so maybe that's what to do here.

(Unsurprisingly, jsonb_path_exists acts similarly.)

BTW, jsonb_path_query_array and jsonb_path_query_first seem to
take both types of path, like jsonb_path_query, so ISTM they need
docs changes too.

regards, tom lane

#26David E. Wheeler
david@justatheory.com
In reply to: Tom Lane (#25)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Jan 20, 2024, at 12:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Surely we're not helping anybody by leaving that behavior in place.
Making it do something useful, throwing an error, or returning NULL
all seem superior to this. I observe that @@ returns NULL for the
path type it doesn't like, so maybe that's what to do here.

I agree it would be far better for the behavior to be consistent, but frankly would like to see them raise an error. Ideally the hit would suggest the proper alternative operator or function to use, and maybe link to the docs that describe the difference between SQL-standard JSONPath and "predicate check expressions”, and how they have separate operators and functions.

I think of them as practically different data types (and wish they were, TBH). It makes sense that passing a JSON containment expression[1]https://www.postgresql.org/docs/current/datatype-json.html#JSON-CONTAINMENT would raise an error; so should passing the wrong flavor of JSONPath.

BTW, jsonb_path_query_array and jsonb_path_query_first seem to
take both types of path, like jsonb_path_query, so ISTM they need
docs changes too.

Happy to update the patch, either to add those docs or, if we change the behavior to return a NULL or raise an error, then with that information, instead.

Best,

David

[1]: https://www.postgresql.org/docs/current/datatype-json.html#JSON-CONTAINMENT

#27David E. Wheeler
david@justatheory.com
In reply to: Tom Lane (#24)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Jan 20, 2024, at 11:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:

You sure about that? It would surprise me if we could effectively use
a not-equal condition with an index. If it is only == that works,
then the preceding statement seems sufficient.

I’m not! I just assumed it in the same way creating an SQL = operator automatically respects NOT syntax (or so I recall). In fiddling a bit, I can’t get it to use an index:

CREATE TABLE MOVIES (id SERIAL PRIMARY KEY, movie JSONB NOT NULL);
\copy movies(movie) from PROGRAM 'curl -s https://raw.githubusercontent.com/prust/wikipedia-movie-data/master/movies.json | jq -c ".[]" | sed "s|\\\\|\\\\\\\\|g"';
create index on movies using gin (movie);
analyze movies;

david=# explain analyze select id from movies where movie @? '$ ?(@.genre[*] != "Teen")';
QUERY PLAN -----------------------------------------------------------------------------------------------------
Seq Scan on movies (cost=0.00..3741.41 rows=4 width=4) (actual time=19.222..19.223 rows=0 loops=1)
Filter: (movie @? '$?(@."genre"[*] != "Teen")'::jsonpath)
Rows Removed by Filter: 36273
Planning Time: 1.242 ms
Execution Time: 19.247 ms
(5 rows)

But that might be because the planner knows that the query is going to fetch most records, anyway. If I set most records to a single value:

david=# update movies set movie = jsonb_set(movie, '{year}', '2020'::jsonb) where id < 3600;
UPDATE 3599
david=# analyze movies;
ANALYZE
david=# explain analyze select id from movies where movie @? '$ ?(@.year != 2020)';
QUERY PLAN ------------------------------------------------------------------------------------------------------------
Seq Scan on movies (cost=0.00..3884.41 rows=32609 width=4) (actual time=0.065..43.730 rows=32399 loops=1)
Filter: (movie @? '$?(@."year" != 2020)'::jsonpath)
Rows Removed by Filter: 3874
Planning Time: 1.759 ms
Execution Time: 45.368 ms
(5 rows)

Looks like it still doesn’t use the index with !=. Pity.

Best,

David

#28David E. Wheeler
david@justatheory.com
In reply to: Tom Lane (#25)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Jan 20, 2024, at 12:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It will take a predicate, but seems to always return true:

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] < 5' ;
?column?
----------
t
(1 row)

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] > 5' ;
?column?
----------
t
(1 row)

Just for the sake of clarity, this return value is “correct,” because @? and other functions and operators that expect SQL standard statements evaluate the SET returned by the JSONPath statement, but predicate check expressions don’t return a set, but a always a single scalar value (true, false, or null). From the POV of the code expecting SQL standard JSONPath results, that’s a set of one. @? sees that the set is not empty so returns true.

Best,

David

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#26)
1 attachment(s)
Re: Patch: Improve Boolean Predicate JSON Path Docs

"David E. Wheeler" <david@justatheory.com> writes:

On Jan 20, 2024, at 12:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Surely we're not helping anybody by leaving that behavior in place.
Making it do something useful, throwing an error, or returning NULL
all seem superior to this. I observe that @@ returns NULL for the
path type it doesn't like, so maybe that's what to do here.

I agree it would be far better for the behavior to be consistent, but frankly would like to see them raise an error. Ideally the hit would suggest the proper alternative operator or function to use, and maybe link to the docs that describe the difference between SQL-standard JSONPath and "predicate check expressions”, and how they have separate operators and functions.

That ship's probably sailed. However, I spent some time poking into
the odd behavior I showed for @?, and it seems to me that it's an
oversight in appendBoolResult. That just automatically returns jperOk
in the !found short-circuit path for any boolean result, which is not
the behavior you'd get if the boolean value were actually returned
(cf. jsonb_path_match_internal). I experimented with making it do
what seems like the right thing, and found that there is only one
regression test case that changes behavior:

 select jsonb '2' @? '$ == "2"';
  ?column? 
 ----------
- t
+ f
 (1 row)

Now, JSON does not think that numeric 2 equals string "2", so
ISTM the expected output here is flat wrong. It's certainly
inconsistent with @@:

regression=# select jsonb '2' @@ '$ == "2"';
?column?
----------

(1 row)

So I think we should consider a patch like the attached
(probably with some more test cases added). I don't really
understand this code however, so maybe I missed something.

regards, tom lane

Attachments:

jsonpath-exists-semantics-fix-wip.patchtext/x-diff; charset=us-ascii; name=jsonpath-exists-semantics-fix-wip.patchDownload
diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index ac16f5c85d..63c46cfab5 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -2065,7 +2065,14 @@ appendBoolResult(JsonPathExecContext *cxt, JsonPathItem *jsp,
 	JsonbValue	jbv;
 
 	if (!jspGetNext(jsp, &next) && !found)
-		return jperOk;			/* found singleton boolean value */
+	{
+		/*
+		 * We have a predicate check expression, i.e. a path ending in a bare
+		 * boolean operator, and we don't need to return the exact value(s)
+		 * found.  Just report success or failure of the boolean.
+		 */
+		return (res == jpbTrue) ? jperOk : jperNotFound;
+	}
 
 	if (res == jpbUnknown)
 	{
diff --git a/src/test/regress/expected/jsonb_jsonpath.out b/src/test/regress/expected/jsonb_jsonpath.out
index 6659bc9091..59e1b71051 100644
--- a/src/test/regress/expected/jsonb_jsonpath.out
+++ b/src/test/regress/expected/jsonb_jsonpath.out
@@ -1157,7 +1157,7 @@ select jsonb_path_query('2', '$ == "2"');
 select jsonb '2' @? '$ == "2"';
  ?column? 
 ----------
- t
+ f
 (1 row)
 
 select jsonb '2' @@ '$ > 1';
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#28)
Re: Patch: Improve Boolean Predicate JSON Path Docs

"David E. Wheeler" <david@justatheory.com> writes:

On Jan 20, 2024, at 12:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It will take a predicate, but seems to always return true:

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] < 5' ;
?column?
----------
t
(1 row)

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] > 5' ;
?column?
----------
t
(1 row)

Just for the sake of clarity, this return value is “correct,” because @? and other functions and operators that expect SQL standard statements evaluate the SET returned by the JSONPath statement, but predicate check expressions don’t return a set, but a always a single scalar value (true, false, or null). From the POV of the code expecting SQL standard JSONPath results, that’s a set of one. @? sees that the set is not empty so returns true.

I don't entirely buy this argument --- if that is the interpretation,
of what use are predicate check expressions? It seems to me that we
have to consider them as being a shorthand notation for filter
expressions, or else they simply do not make sense as jsonpath.

regards, tom lane

#31David E. Wheeler
david@justatheory.com
In reply to: Tom Lane (#30)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Jan 21, 2024, at 14:43, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't entirely buy this argument --- if that is the interpretation,
of what use are predicate check expressions? It seems to me that we
have to consider them as being a shorthand notation for filter
expressions, or else they simply do not make sense as jsonpath.

I believe it becomes pretty apparent when using jsonb_path_query(). The filter expression returns a set (using the previous \gset example):

david=# select jsonb_path_query(:'json', '$.track.segments[*].HR ? (@ > 10)');
jsonb_path_query
------------------
73
135
(2 rows)

The predicate check returns a boolean:

david=# select jsonb_path_query(:'json', '$.track.segments[*].HR > 10');
jsonb_path_query
------------------
true
(1 row)

This is the only way the different behaviors make sense to me. @? expects a set, not a boolean, sees there is an item in the set, so returns true:

david=# select jsonb_path_query(:'json', '$.track.segments[*].HR > 1000');
jsonb_path_query
------------------
false
(1 row)

david=# select :'json'::jsonb @? '$.track.segments[*].HR > 1000';
?column?
----------
t
(1 row)

Best,

David

#32David E. Wheeler
david@justatheory.com
In reply to: David E. Wheeler (#31)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Jan 21, 2024, at 14:52, David E. Wheeler <david@justatheory.com> wrote:

This is the only way the different behaviors make sense to me. @? expects a set, not a boolean, sees there is an item in the set, so returns true:

I make this interpretation based on this bit of the docs:

<para>
<productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
language has the following deviations from the SQL/JSON standard.
</para>

<sect4 id="boolean-predicate-check-expressions">
<title>Boolean Predicate Check Expressions</title>
<para>
As an extension to the SQL standard, a <productname>PostgreSQL</productname>
path expression can be a Boolean predicate, whereas the SQL standard allows
predicates only in filters. Where SQL standard path expressions return the
relevant contents of the queried JSON value, predicate check expressions
return the three-valued result of the predicate: <literal>true</literal>,
<literal>false</literal>, or <literal>unknown</literal>. Compare this
filter <type>jsonpath</type> expression:
<screen>
<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments ?(@[*].HR &gt; 130)');</userinput>
jsonb_path_query
---------------------------------------------------------------------------------
{"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}
</screen>
To a predicate expression, which returns <literal>true</literal>
<screen>
<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR &gt; 130');</userinput>
jsonb_path_query
------------------
true
</screen>
</para>

Best,

David

#33David E. Wheeler
david@justatheory.com
In reply to: David E. Wheeler (#32)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Jan 21, 2024, at 14:58, David E. Wheeler <david@justatheory.com> wrote:

I make this interpretation based on this bit of the docs:

Sorry, that’s from my branch. Here it is in master:

<listitem>
<para>
A path expression can be a Boolean predicate, although the SQL/JSON
standard allows predicates only in filters. This is necessary for
implementation of the <literal>@@</literal> operator. For example,
the following <type>jsonpath</type> expression is valid in
<productname>PostgreSQL</productname>:
<programlisting>
$.track.segments[*].HR &lt; 70
</programlisting>
</para>
</listitem>

In any event, something to do with @@, perhaps to have some compatibility with `jsonb @> jsonb`? I don’t know why @@ was important to have.

David

#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#33)
1 attachment(s)
Re: Patch: Improve Boolean Predicate JSON Path Docs

"David E. Wheeler" <david@justatheory.com> writes:

In any event, something to do with @@, perhaps to have some compatibility with `jsonb @> jsonb`? I don’t know why @@ was important to have.

Yeah, that's certainly under-explained. But it seems like I'm not
getting traction for the idea of changing the behavior, so let's
go back to just documenting it. I spent some time going over your
text and also cleaning up nearby shaky English, and ended with v8
attached. I'd be content to commit this if it looks good to you.

regards, tom lane

Attachments:

v8-0001-Improve-boolean-predicate-JSON-Path-docs.patchtext/x-diff; charset=us-ascii; name=v8-0001-Improve-boolean-predicate-JSON-Path-docs.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5030a1045f..99616d2298 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15889,6 +15889,9 @@ table2-mapping
        </para>
        <para>
         Does JSON path return any item for the specified JSON value?
+        (This is useful only with SQL-standard JSON path expressions, not
+        <link linkend="functions-sqljson-check-expressions">predicate check
+        expressions</link>, since those always return a value.)
        </para>
        <para>
         <literal>'{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)'</literal>
@@ -15903,9 +15906,12 @@ table2-mapping
        </para>
        <para>
         Returns the result of a JSON path predicate check for the
-        specified JSON value.  Only the first item of the result is taken into
-        account.  If the result is not Boolean, then <literal>NULL</literal>
-        is returned.
+        specified JSON value.
+        (This is useful only
+        with <link linkend="functions-sqljson-check-expressions">predicate
+        check expressions</link>, not SQL-standard JSON path expressions,
+        since it will return <literal>NULL</literal> if the path result is
+        not a single boolean value.)
        </para>
        <para>
         <literal>'{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2'</literal>
@@ -17029,6 +17035,9 @@ ERROR:  value too long for type character(2)
        <para>
         Checks whether the JSON path returns any item for the specified JSON
         value.
+        (This is useful only with SQL-standard JSON path expressions, not
+        <link linkend="functions-sqljson-check-expressions">predicate check
+        expressions</link>, since those always return a value.)
         If the <parameter>vars</parameter> argument is specified, it must
         be a JSON object, and its fields provide named values to be
         substituted into the <type>jsonpath</type> expression.
@@ -17052,8 +17061,12 @@ ERROR:  value too long for type character(2)
        </para>
        <para>
         Returns the result of a JSON path predicate check for the specified
-        JSON value.  Only the first item of the result is taken into account.
-        If the result is not Boolean, then <literal>NULL</literal> is returned.
+        JSON value.
+        (This is useful only
+        with <link linkend="functions-sqljson-check-expressions">predicate
+        check expressions</link>, not SQL-standard JSON path expressions,
+        since it will either fail or return <literal>NULL</literal> if the
+        path result is not a single boolean value.)
         The optional <parameter>vars</parameter>
         and <parameter>silent</parameter> arguments act the same as
         for <function>jsonb_path_exists</function>.
@@ -17075,6 +17088,12 @@ ERROR:  value too long for type character(2)
        <para>
         Returns all JSON items returned by the JSON path for the specified
         JSON value.
+        For SQL-standard JSON path expressions it returns the JSON
+        values selected from <parameter>target</parameter>.
+        For <link linkend="functions-sqljson-check-expressions">predicate
+        check expressions</link> it returns the result of the predicate
+        check: <literal>true</literal>, <literal>false</literal>,
+        or <literal>null</literal>.
         The optional <parameter>vars</parameter>
         and <parameter>silent</parameter> arguments act the same as
         for <function>jsonb_path_exists</function>.
@@ -17103,9 +17122,8 @@ ERROR:  value too long for type character(2)
        <para>
         Returns all JSON items returned by the JSON path for the specified
         JSON value, as a JSON array.
-        The optional <parameter>vars</parameter>
-        and <parameter>silent</parameter> arguments act the same as
-        for <function>jsonb_path_exists</function>.
+        The parameters are the same as
+        for <function>jsonb_path_query</function>.
        </para>
        <para>
         <literal>jsonb_path_query_array('{"a":[1,2,3,4,5]}', '$.a[*] ? (@ >= $min &amp;&amp; @ &lt;= $max)', '{"min":2, "max":4}')</literal>
@@ -17123,11 +17141,10 @@ ERROR:  value too long for type character(2)
        </para>
        <para>
         Returns the first JSON item returned by the JSON path for the
-        specified JSON value.  Returns <literal>NULL</literal> if there are no
+        specified JSON value, or <literal>NULL</literal> if there are no
         results.
-        The optional <parameter>vars</parameter>
-        and <parameter>silent</parameter> arguments act the same as
-        for <function>jsonb_path_exists</function>.
+        The parameters are the same as
+        for <function>jsonb_path_query</function>.
        </para>
        <para>
         <literal>jsonb_path_query_first('{"a":[1,2,3,4,5]}', '$.a[*] ? (@ >= $min &amp;&amp; @ &lt;= $max)', '{"min":2, "max":4}')</literal>
@@ -17266,9 +17283,9 @@ ERROR:  value too long for type character(2)
   </indexterm>
 
   <para>
-   SQL/JSON path expressions specify the items to be retrieved
-   from the JSON data, similar to XPath expressions used
-   for SQL access to XML. In <productname>PostgreSQL</productname>,
+   SQL/JSON path expressions specify item(s) to be retrieved
+   from a JSON value, similarly to XPath expressions used
+   for access to XML content. In <productname>PostgreSQL</productname>,
    path expressions are implemented as the <type>jsonpath</type>
    data type and can use any elements described in
    <xref linkend="datatype-jsonpath"/>.
@@ -17279,6 +17296,8 @@ ERROR:  value too long for type character(2)
    pass the provided path expression to the <firstterm>path engine</firstterm>
    for evaluation. If the expression matches the queried JSON data,
    the corresponding JSON item, or set of items, is returned.
+   If there is no match, the result will be <literal>NULL</literal>,
+   <literal>false</literal>, or an error, depending on the function.
    Path expressions are written in the SQL/JSON path language
    and can include arithmetic expressions and functions.
   </para>
@@ -17296,18 +17315,20 @@ ERROR:  value too long for type character(2)
   <para>
    To refer to the JSON value being queried (the
    <firstterm>context item</firstterm>), use the <literal>$</literal> variable
-   in the path expression. It can be followed by one or more
+   in the path expression. The first element of a path must always
+   be <literal>$</literal>. It can be followed by one or more
    <link linkend="type-jsonpath-accessors">accessor operators</link>,
    which go down the JSON structure level by level to retrieve sub-items
-   of the context item. Each operator that follows deals with the
-   result of the previous evaluation step.
+   of the context item. Each accessor operator acts on the
+   result(s) of the previous evaluation step, producing zero, one, or more
+   output items from each input item.
   </para>
 
   <para>
    For example, suppose you have some JSON data from a GPS tracker that you
    would like to parse, such as:
 <programlisting>
-{
+SELECT '{
   "track": {
     "segments": [
       {
@@ -17322,68 +17343,100 @@ ERROR:  value too long for type character(2)
       }
     ]
   }
-}
+}' AS json \gset
 </programlisting>
+   (The above example can be copied-and-pasted
+   into <application>psql</application> to set things up for the following
+   examples.  Then <application>psql</application> will
+   expand <literal>:'json'</literal> into a suitably-quoted string
+   constant containing the JSON value.)
   </para>
 
   <para>
    To retrieve the available track segments, you need to use the
    <literal>.<replaceable>key</replaceable></literal> accessor
-   operator to descend through surrounding JSON objects:
-<programlisting>
-$.track.segments
-</programlisting>
+   operator to descend through surrounding JSON objects, for example:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
+                                                                         jsonb_path_query
+-----------------------------------------------------------&zwsp;-----------------------------------------------------------&zwsp;---------------------------------------------
+ [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
+</screen>
   </para>
 
   <para>
    To retrieve the contents of an array, you typically use the
-   <literal>[*]</literal> operator. For example,
-   the following path will return the location coordinates for all
+   <literal>[*]</literal> operator.
+   The following example will return the location coordinates for all
    the available track segments:
-<programlisting>
-$.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
+   Here we started with the whole JSON input value (<literal>$</literal>),
+   then the <literal>.track</literal> accessor selected the JSON object
+   associated with the <literal>"track"</literal> object key, then
+   the <literal>.segments</literal> accessor selected the JSON array
+   associated with the <literal>"segments"</literal> key within that
+   object, then the <literal>[*]</literal> accessor selected each element
+   of that array (producing a series of items), then
+   the <literal>.location</literal> accessor selected the JSON array
+   associated with the <literal>"location"</literal> key within each of
+   those objects.  In this example, each of those objects had
+   a <literal>"location"</literal> key; but if any of them did not,
+   the <literal>.location</literal> accessor would have simply produced no
+   output for that input item.
   </para>
 
   <para>
    To return the coordinates of the first segment only, you can
    specify the corresponding subscript in the <literal>[]</literal>
    accessor operator. Recall that JSON array indexes are 0-relative:
-<programlisting>
-$.track.segments[0].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[0].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+</screen>
   </para>
 
   <para>
    The result of each path evaluation step can be processed
-   by one or more <type>jsonpath</type> operators and methods
+   by one or more of the <type>jsonpath</type> operators and methods
    listed in <xref linkend="functions-sqljson-path-operators"/>.
    Each method name must be preceded by a dot. For example,
    you can get the size of an array:
-<programlisting>
-$.track.segments.size()
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
    More examples of using <type>jsonpath</type> operators
    and methods within path expressions appear below in
    <xref linkend="functions-sqljson-path-operators"/>.
   </para>
 
   <para>
-   When defining a path, you can also use one or more
+   A path can also contain
    <firstterm>filter expressions</firstterm> that work similarly to the
    <literal>WHERE</literal> clause in SQL. A filter expression begins with
    a question mark and provides a condition in parentheses:
 
-<programlisting>
+<synopsis>
 ? (<replaceable>condition</replaceable>)
-</programlisting>
+</synopsis>
   </para>
 
   <para>
    Filter expressions must be written just after the path evaluation step
    to which they should apply. The result of that step is filtered to include
    only those items that satisfy the provided condition. SQL/JSON defines
-   three-valued logic, so the condition can be <literal>true</literal>, <literal>false</literal>,
+   three-valued logic, so the condition can
+   produce <literal>true</literal>, <literal>false</literal>,
    or <literal>unknown</literal>. The <literal>unknown</literal> value
    plays the same role as SQL <literal>NULL</literal> and can be tested
    for with the <literal>is unknown</literal> predicate. Further path
@@ -17395,90 +17448,133 @@ $.track.segments.size()
    The functions and operators that can be used in filter expressions are
    listed in <xref linkend="functions-sqljson-filter-ex-table"/>.  Within a
    filter expression, the <literal>@</literal> variable denotes the value
-   being filtered (i.e., one result of the preceding path step).  You can
+   being considered (i.e., one result of the preceding path step).  You can
    write accessor operators after <literal>@</literal> to retrieve component
    items.
   </para>
 
   <para>
    For example, suppose you would like to retrieve all heart rate values higher
-   than 130. You can achieve this using the following expression:
-<programlisting>
-$.track.segments[*].HR ? (@ &gt; 130)
-</programlisting>
+   than 130. You can achieve this as follows:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
    To get the start times of segments with such values, you have to
-   filter out irrelevant segments before returning the start times, so the
+   filter out irrelevant segments before selecting the start times, so the
    filter expression is applied to the previous step, and the path used
    in the condition is different:
-<programlisting>
-$.track.segments[*] ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
-   You can use several filter expressions in sequence, if required. For
-   example, the following expression selects start times of all segments that
+   You can use several filter expressions in sequence, if required.
+   The following example selects start times of all segments that
    contain locations with relevant coordinates and high heart rate values:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"');</userinput>
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
+</screen>
   </para>
 
   <para>
    Using filter expressions at different nesting levels is also allowed.
    The following example first filters all segments by location, and then
    returns high heart rate values for these segments, if available:
-<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)');</userinput>
+ jsonb_path_query
+------------------
+ 135
+</screen>
   </para>
 
   <para>
-   You can also nest filter expressions within each other:
-<programlisting>
-$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()
-</programlisting>
-   This expression returns the size of the track if it contains any
-   segments with high heart rate values, or an empty sequence otherwise.
+   You can also nest filter expressions within each other.
+   This example returns the size of the track if it contains any
+   segments with high heart rate values, or an empty sequence otherwise:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2
+</screen>
   </para>
 
-  <para>
-   <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
-   language has the following deviations from the SQL/JSON standard:
-  </para>
+  <sect3 id="functions-sqljson-deviations">
+  <title>Deviations from the SQL Standard</title>
+   <para>
+    <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
+    language has the following deviations from the SQL/JSON standard.
+   </para>
 
-  <itemizedlist>
-   <listitem>
+   <sect4 id="functions-sqljson-check-expressions">
+   <title>Boolean Predicate Check Expressions</title>
     <para>
-     A path expression can be a Boolean predicate, although the SQL/JSON
-     standard allows predicates only in filters.  This is necessary for
-     implementation of the <literal>@@</literal> operator. For example,
-     the following <type>jsonpath</type> expression is valid in
-     <productname>PostgreSQL</productname>:
-<programlisting>
-$.track.segments[*].HR &lt; 70
-</programlisting>
-    </para>
-   </listitem>
+     As an extension to the SQL standard,
+     a <productname>PostgreSQL</productname> path expression can be a
+     Boolean predicate, whereas the SQL standard allows predicates only within
+     filters. While SQL-standard path expressions return the relevant
+     element(s) of the queried JSON value, predicate check expressions
+     return the single three-valued result of the
+     predicate: <literal>true</literal>,
+     <literal>false</literal>, or <literal>unknown</literal>.
+     For example, we could write this SQL-standard filter expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments ?(@[*].HR &gt; 130)');</userinput>
+                                jsonb_path_query
+-----------------------------------------------------------&zwsp;----------------------
+ {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}
+</screen>
+     The similar predicate check expression simply
+     returns <literal>true</literal>, indicating that a match exists:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments[*].HR &gt; 130');</userinput>
+ jsonb_path_query
+------------------
+ true
+</screen>
+     </para>
 
-   <listitem>
-    <para>
-     There are minor differences in the interpretation of regular
-     expression patterns used in <literal>like_regex</literal> filters, as
-     described in <xref linkend="jsonpath-regular-expressions"/>.
-    </para>
-   </listitem>
-  </itemizedlist>
+     <note>
+      <para>
+       Predicate check expressions are required in the
+       <literal>@@</literal> operator (and the
+       <function>jsonb_path_match</function> function), and should not be used
+       with the <literal>@?</literal> operator (or the
+       <function>jsonb_path_exists</function> function).
+      </para>
+     </note>
+    </sect4>
 
-   <sect3 id="strict-and-lax-modes">
+    <sect4 id="functions-sqljson-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>
+   </sect3>
+
+   <sect3 id="functions-sqljson-strict-and-lax-modes">
    <title>Strict and Lax Modes</title>
     <para>
      When you query JSON data, the path expression may not match the
      actual JSON data structure. An attempt to access a non-existent
-     member of an object or element of an array results in a
+     member of an object or element of an array is defined as a
      structural error. SQL/JSON path expressions have two modes
      of handling structural errors:
     </para>
@@ -17488,8 +17584,8 @@ $.track.segments[*].HR &lt; 70
      <para>
       lax (default) &mdash; the path engine implicitly adapts
       the queried data to the specified path.
-      Any remaining structural errors are suppressed and converted
-      to empty SQL/JSON sequences.
+      Any structural errors that cannot be fixed as described below
+      are suppressed, producing no match.
      </para>
     </listitem>
     <listitem>
@@ -17500,15 +17596,15 @@ $.track.segments[*].HR &lt; 70
    </itemizedlist>
 
    <para>
-    The lax mode facilitates matching of a JSON document structure and path
-    expression if the JSON data does not conform to the expected schema.
+    The lax mode facilitates matching of a JSON document and path
+    expression when the JSON data does not conform to the expected schema.
     If an operand does not match the requirements of a particular operation,
-    it can be automatically wrapped as an SQL/JSON array or unwrapped by
+    it can be automatically wrapped as an SQL/JSON array, or unwrapped by
     converting its elements into an SQL/JSON sequence before performing
-    this operation. Besides, comparison operators automatically unwrap their
+    the operation. Also, comparison operators automatically unwrap their
     operands in the lax mode, so you can compare SQL/JSON arrays
     out-of-the-box. An array of size 1 is considered equal to its sole element.
-    Automatic unwrapping is not performed only when:
+    Automatic unwrapping is not performed when:
     <itemizedlist>
      <listitem>
       <para>
@@ -17532,40 +17628,95 @@ $.track.segments[*].HR &lt; 70
     For example, when querying the GPS data listed above, you can
     abstract from the fact that it stores an array of segments
     when using the lax mode:
-<programlisting>
-lax $.track.segments.location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments.location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
    <para>
     In the strict mode, the specified path must exactly match the structure of
-    the queried JSON document to return an SQL/JSON item, so using this
-    path expression will cause an error. To get the same result as in
-    the lax mode, you have to explicitly unwrap the
+    the queried JSON document, so using this path
+    expression will cause an error:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.location');</userinput>
+ERROR:  jsonpath member accessor can only be applied to an object
+</screen>
+    To get the same result as in the lax mode, you have to explicitly unwrap the
     <literal>segments</literal> array:
-<programlisting>
-strict $.track.segments[*].location
-</programlisting>
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+</screen>
    </para>
 
    <para>
-    The <literal>.**</literal> accessor can lead to surprising results
-    when using the lax mode. For instance, the following query selects every
-    <literal>HR</literal> value twice:
-<programlisting>
-lax $.**.HR
-</programlisting>
+    The unwrapping behavior of lax mode can lead to surprising results. For
+    instance, the following query using the <literal>.**</literal> accessor
+    selects every <literal>HR</literal> value twice:
+ <screen>
+ <prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.**.HR');</userinput>
+  jsonb_path_query
+ ------------------
+  73
+  135
+  73
+  135
+ </screen>
     This happens because the <literal>.**</literal> accessor selects both
     the <literal>segments</literal> array and each of its elements, while
     the <literal>.HR</literal> accessor automatically unwraps arrays when
     using the lax mode. To avoid surprising results, we recommend using
     the <literal>.**</literal> accessor only in the strict mode. The
     following query selects each <literal>HR</literal> value just once:
-<programlisting>
-strict $.**.HR
-</programlisting>
+ <screen>
+ <prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.**.HR');</userinput>
+  jsonb_path_query
+ ------------------
+  73
+  135
+ </screen>
    </para>
 
+   <para>
+    The unwrapping of arrays can also lead to unexpected results. Consider this
+    example, which selects all the <literal>location</literal> arrays:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments[*].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+(2 rows)
+</screen>
+    As expected it returns the full arrays. But applying a filter expression
+    causes the arrays to be unwrapped to evaluate each item, returning only the
+    items that match the expression:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'lax $.track.segments[*].location ?(@[*] &gt; 15)');</userinput>
+ jsonb_path_query
+------------------
+ 47.763
+ 47.706
+(2 rows)
+</screen>
+    This despite the fact that the full arrays are selected by the path
+    expression. Use strict mode to restore selecting the arrays:
+<screen>
+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[*].location ?(@[*] &gt; 15)');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
+(2 rows)
+</screen>
+   </para>
    </sect3>
 
    <sect3 id="functions-sqljson-path-operators">
@@ -18141,7 +18292,7 @@ strict $.**.HR
         Tests whether a path expression matches at least one SQL/JSON item.
         Returns <literal>unknown</literal> if the path expression would result
         in an error; the second example uses this to avoid a no-such-key error
-        in strict mode.
+        in the strict mode.
        </para>
        <para>
         <literal>jsonb_path_query('{"x": [1, 2], "y": [2, 4]}', 'strict $.* ? (exists (@ ? (@[*] &gt; 2)))')</literal>
diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index b6c2ddbf55..1dbb9606e9 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -513,7 +513,7 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ '$.tags[*] == "qui"';
 </programlisting>
     For these operators, a GIN index extracts clauses of the form
     <literal><replaceable>accessors_chain</replaceable>
-    = <replaceable>constant</replaceable></literal> out of
+    == <replaceable>constant</replaceable></literal> out of
     the <type>jsonpath</type> pattern, and does the index search based on
     the keys and values mentioned in these clauses.  The accessors chain
     may include <literal>.<replaceable>key</replaceable></literal>,
#35David E. Wheeler
david@justatheory.com
In reply to: Tom Lane (#34)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Jan 24, 2024, at 16:32, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David E. Wheeler" <david@justatheory.com> writes:

In any event, something to do with @@, perhaps to have some compatibility with `jsonb @> jsonb`? I don’t know why @@ was important to have.

Yeah, that's certainly under-explained. But it seems like I'm not
getting traction for the idea of changing the behavior, so let's
go back to just documenting it.

Curious about those discussions. On the one hand I find the distinction between the two behaviors to be odd, and to produce unexpected results when they’re not used in the proper context.

It’s reminds me of the Perl idea of context, where functions behave differently in scalar and list context, and if you expect list behavior on scalar context you’re gonna get a surprise. This is a bit of a challenge for those new to the language, as they’re not necessarily aware of the context.

I spent some time going over your
text and also cleaning up nearby shaky English, and ended with v8
attached. I'd be content to commit this if it looks good to you.

This looks very nice, thank you. A couple of comments.

+      <para>
+       Predicate check expressions are required in the
+       <literal>@@</literal> operator (and the
+       <function>jsonb_path_match</function> function), and should not be used
+       with the <literal>@?</literal> operator (or the
+       <function>jsonb_path_exists</function> function).
+      </para>
+     </note>
+    </sect4>

I had this bit here:

<para>
Conversely, non-predicate <type>jsonpath</type> expressions should not be
used with the <literal>@@</literal> operator (or the
<function>jsonb_path_match</function> function).
</para>

I think it’s important to let people know what the difference is in the behavior of the two forms, in every spot it’s likely to come up. SQL-standard JSON Path expressions should never be used in contexts (functions, operators) only designed to work with predicate check expressions, and the docs should say so IMO.

<para>
-    The lax mode facilitates matching of a JSON document structure and path
-    expression if the JSON data does not conform to the expected schema.
+    The lax mode facilitates matching of a JSON document and path
+    expression when the JSON data does not conform to the expected schema.

What do you think of also dropping the article from all the references to “the strict mode” or “the lax mode”, to make them “strict mode” and “lax mode”, respectively?

Thanks for the review!

Best,

David

#36Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#35)
Re: Patch: Improve Boolean Predicate JSON Path Docs

"David E. Wheeler" <david@justatheory.com> writes:

On Jan 24, 2024, at 16:32, Tom Lane <tgl@sss.pgh.pa.us> wrote:

+      <para>
+       Predicate check expressions are required in the
+       <literal>@@</literal> operator (and the
+       <function>jsonb_path_match</function> function), and should not be used
+       with the <literal>@?</literal> operator (or the
+       <function>jsonb_path_exists</function> function).
+      </para>
+     </note>
+    </sect4>

I had this bit here:

<para>
Conversely, non-predicate <type>jsonpath</type> expressions should not be
used with the <literal>@@</literal> operator (or the
<function>jsonb_path_match</function> function).
</para>

I changed the preceding para to say "... check expressions are
required in ...", which I thought was sufficient to cover that.
Also, the tabular description of the operator tells you not to do it.

What do you think of also dropping the article from all the references to “the strict mode” or “the lax mode”, to make them “strict mode” and “lax mode”, respectively?

Certainly most of 'em don't need it. I'll make it so.

regards, tom lane

#37David E. Wheeler
david@justatheory.com
In reply to: Tom Lane (#36)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Jan 25, 2024, at 11:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I changed the preceding para to say "... check expressions are
required in ...", which I thought was sufficient to cover that.
Also, the tabular description of the operator tells you not to do it.

Yeah, that’s good. I was perhaps leaning into being over-explicit after it took me a while to even figure out that there was a difference, let alone where matters.

What do you think of also dropping the article from all the references to “the strict mode” or “the lax mode”, to make them “strict mode” and “lax mode”, respectively?

Certainly most of 'em don't need it. I'll make it so.

Nice, thanks!

David