Patch: Improve Boolean Predicate JSON Path Docs

Started by David E. Wheeler over 2 years ago · 37 messages · hackers
#1 David E. Wheeler
david@kineticode.com

Hackers,

Following up from a suggestion from Tom Lane[1] to improve the documentation of boolean predicate JSON path expressions, please find enclosed a draft patch to do so. It does three things:

1. Converts all of the example path queries to use jsonb_path_query() and show the results, to make it clearer what the behaviors are.

2. Replaces the list of deviations from the standards with a new subsection, with each deviation in its own sub-subsection. The regex section is unchanged, but I’ve greatly expanded the boolean expression JSON path section with examples comparing standard filter expressions and nonstandard boolean predicates. I’ve also added an exhortation not to use boolean expressions with @? or standard path expressions with @@.

3. While converting the modes section to use jsonb_path_query() and show the results, I also added an example of strict mode returning an error.

Follow-ups I’d like to make:

1. Expand the modes section to show how the types of results can vary depending on the mode, thanks to the flattening. Examples:

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a ?(@[*] > 2)');
jsonb_path_query
------------------
3
4
5
(3 rows)

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.a ?(@[*] > 2)');
jsonb_path_query
------------------
[1, 2, 3, 4, 5]

2. Improve the descriptions and examples for @?/jsonb_path_exists() and @@/jsonb_path_match().

Best,

David

[1]: /messages/by-id/1229727.1680535592@sss.pgh.pa.us

Attachments:

jsonpath-pred-docs.patch (application/octet-stream; x-unix-mode=0644; +119 -40)

#2 David E. Wheeler
david@kineticode.com
In reply to: David E. Wheeler (#1)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 14, 2023, at 16:40, David E. Wheeler <david@justatheory.com> wrote:

Following up from a suggestion from Tom Lane[1] to improve the documentation of boolean predicate JSON path expressions, please find enclosed a draft patch to do so.

And now I see I can’t spell “Deviations”. Will fix along with any other requested revisions. GitHub diff here if you’re into that sort of thing:

https://github.com/postgres/postgres/compare/master...theory:postgres:jsonpath-pred-docs

Best,

David

#3 Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#1)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-14 22:40 +0200, David E. Wheeler wrote:

Following up from a suggestion from Tom Lane[1] to improve the
documentation of boolean predicate JSON path expressions, please find
enclosed a draft patch to do so.

Thanks for putting this together. See my review at the end.

It does three things:

1. Converts all of the example path queries to use jsonb_path_query()
and show the results, to make it clearer what the behaviors are.

Nice. This really does help to make some sense of it. I checked all
queries and they do work out except for two queries where the path
expression string is not properly quoted (but the intended output is
still correct).

2. Replaces the list of deviations from the standards with a new
subsection, with each deviation in its own sub-subsection. The regex
section is unchanged, but I’ve greatly expanded the boolean expression
JSON path section with examples comparing standard filter expressions
and nonstandard boolean predicates. I’ve also added an exhortation not to
use boolean expressions with @? or standard path expressions with @@.

LGTM.

3. While converting the modes section to use jsonb_path_query() and
show the results, I also added an example of strict mode returning an
error.

Follow-ups I’d like to make:

1. Expand the modes section to show how the types of results can vary
depending on the mode, thanks to the flattening. Examples:

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a ?(@[*] > 2)');
jsonb_path_query
------------------
3
4
5
(3 rows)

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.a ?(@[*] > 2)');
jsonb_path_query
------------------
[1, 2, 3, 4, 5]

2. Improve the descriptions and examples for @?/jsonb_path_exists()
and @@/jsonb_path_match().

+1

[1] /messages/by-id/1229727.1680535592@sss.pgh.pa.us

My review:

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index affd1254bb..295f8ca5c9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17205,7 +17205,7 @@ array w/o UK? | t
For example, suppose you have some JSON data from a GPS tracker that you
would like to parse, such as:
<programlisting>
-{
+ \set json '{

Perhaps make it explicit that the reader must run this in psql in order
to use \set and :'json' in the ensuing samples? Some of the existing
examples already use psql output but they do not rely on any psql
features.

"track": {
"segments": [
{
@@ -17220,7 +17220,7 @@ array w/o UK? | t
}
]
}
-}
+}'
</programlisting>
</para>

@@ -17229,7 +17229,10 @@ array w/o UK? | t
<literal>.<replaceable>key</replaceable></literal> accessor
operator to descend through surrounding JSON objects:
<programlisting>
-$.track.segments
+select jsonb_path_query(:'json'::jsonb, '$.track.segments');
+                                                                         jsonb_path_query
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
</programlisting>

This should use <screen>, <userinput>, and <computeroutput> if it shows
a psql session, e.g.:

<screen>
<userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
<computeroutput>
jsonb_path_query
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
</computeroutput>
</screen>

Also the cast to jsonb is not necessary and only adds clutter IMO.

</para>

@@ -17239,7 +17242,11 @@ $.track.segments
the following path will return the location coordinates for all
the available track segments:
<programlisting>
-$.track.segments[*].location
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*].location');
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
</programlisting>
</para>
@@ -17248,7 +17255,10 @@ $.track.segments[*].location
specify the corresponding subscript in the <literal>[]</literal>
accessor operator. Recall that JSON array indexes are 0-relative:
<programlisting>
-$.track.segments[0].location
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments[0].location');
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]
</programlisting>
</para>
@@ -17259,7 +17269,10 @@ $.track.segments[0].location
Each method name must be preceded by a dot. For example,
you can get the size of an array:
<programlisting>
-$.track.segments.size()
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments.size()');
+ jsonb_path_query
+------------------
+ 2
</programlisting>
More examples of using <type>jsonpath</type> operators
and methods within path expressions appear below in
@@ -17302,7 +17315,10 @@ $.track.segments.size()
For example, suppose you would like to retrieve all heart rate values higher
than 130. You can achieve this using the following expression:
<programlisting>
-$.track.segments[*].HR ? (@ &gt; 130)
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*].HR ? (@ &gt; 130)');
+ jsonb_path_query
+------------------
+ 135
</programlisting>
</para>
@@ -17312,7 +17328,10 @@ $.track.segments[*].HR ? (@ &gt; 130)
filter expression is applied to the previous step, and the path used
in the condition is different:
<programlisting>
-$.track.segments[*] ? (@.HR &gt; 130)."start time"
+ select jsonb_path_query(:'json'::jsonb, '$.track.segments[*] ? (@.HR &gt; 130)."start time"');
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
</programlisting>
</para>
@@ -17321,7 +17340,10 @@ $.track.segments[*] ? (@.HR &gt; 130)."start time"
example, the following expression selects start times of all segments that
contain locations with relevant coordinates and high heart rate values:
<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"');
+   jsonb_path_query
+-----------------------
+ "2018-10-14 10:39:21"
</programlisting>
</para>
@@ -17330,46 +17352,81 @@ $.track.segments[*] ? (@.location[1] &lt; 13.4) ? (@.HR &gt; 130)."start time"
The following example first filters all segments by location, and then
returns high heart rate values for these segments, if available:
<programlisting>
-$.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)
+select jsonb_path_query(:'json'::jsonb, $.track.segments[*] ? (@.location[1] &lt; 13.4).HR ? (@ &gt; 130)');

The opening quote is missing from the jsonpath literal.

+ jsonb_path_query
+------------------
+ 135
</programlisting>
</para>
<para>
You can also nest filter expressions within each other:
<programlisting>
-$.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()
+select jsonb_path_query(:'json'::jsonb, $.track ? (exists(@.segments[*] ? (@.HR &gt; 130))).segments.size()');

Missing opening quote here as well.

+ jsonb_path_query
+------------------
+ 2
</programlisting>
This expression returns the size of the track if it contains any
segments with high heart rate values, or an empty sequence otherwise.
</para>

-  <para>
-   <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
-   language has the following deviations from the SQL/JSON standard:
-  </para>
+  <sect3 id="devations-from-the-standard">
+  <title>Devaiations from the SQL Standard</title>

Typo in "deviations" (section ID and title).

+   <para>
+    <productname>PostgreSQL</productname>'s implementation of the SQL/JSON path
+    language has the following deviations from the SQL/JSON standard:

The sentence should end in a period when this para is no longer followed
by an item list.

+ </para>

-  <itemizedlist>
-   <listitem>
+   <sect4 id="boolean-predicate-path-expressions">
+   <title>Boolean Predicate Path Expressions</title>
<para>
-     A path expression can be a Boolean predicate, although the SQL/JSON
-     standard allows predicates only in filters.  This is necessary for
-     implementation of the <literal>@@</literal> operator. For example,
-     the following <type>jsonpath</type> expression is valid in
-     <productname>PostgreSQL</productname>:
+     As an extension to the SQL standard, a <productname>PostgreSQL</productname>
+     path expression can be a Boolean predicate, whereas the SQL standard allows
+     predicates only in filters. Where SQL standard path expressions return the
+     relevant contents of the queried JSON value, predicate path expressions
+     return the three-value three-valued result of the predicate:

Redundant "three-value" before "three-valued result".

+     <literal>true</literal>, <literal>false</literal>, or
+     <literal>unknown</literal>. Compare this filter <type>jsonpath</type>
+     exression:
<programlisting>
-$.track.segments[*].HR &lt; 70
+select jsonb_path_query(:'json'::jsonb, '$.track.segments ?(@[*].HR &gt; 130)');
+                                jsonb_path_query
+---------------------------------------------------------------------------------
+ {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}
</programlisting>
-    </para>
-   </listitem>
+     To a predicate expression, which returns <literal>true</literal>
+<programlisting>
+select jsonb_path_query(:'json'::jsonb, '$.track.segments[*].HR &gt; 130');
+ jsonb_path_query
+------------------
+ true
+</programlisting>
+     </para>
-   <listitem>
-    <para>
-     There are minor differences in the interpretation of regular
-     expression patterns used in <literal>like_regex</literal> filters, as
-     described in <xref linkend="jsonpath-regular-expressions"/>.
-    </para>
-   </listitem>
-  </itemizedlist>
+     <para>
+      Predicate-only path expressions are necessary for implementation of the
+      <literal>@@</literal> operator (and the
+      <function>jsonb_path_match</function> function), and should not be used
+      with the <literal>@?</literal> operator (or
+      <function>jsonb_path_exists</function> function).
+     </para>
+
+     <para>
+      Conversely, non-predicate <type>jsonpath</type> expressions should not be
+      used with the <literal>@@</literal> operator (or the
+      <function>jsonb_path_match</function> function).
+     </para>
+    </sect4>

Both paras should be wrapped in a single <note> so that they stand out
from the rest of the text. Maybe even <warning>, but <note> is already
used on this page for things that I'd consider warnings.

+    <sect4 id="jsonpath-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>

<sect3 id="devations-from-the-standard"> should be closed here,
otherwise the docs won't build. This can be checked with
`make -C doc/src/sgml check`.

<sect3 id="strict-and-lax-modes">
<title>Strict and Lax Modes</title>
@@ -17431,18 +17488,30 @@ $.track.segments[*].HR &lt; 70
abstract from the fact that it stores an array of segments
when using the lax mode:
<programlisting>
-lax $.track.segments.location
+ select jsonb_path_query(:'json'::jsonb, 'lax $.track.segments.location');
+ jsonb_path_query  

`git diff --check` shows a couple of lines with trailing whitespace
(mostly psql output).

+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
</programlisting>
</para>
<para>
-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of
the queried JSON document to return an SQL/JSON item, so using this
-    path expression will cause an error. To get the same result as in
-    the lax mode, you have to explicitly unwrap the
+    path expression will cause an error:
+<programlisting>
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments.location');
+ERROR:  jsonpath member accessor can only be applied to an object
+</programlisting>    
+    To get the same result as in the lax mode, you have to explicitly unwrap the
<literal>segments</literal> array:
<programlisting>
-strict $.track.segments[*].location
+select jsonb_path_query(:'json'::jsonb, 'strict $.track.segments[*].location');
+ jsonb_path_query  
+-------------------
+ [47.763, 13.4034]
+ [47.706, 13.2635]
</programlisting>
</para>
@@ -17451,7 +17520,13 @@ strict $.track.segments[*].location
when using the lax mode. For instance, the following query selects every
<literal>HR</literal> value twice:
<programlisting>
-lax $.**.HR
+select jsonb_path_query(:'json'::jsonb, 'lax $.**.HR');
+ jsonb_path_query 
+------------------
+ 73
+ 135
+ 73
+ 135
</programlisting>
This happens because the <literal>.**</literal> accessor selects both
the <literal>segments</literal> array and each of its elements, while
@@ -17460,7 +17535,11 @@ lax $.**.HR
the <literal>.**</literal> accessor only in the strict mode. The
following query selects each <literal>HR</literal> value just once:
<programlisting>
-strict $.**.HR
+select jsonb_path_query(:'json'::jsonb, 'strict $.**.HR');
+ jsonb_path_query 
+------------------
+ 73
+ 135
</programlisting>
</para>

--
Erik

#4 David E. Wheeler
david@kineticode.com
In reply to: Erik Wienhold (#3)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 14, 2023, at 19:51, Erik Wienhold <ewie@ewie.name> wrote:

Thanks for putting this together. See my review at the end.

Appreciate the speedy review!

Nice. This really does help to make some sense of it. I checked all
queries and they do work out except for two queries where the path
expression string is not properly quoted (but the intended output is
still correct).

🤦🏻‍♂️

Follow-ups I’d like to make:

1. Expand the modes section to show how the types of results can vary
depending on the mode, thanks to the flattening. Examples:

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a ?(@[*] > 2)');
jsonb_path_query
------------------
3
4
5
(3 rows)

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.a ?(@[*] > 2)');
jsonb_path_query
------------------
[1, 2, 3, 4, 5]

2. Improve the descriptions and examples for @?/jsonb_path_exists()
and @@/jsonb_path_match().

+1

I planned to submit these changes in a separate patch, based on Tom Lane’s suggestion[1]. Would it be preferred to add them to this patch?

Perhaps make it explicit that the reader must run this in psql in order
to use \set and :'json' in the ensuing samples? Some of the existing
examples already use psql output but they do not rely on any psql
features.

Good call, done.

This should use <screen>, <userinput>, and <computeroutput> if it shows
a psql session, e.g.:

<screen>
<userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
<computeroutput>
jsonb_path_query
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]
</computeroutput>
</screen>

I poked around, and it appears the computeroutput bit is used for function output. So I followed the precedent in queries.sgml[2] and omitted the computeroutput tags but added prompt, e.g.,

<screen>
<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.**.HR');</userinput>
jsonb_path_query
------------------
73
135
</screen>

Also the cast to jsonb is not necessary and only adds clutter IMO.

Right, removed them all in function calls.

+     <para>
+      Predicate-only path expressions are necessary for implementation of the
+      <literal>@@</literal> operator (and the
+      <function>jsonb_path_match</function> function), and should not be used
+      with the <literal>@?</literal> operator (or
+      <function>jsonb_path_exists</function> function).
+     </para>
+
+     <para>
+      Conversely, non-predicate <type>jsonpath</type> expressions should not be
+      used with the <literal>@@</literal> operator (or the
+      <function>jsonb_path_match</function> function).
+     </para>
+    </sect4>

Both paras should be wrapped in a single <note> so that they stand out
from the rest of the text. Maybe even <warning>, but <note> is already
used on this page for things that I'd consider warnings.

Agreed. Would be good if we could teach these functions and operators to reject path expressions they don’t support.

+    <sect4 id="jsonpath-regular-expression-deviation">
+    <title>Regular Expression Interpretation</title>
+     <para>
+      There are minor differences in the interpretation of regular
+      expression patterns used in <literal>like_regex</literal> filters, as
+      described in <xref linkend="jsonpath-regular-expressions"/>.
+     </para>
+    </sect4>

<sect3 id="devations-from-the-standard"> should be closed here,
otherwise the docs won't build. This can be checked with
`make -C doc/src/sgml check`.

Thanks. That produces a bunch of warnings for postgres.sgml and legal.sgml (and a failure to load the docbook DTD), but func.sgml is clean now.

`git diff --check` shows a couple of lines with trailing whitespace
(mostly psql output).

I must’ve cleaned those after I sent the patch, good now. Updated patch attached, this time created by `git format-patch -v2`.

Best,

David

[1]: /messages/by-id/1229727.1680535592@sss.pgh.pa.us
[2]: https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-JOIN

Attachments:

v2-0001-Improve-boolean-predicate-JSON-Path-docs.patch (application/octet-stream; x-unix-mode=0644; +154 -71)

#5 Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#4)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-16 01:04 +0200, David E. Wheeler wrote:

On Oct 14, 2023, at 19:51, Erik Wienhold <ewie@ewie.name> wrote:

Thanks for putting this together. See my review at the end.

Appreciate the speedy review!

You're welcome.

Follow-ups I’d like to make:

1. Expand the modes section to show how the types of results can vary
depending on the mode, thanks to the flattening. Examples:

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a ?(@[*] > 2)');
jsonb_path_query
------------------
3
4
5
(3 rows)

david=# select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.a ?(@[*] > 2)');
jsonb_path_query
------------------
[1, 2, 3, 4, 5]

2. Improve the descriptions and examples for @?/jsonb_path_exists()
and @@/jsonb_path_match().

+1

I planned to submit these changes in a separate patch, based on Tom
Lane’s suggestion[1]. Would it be preferred to add them to this patch?

Your call but I'm not against including it in this patch because it
already touches the modes section.

I poked around, and it appears the computeroutput bit is used for
function output. So I followed the precedent in queries.sgml[2] and
omitted the computeroutput tags but added prompt, e.g.,
<screen>
<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.**.HR');</userinput>
jsonb_path_query
------------------
73
135
</screen>

Okay. Not sure what the preferred style is but I saw <userinput> and
<computeroutput> used together in doc/src/sgml/ref/createuser.sgml.
But it's not applied consistently in the rest of the docs.

+     <para>
+      Predicate-only path expressions are necessary for implementation of the
+      <literal>@@</literal> operator (and the
+      <function>jsonb_path_match</function> function), and should not be used
+      with the <literal>@?</literal> operator (or
+      <function>jsonb_path_exists</function> function).
+     </para>
+
+     <para>
+      Conversely, non-predicate <type>jsonpath</type> expressions should not be
+      used with the <literal>@@</literal> operator (or the
+      <function>jsonb_path_match</function> function).
+     </para>
+    </sect4>

Both paras should be wrapped in a single <note> so that they stand out
from the rest of the text. Maybe even <warning>, but <note> is already
used on this page for things that I'd consider warnings.

Agreed. Would be good if we could teach these functions and operators
to reject path expressions they don’t support.

Right, you mentioned that idea in [1] (separate types). Not sure what
the best strategy here is but it's likely to break existing queries.
Maybe deprecating unsupported path expressions in the next major release
and changing that to an error in the major release after that.

This can be checked with `make -C doc/src/sgml check`.

Thanks. That produces a bunch of warnings for postgres.sgml and
legal.sgml (and a failure to load the docbook DTD), but func.sgml is
clean now.

Hmm... I get no warnings on 1f89b73c4e. Did you install all tools as
described in [2]? The DTD needs to be installed as well.

[1]: /messages/by-id/BAF11F2D-5EDD-4DBB-87FA-4F35845029AE@justatheory.com
[2]: https://www.postgresql.org/docs/current/docguide-toolsets.html

--
Erik

#6 David E. Wheeler
david@kineticode.com
In reply to: Erik Wienhold (#5)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 15, 2023, at 23:03, Erik Wienhold <ewie@ewie.name> wrote:

Your call but I'm not against including it in this patch because it
already touches the modes section.

Okay, added, let’s just put all our cards on the table. :-)

Agreed. Would be good if we could teach these functions and operators
to reject path expressions they don’t support.

Right, you mentioned that idea in [1] (separate types). Not sure what
the best strategy here is but it's likely to break existing queries.
Maybe deprecating unsupported path expressions in the next major release
and changing that to an error in the major release after that.

Well, if the functions have a JsonPathItem struct, they can check its type attribute: @? could reject a path whose root type is a predicate, and @@ could reject one whose root type is not a predicate. Example of checking the type here:

https://github.com/postgres/postgres/blob/54b208f90963cb8b48b9794a5392b2fae4b40a98/src/backend/utils/adt/jsonpath_exec.c#L622

This can be checked with `make -C doc/src/sgml check`.

Thanks. That produces a bunch of warnings for postgres.sgml and
legal.sgml (and a failure to load the docbook DTD), but func.sgml is
clean now.

Hmm... I get no warnings on 1f89b73c4e. Did you install all tools as
described in [2]? The DTD needs to be installed as well.

Thanks, got it down to one:

postgres.sgml:112: element sect4: validity error : Element sect4 content does not follow the DTD, expecting (sect4info? , (title , subtitle? , titleabbrev?) , (toc | lot | index | glossary | bibliography)* , (((calloutlist | glosslist | bibliolist | itemizedlist | orderedlist | segmentedlist | simplelist | variablelist | caution | important | note | tip | warning | literallayout | programlisting | programlistingco | screen | screenco | screenshot | synopsis | cmdsynopsis | funcsynopsis | classsynopsis | fieldsynopsis | constructorsynopsis | destructorsynopsis | methodsynopsis | formalpara | para | simpara | address | blockquote | graphic | graphicco | mediaobject | mediaobjectco | informalequation | informalexample | informalfigure | informaltable | equation | example | figure | table | msgset | procedure | sidebar | qandaset | task | anchor | bridgehead | remark | highlights | abstract | authorblurb | epigraph | indexterm | beginpage)+ , (refentry* | sect5* | simplesect*)) | refentry+ | sect5+ | simplesect+) , (toc | lot | index | glossary | bibliography)*), got (para para )
&func;

David

Attachments:

v3-0001-Improve-boolean-predicate-JSON-Path-docs.patch (application/applefile)

#7 Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#6)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-16 21:59 +0200, David E. Wheeler wrote:

On Oct 15, 2023, at 23:03, Erik Wienhold <ewie@ewie.name> wrote:

Your call but I'm not against including it in this patch because it
already touches the modes section.

Okay, added, let’s just put all our cards on the table. :-)

I'll have a look but the attached v3 is not a patch but some applefile.

Thanks, got it down to one:

postgres.sgml:112: element sect4: validity error : Element sect4 content does not follow the DTD, expecting (sect4info? , (title , subtitle? , titleabbrev?) , (toc | lot | index | glossary | bibliography)* , (((calloutlist | glosslist | bibliolist | itemizedlist | orderedlist | segmentedlist | simplelist | variablelist | caution | important | note | tip | warning | literallayout | programlisting | programlistingco | screen | screenco | screenshot | synopsis | cmdsynopsis | funcsynopsis | classsynopsis | fieldsynopsis | constructorsynopsis | destructorsynopsis | methodsynopsis | formalpara | para | simpara | address | blockquote | graphic | graphicco | mediaobject | mediaobjectco | informalequation | informalexample | informalfigure | informaltable | equation | example | figure | table | msgset | procedure | sidebar | qandaset | task | anchor | bridgehead | remark | highlights | abstract | authorblurb | epigraph | indexterm | beginpage)+ , (refentry* | sect5* | simplesect*)) | refentry+ | sect5+ | simplesect+) , (toc | lot | index | glossary | bibliography)*), got (para para )
&func;

One of the added <sect4> is invalid by the looks of it. Maybe <title>
is missing because it says "got (para para )" at the end.

--
Erik

#8 David E. Wheeler
david@kineticode.com
In reply to: Erik Wienhold (#7)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 16, 2023, at 18:07, Erik Wienhold <ewie@ewie.name> wrote:

Okay, added, let’s just put all our cards on the table. :-)

I'll have a look but the attached v3 is not a patch but some applefile.

Weird, should be no different from previous attachments. I believe Apple Mail always uses application/octet-stream for attachments it doesn’t recognize, which includes .patch and .diff files, sadly.

One of the added <sect4> is invalid by the looks of it. Maybe <title>
is missing because it says "got (para para )" at the end.

Oh, I thought it would report issues from the files they were found in. You’re right, I forgot a title. Fixed in v4.

David

Attachments:

v4-0001-Improve-boolean-predicate-JSON-Path-docs.patch (application/octet-stream; x-unix-mode=0644; +227 -95)

#9 jian he
jian.universality@gmail.com
In reply to: David E. Wheeler (#8)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Tue, Oct 17, 2023 at 10:56 AM David E. Wheeler <david@justatheory.com> wrote:

Oh, I thought it would report issues from the files they were found in. You’re right, I forgot a title. Fixed in v4.

David

+        Returns the result of a JSON path
+        <link linkend="boolean-predicate-path-expressions">predicate
+        check</link> for the specified JSON value. If the result is not Boolean,
+        then <literal>NULL</literal> is returned. Do not use with non-predicate
+        JSON path expressions.

"Do not use with non-predicate", double negative is not easy to
comprehend. Maybe we can simplify it.

16933: value. Use only SQL-standard JSON path expressions, not not
there are two "not".

15842: SQL-standard JSON path expressions, not not
there are two "not".
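
Doubled words like the "not not" reported above are easy to miss when proofreading SGML by eye. A minimal shell sketch for flagging them, assuming a POSIX `awk`; the function name `find_doubled_words` is hypothetical and is not something used in this thread or in the PostgreSQL tree:

```shell
#!/bin/sh
# Report words that are immediately repeated on the same line, e.g. "not not".
# Only purely alphabetic words are compared, so markup and numbers are skipped.
find_doubled_words() {
  awk '{
    for (i = 2; i <= NF; i++)
      if ($i == $(i - 1) && $i ~ /^[A-Za-z]+$/)
        printf "%s:%d: doubled word \"%s\"\n", FILENAME, FNR, $i
  }' "$@"
}
```

It could be run as `find_doubled_words doc/src/sgml/func.sgml`. The check is line-based, so a word repeated across a line break would not be reported.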

#10 David E. Wheeler
david@kineticode.com
In reply to: jian he (#9)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 19, 2023, at 01:22, jian he <jian.universality@gmail.com> wrote:

"Do not use with non-predicate", double negative is not easy to
comprehend. Maybe we can simplify it.

16933: value. Use only SQL-standard JSON path expressions, not not
there are two "not".

15842: SQL-standard JSON path expressions, not not
there are two "not".

Thank you, jian. Updated patch attached and also on GitHub.

https://github.com/postgres/postgres/compare/master...theory:postgres:jsonpath-pred-docs

Best,

David

Attachments:

v5-0001-Improve-boolean-predicate-JSON-Path-docs.patch (application/applefile)

#11 Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#10)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-19 15:39 +0200, David E. Wheeler wrote:

On Oct 19, 2023, at 01:22, jian he <jian.universality@gmail.com> wrote:

Updated patch attached and also on GitHub.

https://github.com/postgres/postgres/compare/master...theory:postgres:jsonpath-pred-docs

Just wanted to take a look at v5. But it's an applefile again :P

--
Erik

#12 David E. Wheeler
david@kineticode.com
In reply to: Erik Wienhold (#11)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 19, 2023, at 10:49 PM, Erik Wienhold <ewie@mailbox.org> wrote:

Just wanted to take a look at v5. But it's an applefile again :P

I don’t get it. It was the other times too! Are you able to save it with a .patch suffix?

D

#13Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#12)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-20 05:20 +0200, David E. Wheeler wrote:

On Oct 19, 2023, at 10:49 PM, Erik Wienhold <ewie@mailbox.org> wrote:

Just wanted to take a look at v5. But it's an applefile again :P

I don’t get it. It was the other times too! Are you able to save it
with a .patch suffix?

Saving it is not the problem, but the actual file contents:

$ xxd v5-0001-Improve-boolean-predicate-JSON-Path-docs.patch
00000000: 0005 1600 0002 0000 0000 0000 0000 0000 ................
00000010: 0000 0000 0000 0000 0002 0000 0009 0000 ................
00000020: 0032 0000 000a 0000 0003 0000 003c 0000 .2...........<..
00000030: 0036 0000 0000 0000 0000 0000 7635 2d30 .6..........v5-0
00000040: 3030 312d 496d 7072 6f76 652d 626f 6f6c 001-Improve-bool
00000050: 6561 6e2d 7072 6564 6963 6174 652d 4a53 ean-predicate-JS
00000060: 4f4e 2d50 6174 682d 646f 6373 2e70 6174 ON-Path-docs.pat
00000070: 6368 ch

I don't even know what that represents, probably not some fancy file
compression.

--
Erik

#14David E. Wheeler
david@kineticode.com
In reply to: Erik Wienhold (#13)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 19, 2023, at 23:49, Erik Wienhold <ewie@ewie.name> wrote:

I don't even know what that represents, probably not some fancy file
compression.

Oh, weird. Trying from a webmail client instead.

Best,

David

Attachments:

v5-0001-Improve-boolean-predicate-JSON-Path-docs.patch (application/octet-stream; +229 -95)
#15Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#14)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-20 15:49 +0200, David Wheeler wrote:

On Oct 19, 2023, at 23:49, Erik Wienhold <ewie@ewie.name> wrote:

I don't even know what that represents, probably not some fancy file
compression.

That's an AppleSingle file according to [1][2]. It only contains the
resource fork and file name but no data fork.
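
For readers who want to check the dump quoted above, here is a minimal Python sketch that decodes the AppleSingle header from those bytes. It assumes the layout given in RFC 1740 (4-byte magic, 4-byte version, 16 filler bytes, a 2-byte entry count, then 12-byte entry descriptors). The function name and the small entry-ID table are illustrative and not from the thread. Going by RFC 1740's entry IDs, the two entries decode as Finder Info (9) and Real Name (3). Entry ID 1, the data fork that would carry the actual patch, is absent.

```python
import struct

# Bytes copied from the xxd output quoted earlier in the thread.
DUMP_HEX = """
0005 1600 0002 0000 0000 0000 0000 0000
0000 0000 0000 0000 0002 0000 0009 0000
0032 0000 000a 0000 0003 0000 003c 0000
0036 0000 0000 0000 0000 0000 7635 2d30
3030 312d 496d 7072 6f76 652d 626f 6f6c
6561 6e2d 7072 6564 6963 6174 652d 4a53
4f4e 2d50 6174 682d 646f 6373 2e70 6174
6368
"""

APPLESINGLE_MAGIC = 0x00051600

# Subset of the entry IDs defined in RFC 1740 (illustrative table).
ENTRY_NAMES = {1: "Data Fork", 2: "Resource Fork", 3: "Real Name", 9: "Finder Info"}


def parse_applesingle(data):
    """Return (version, entries); each entry is (entry_id, offset, length)."""
    # Header: magic, version, 16 filler bytes, entry count (big-endian, 26 bytes).
    magic, version, count = struct.unpack_from(">II16xH", data, 0)
    if magic != APPLESINGLE_MAGIC:
        raise ValueError("not an AppleSingle file")
    # Each descriptor is three unsigned 32-bit integers: id, offset, length.
    entries = [struct.unpack_from(">III", data, 26 + 12 * i) for i in range(count)]
    return version, entries


data = bytes.fromhex("".join(DUMP_HEX.split()))
version, entries = parse_applesingle(data)

for entry_id, offset, length in entries:
    name = ENTRY_NAMES.get(entry_id, "other")
    print(f"entry {entry_id} ({name}): offset={offset:#x} length={length}")

# The Real Name entry (ID 3) holds the original file name.
real_name = next(data[o:o + n].decode("ascii") for i, o, n in entries if i == 3)
has_data_fork = any(i == 1 for i, _, _ in entries)
print(real_name, has_data_fork)
```

The two descriptors account for every byte after the header, which is consistent with the attachment carrying only metadata and none of the patch text.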

Oh, weird. Trying from a webmail client instead.

Thanks.

+        Does JSON path return any item for the specified JSON value? Use only
+        SQL-standard JSON path expressions, not
+        <link linkend="boolean-predicate-path-expressions">predicate check
+        expressions.</link>

Any reason for calling it "predicate check expressions" (e.g. the link
text) and sometimes "predicate path expressions" (e.g. the linked
section title)? I think it should be named consistently to avoid
confusion and also to simplify searching.

+        Returns the result of a JSON path
+        <link linkend="boolean-predicate-path-expressions">predicate
+        check</link> for the specified JSON value. If the result is not Boolean,
+        then <literal>NULL</literal> is returned. Use only with
+        <link linkend="boolean-predicate-path-expressions">predicate check
+        expressions.</link>

Linking the same section twice in the same paragraph seems excessive.

+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
+select jsonb_path_query(:'json', '$.track.segments');

Please remove the second SELECT.

+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[0].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]

Strict mode is unnecessary to get that result and I'd omit it because
the different modes are not introduced yet at this point.

+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments.size()');</userinput>
+ jsonb_path_query
+------------------
+ 2

Strict mode is unnecessary here as well.

+     using the lax mode. To avoid surprising results, we recommend using
+     the <literal>.**</literal> accessor only in the strict mode. The

Please change to "in strict mode" (without "the").

[1]: https://www.rfc-editor.org/rfc/rfc1740.txt
[2]: https://web.archive.org/web/20180311140826/http://kaiser-edv.de/documents/AppleSingle_AppleDouble.pdf

--
Erik

#16David E. Wheeler
david@kineticode.com
In reply to: Erik Wienhold (#15)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 22, 2023, at 20:36, Erik Wienhold <ewie@ewie.name> wrote:

That's an AppleSingle file according to [1][2]. It only contains the
resource fork and file name but no data fork.

Ah, I had “Send large attachments with Mail Drop” enabled. To me 20K is not big but whatever. Let’s see if turning it off fixes the issue.

Any reason for calling it "predicate check expressions" (e.g. the link
text) and sometimes "predicate path expressions" (e.g. the linked
section title)? I think it should be named consistently to avoid
confusion and also to simplify searching.

I think “predicate path expressions” is more descriptive, but “predicate check expressions” is what was in the docs before, so let’s stick with that.

Linking the same section twice in the same paragraph seems excessive.

Fair. Will link the second one.

+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', '$.track.segments');</userinput>
+select jsonb_path_query(:'json', '$.track.segments');

Please remove the second SELECT.

Done.

+<prompt>=&gt;</prompt> <userinput>select jsonb_path_query(:'json', 'strict $.track.segments[0].location');</userinput>
+ jsonb_path_query
+-------------------
+ [47.763, 13.4034]

Strict mode is unnecessary to get that result and I'd omit it because
the different modes are not introduced yet at this point.

Yep, pasto.

Strict mode is unnecessary here as well.

Fixed.

+     using the lax mode. To avoid surprising results, we recommend using
+     the <literal>.**</literal> accessor only in the strict mode. The

Please change to "in strict mode" (without "the").

Hrm, I prefer it without the article, too, but it is consistently used that way elsewhere, like here:

https://github.com/postgres/postgres/blob/5b36e8f/doc/src/sgml/func.sgml#L17401

I’d be happy to change them all, but was keeping it consistent for now.

Updated patch attached, thank you!

David

Attachments:

v6-0001-Improve-boolean-predicate-JSON-Path-docs.patch (application/octet-stream; +226 -95)
#17Erik Wienhold
ewie@ewie.name
In reply to: David E. Wheeler (#16)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On 2023-10-24 00:58 +0200, David E. Wheeler wrote:

On Oct 22, 2023, at 20:36, Erik Wienhold <ewie@ewie.name> wrote:

That's an AppleSingle file according to [1][2]. It only contains the
resource fork and file name but no data fork.

Ah, I had “Send large attachments with Mail Drop” enabled. To me 20K
is not big but whatever. Let’s see if turning it off fixes the issue.

I suspected it had something to do with iCloud. Glad you solved it!

Please change to "in strict mode" (without "the").

Hrm, I prefer it without the article, too, but it is consistently used
that way elsewhere, like here:

https://github.com/postgres/postgres/blob/5b36e8f/doc/src/sgml/func.sgml#L17401

I’d be happy to change them all, but was keeping it consistent for now.

Right. I haven't really noticed that the article case is more common.
I thought that you may have missed that one because I saw this change
that removes the article:

-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of

Updated patch attached, thank you!

LGTM. Would you create a commitfest entry? I'll set the status to RfC.

--
Erik

#18David E. Wheeler
david@kineticode.com
In reply to: Erik Wienhold (#17)
Re: Patch: Improve Boolean Predicate JSON Path Docs

On Oct 23, 2023, at 20:20, Erik Wienhold <ewie@ewie.name> wrote:

I thought that you may have missed that one because I saw this change
that removes the article:

-    In the strict mode, the specified path must exactly match the structure of
+    In strict mode, the specified path must exactly match the structure of

Oh, didn’t realize. Fixed.

LGTM. Would you create a commitfest entry? I'll set the status to RfC.

Done.

https://commitfest.postgresql.org/45/4624/

Best,

David

Attachments:

v7-0001-Improve-boolean-predicate-JSON-Path-docs.patch (application/octet-stream; +227 -96)
#19shihao zhong
zhong950419@gmail.com
In reply to: David E. Wheeler (#18)
Re: Patch: Improve Boolean Predicate JSON Path Docs

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: tested, passed

I took a look at this commit; it looks correct to me.

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#18)
Re: Patch: Improve Boolean Predicate JSON Path Docs

"David E. Wheeler" <david@justatheory.com> writes:

[ v7-0001-Improve-boolean-predicate-JSON-Path-docs.patch ]

I started to review this, and got bogged down at

@@ -17203,9 +17214,12 @@ array w/o UK? | t

   <para>
    For example, suppose you have some JSON data from a GPS tracker that you
-   would like to parse, such as:
+   would like to parse, such as this JSON, set up as a
+   <link linkend="app-psql-meta-command-set"><application>psql</application>
+   <command>\set</command> variable</link> for use as <literal>:'json'</literal>
+   in the examples below:
 <programlisting>
-{
+ \set json '{
   "track": {
     "segments": [
       {

I find the textual change rather unwieldy, but the bigger problem is
that this example doesn't actually work. If you try to copy-and-paste
this into psql, you get "unterminated quoted string", because psql
metacommands can't span line boundaries.

Perhaps we could leave the existing display alone, and then add

To follow the examples below, paste this into psql:
<programlisting>
\set json '{ "track": { "segments": [ { "location": [ 47.763, 13.4034 ], "start time": "2018-10-14 10:05:14", "HR": 73 }, { "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 10:39:21", "HR": 135 } ] }}'
</programlisting>
This will allow <literal>:'json'</literal> to be expanded into the
above JSON value, plus suitable quoting.

However, I'm not sure that's a great solution, because it's going to
line-wrap on most displays, making copy-and-paste a bit iffy.
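
One way to produce the single-line form without hand-editing is to let a script collapse the pretty-printed JSON. The following is a minimal Python sketch and not part of the patch. The helper name `psql_set_command` is hypothetical, and the escaping (doubling single quotes and backslashes) is an assumption based on psql's quoting rules for single-quoted meta-command arguments. It does nothing about line-wrapping in a rendered page; it only guarantees that the generated \set line contains no newline.

```python
import json

# The GPS tracker document from the docs example, pretty-printed.
PRETTY_JSON = """
{
  "track": {
    "segments": [
      {
        "location": [ 47.763, 13.4034 ],
        "start time": "2018-10-14 10:05:14",
        "HR": 73
      },
      {
        "location": [ 47.706, 13.2635 ],
        "start time": "2018-10-14 10:39:21",
        "HR": 135
      }
    ]
  }
}
"""


def psql_set_command(name, json_text):
    """Build a one-line psql \\set command holding json_text (hypothetical helper)."""
    value = json.loads(json_text)
    # json.dumps without an indent argument emits a single line.
    compact = json.dumps(value, ensure_ascii=False)
    # Assumed psql quoting: backslashes and single quotes must be doubled
    # inside a single-quoted meta-command argument.
    escaped = compact.replace("\\", "\\\\").replace("'", "''")
    return f"\\set {name} '{escaped}'"


command = psql_set_command("json", PRETTY_JSON)
print(command)
```

Parsing and re-serializing, rather than stripping newlines with a text substitution, keeps whitespace inside string values such as "start time" intact.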

I experimented with

SELECT '
... multiline json value ...
' AS json
\gexec

but that didn't seem to work either. Anybody have a better idea?

regards, tom lane

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#20)
#22Erik Wienhold
ewie@ewie.name
In reply to: Tom Lane (#20)
#23David E. Wheeler
david@kineticode.com
In reply to: Erik Wienhold (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#23)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#24)
#26David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#25)
#27David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#24)
#28David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#25)
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#26)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#28)
#31David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#30)
#32David E. Wheeler
david@kineticode.com
In reply to: David E. Wheeler (#31)
#33David E. Wheeler
david@kineticode.com
In reply to: David E. Wheeler (#32)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#33)
#35David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#34)
#36Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#35)
#37David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#36)