Further issues with jsonb semantics, documentation

Started by Peter Geogheganover 10 years ago44 messages

pg@heroku.com

over 10 years ago

1 attachment(s)

I've noticed some more issues with the jsonb documentation, and the
new jsonb stuff generally. I didn't set out to give Andrew feedback on
the semantics weeks after feature freeze, but unfortunately this feels
like another discussion that we need to have now rather than later.

"operator jsonb - integer"
===================

Summary: I think that this operator has a problem, but a problem that
can easily be fixed.

I think it was a bad idea to allow array-style removal of object
key/value pairs. ISTM that it implies a level of stability in the
ordering that doesn't make sense. Besides, is it really all that
useful?

Consider this case:

postgres=# select '{"c":5, "a":6, "b":7}'::jsonb - 1;
?column?
------------------
{"a": 6, "c": 5}
(1 row)

Clearly anyone expecting the value "a" to be removed here would be in
for a surprise. Moreover, it is inconsistent with the established
behavior of the corresponding array-wise subscript operator:

postgres=# select '{"c":5, "a":6, "b":7}'::jsonb -> 1;
?column?
----------
[null]
(1 row)

I suggest, given that this is conceptually a data-modifying operator,
that the minus operator/jsonb_delete() case raise an error rather than
matching "operator jsonb -> integer" and returning NULL. I say this as
the person who successfully argued that the -> operator case above
should return NULL during the 9.4 beta period; returning SQL NULL for
the delete/minus operator feels like going too far in the direction of
permissiveness, even for jsonb; my expression index argument does not
apply here as it did for the "operator jsonb -> integer" case.

"operator jsonb - text"
================

Summary: I think that this operator is fine.

Documentation needs work, though. The "operator jsonb - text" operator
ought to be documented as in the attached patch, which is closer to
the equivalent hstore operator, and emphasizes the "operator jsonb ?
text" definition of a key. It should emphasize its similarity to the
established "operator jsonb ? text" operator, and in particular that
array elements behave as keys *iff* they're strings.

"operator jsonb - text[]" (and *nested* deletion more generally)
===============================================

Summary: I think that this operator has many problems, and should be
scraped (although only as an operator). IMV nested deletion should
only be handled by functions, and the way that nested deletion works
in general should be slightly adjusted.

The new "operator jsonb - text[]" operator is confusingly inconsistent with:

A) "operator jsonb text"

and:

B) the established "operator hstore - text[]" operator, since that
operator deletes all key/value pairs that have keys that match any of
the right operand text array values. In contrast, this new operator is
passed as its right operand an array of text elements that constitute
a "path" (so the order in the rhs text[] operand matters). If the text
element in the rhs text[] operand happens to be what would pass for a
Postgres integer literal, it can be used to traverse lhs array values
through subscripting at that nesting level.

Regarding nested deletion behavior more generally, consider this
example of how this can work out badly:

postgres=# select jsonb_delete(jsonb_set('["a"]', '{5}', '"b"'), '{5}') ;
jsonb_delete
--------------
["a", "b"]
(1 row)

Here, we're adding and then deleting an array element at offset 5 (the
string "b"). But the element is never deleted by the outer
jsonb_delete(), because we can't rely on the element actually being
stored at offset 5. Seems a bit fragile.

More importantly, consider the inconsistency with "operator jsonb
text" ("point A" above):

postgres=# select '["a"]'::jsonb ?| '{a}'::text[]; -- historic/9.4 behavior
?column?
----------
t
(1 row)

postgres=# select '["a"]'::jsonb - '{a}'::text[]; -- new to 9.5
operator, does not delete!
?column?
----------
["a"]
(1 row)

Perhaps most questionably of all, the non-array based minus/delete
operator (which I like) *does* have the same idea of matching a key as
the established "operator jsonb ?| text[]" operator (and "operator
jsonb ? text", etc):

postgres=# select '["a"]'::jsonb - 'a'::text; -- new to 9.5 operator,
*does* delete!
?column?
----------
[]
(1 row)

This conceptual model for manipulating jsonb is entirely new and novel
to this new operator "operator text[]" (and jsonb_set()).

"operator jsonb - text[]" categorization/conceptual model
==========================================

Operators like the established "operator jsonb -> integer" operator (a
jsonb "array-wise" operator) always seemed okay to me because the rhs
operand really was a Postgres integer, and because it's explicitly an
array-wise operator (just like "operator -> text" is explicitly
object-wise). But now, with these new operators, you've added a
"shadow type" system to certain rhs text[] operands, consisting of
types not explicitly delineated by JSON-style double quotes (for
strings, say). So there is kind of a second shadow type system in
play, similar to that of jsonb except that text[] "shadow types"
cannot be distinguished -- '{0}'::text[] could be intended to affect
an array or an object with the key value "0" in one of its pairs.

Accordingly, I would vote for changing this, and making the nested
deletion stuff only care about object key values and array *string*
elements. This backs out of the idea of a new "shadow type" system for
text arrays. I think that this is conceptually a lot cleaner, while
actually not being significantly less useful for most use cases
(nesting tends to involve only objects anyway). As noted in my summary
of the previous section, I would also vote for scraping "operator
jsonb - text[]" as a jsonb_delete() wrapper (see closing summary below
for more on why).

While I appreciate that Andrew wanted to make deletion as flexible as
possible, these inconsistencies feel arbitrary to me.

Closing summary
=============

At the very least, some of these new jsonb operators need to decide if they're:

1) Specifically "array-wise" or "object-wise", like the existing
operators "operator -> integer", or "operator -> text".

or:

2) "Key" operators. Operators that share the "operator jsonb ? text"
idea of a key, and operate on jsonb datums accordingly. "Keys" here
include array elements that are strings.

As I've said, I prefer option 2 for deletion, which is why I like
"operator jsonb - text" but not "operator jsonb - text[]", but either
way it needs to be *a lot* clearer than it is at the moment.

You might also say that there is a category 3 jsonb operator, of which
the containment operator ("operator @>") is the best example. It takes
jsonb on the rhs, and so naturally cares about types including
container types on the lhs. Clearly these new operators are far enough
away from category 3 that we cannot really contemplate moving them
into category 3, particularly post feature freeze (even though, as I
said, I think that would be a superior approach).

Some of what I've said here is just my opinion, but I feel pretty
confident that for example we don't want to add a fourth "operator
category" to jsonb, a weird hybrid between category 2 and category 3.
Having 3 "operator categories" seems quite enough. And by making
*nested* deletion only possible through a function call to
jsonb_delete() (by scrapping "operator jsonb - text[]" as I suggest),
you also avoid having an *operator* that is still somewhat not like
other operators in category 2 (since the other operators in category 2
don't care about nesting). That feels cleaner -- IMV the oddness of
walking a path based on a text[] value ought to be confined to a well
named function. Plus you can then add a new "operator jsonb - text[]"
that matches "operator hstore - text[]" and comports with the existing
"operator jsonb - text".

If you don't like my taxonomy for jsonb operators, then feel free to
come up with your own. As things stand, it is hard to describe a
taxonomy that makes the operators easy to understand and
non-surprising -- there'd be more exceptions than rules. I feel we
need to be disciplined about this stuff, or jsonb will become much
harder to use than it needs to be.

Opinions?
--
Peter Geoghegan

Attachments:

jsonb-delete-doc.patchtext/x-patch; charset=US-ASCII; name=jsonb-delete-doc.patchDownload

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 95d5a3a..a1b8ecc 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10302,8 +10302,9 @@ table2-mapping
        <row>
         <entry><literal>-</literal></entry>
         <entry><type>text</type></entry>
-        <entry>Delete the field with a specified key, or element with this
-        value</entry>
+        <entry>Delete key/value pair or <emphasis>string</emphasis>
+        element from left operand.  Key/value pairs are matched based
+        on their key value.</entry>
         <entry><literal>'{"a": "b"}'::jsonb - 'a' </literal></entry>
        </row>
        <row>

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Peter Geoghegan (#1)

Re: Further issues with jsonb semantics, documentation

On Wed, Jun 3, 2015 at 7:02 PM, Peter Geoghegan <pg@heroku.com> wrote:

Consider this case:

postgres=# select '{"c":5, "a":6, "b":7}'::jsonb - 1;
?column?
------------------
{"a": 6, "c": 5}
(1 row)

Clearly anyone expecting the value "a" to be removed here would be in
for a surprise. Moreover, it is inconsistent with the established
behavior of the corresponding array-wise subscript operator:

postgres=# select '{"c":5, "a":6, "b":7}'::jsonb -> 1;
?column?
----------
[null]
(1 row)

For similar reasons, I think that this inconsistency is unacceptable:

postgres=# select '["a", "b", "c"]'::jsonb - -1;
?column?
------------
["a", "b"]
(1 row)

postgres=# select '["a", "b", "c"]'::jsonb -> -1;
?column?
----------
[null]
(1 row)

jsonb now supports Python-style negative subscripting to index
backward. I think that this a fine idea. However, I also think it's a
big POLA violation that this was not done for the ordinary array
subscripting operator ("operator jsonb -> integer") at the same time
as "operator jsonb - integer" was added. Although doing this will
require a compatibility note in the 9.5 release notes, it's extremely
unlikely to destabilize anybody's app, and makes a lot of sense.
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Peter Geoghegan (#1)

Re: Further issues with jsonb semantics, documentation

On 06/03/2015 10:02 PM, Peter Geoghegan wrote:

I've noticed some more issues with the jsonb documentation, and the
new jsonb stuff generally. I didn't set out to give Andrew feedback on
the semantics weeks after feature freeze, but unfortunately this feels
like another discussion that we need to have now rather than later.

Yes, I wish you had raised these issues months ago when this was
published. That's the way the process is supposed to work.

"operator jsonb - integer"
===================

Summary: I think that this operator has a problem, but a problem that
can easily be fixed.

I think it was a bad idea to allow array-style removal of object
key/value pairs. ISTM that it implies a level of stability in the
ordering that doesn't make sense. Besides, is it really all that
useful?

The origin of this is nested hstore. Looking at my last version of that
patch, I see:

SELECT 'a=>1, b=>2, c=>3'::hstore - 3;
?column?
------------------------
"a"=>1, "b"=>2, "c"=>3
(1 row)

But I agree that it's not a great contribution to science, especially
since the index will be applied to the list of elements in the somewhat
counter-intuitive storage order we use, and we could just raise an error
if we try to apply integer delete to an object instead of an array.

"operator jsonb - text[]" (and *nested* deletion more generally)
===============================================

Summary: I think that this operator has many problems, and should be
scraped (although only as an operator). IMV nested deletion should
only be handled by functions, and the way that nested deletion works
in general should be slightly adjusted.

The new "operator jsonb - text[]" operator is confusingly inconsistent with:

A) "operator jsonb text"

What exactly is this? I have no idea what you're talking about.

and:

B) the established "operator hstore - text[]" operator, since that
operator deletes all key/value pairs that have keys that match any of
the right operand text array values. In contrast, this new operator is
passed as its right operand an array of text elements that constitute
a "path" (so the order in the rhs text[] operand matters). If the text
element in the rhs text[] operand happens to be what would pass for a
Postgres integer literal, it can be used to traverse lhs array values
through subscripting at that nesting level.

The fact that hstore uses it that way doesn't really concern me. Since
hstore isn't nested it doesn't make a whole lot of sense for it to mean
anything else there. But json(b) is nested, and jsonb - path seems quite
a reasonable treatment, something you're much more likely to want to do
than removeing top level elements in bulk.

Regarding nested deletion behavior more generally, consider this
example of how this can work out badly:

postgres=# select jsonb_delete(jsonb_set('["a"]', '{5}', '"b"'), '{5}') ;
jsonb_delete
--------------
["a", "b"]
(1 row)

Here, we're adding and then deleting an array element at offset 5 (the
string "b"). But the element is never deleted by the outer
jsonb_delete(), because we can't rely on the element actually being
stored at offset 5. Seems a bit fragile.

The behaviour of jsonb_set is pretty explicitly documented. If we wanted
to do something else then we'd have to disable the special meaning given
to negative indices, but that would mean in turn we wouldn't be able to
prepend to an array.

More importantly, consider the inconsistency with "operator jsonb
text" ("point A" above):

postgres=# select '["a"]'::jsonb ?| '{a}'::text[]; -- historic/9.4 behavior
?column?
----------
t
(1 row)

postgres=# select '["a"]'::jsonb - '{a}'::text[]; -- new to 9.5
operator, does not delete!
?column?
----------
["a"]
(1 row)

You are conflating two different things here, quite pointlessly. The RH
operand of ?| is not a path, whereas the RH operand of this - variant
is. The fact that they are both text arrays doesn't mean that they
should mean the same thing. And this is really the whole problem with
the rest of your analysis.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Jim Nasby

Jim.Nasby@BlueTreble.com

over 10 years ago

In reply to: Andrew Dunstan (#3)

Re: Further issues with jsonb semantics, documentation

On 6/4/15 8:43 AM, Andrew Dunstan wrote:

You are conflating two different things here, quite pointlessly. The RH
operand of ?| is not a path, whereas the RH operand of this - variant
is. The fact that they are both text arrays doesn't mean that they
should mean the same thing. And this is really the whole problem with
the rest of your analysis.

Has the idea of a specific json_path datatype been discussed? I feel it
would add a lot of clarity to the operators. It would also make it easy
to have an array of paths, something that's difficult to do today
because a path can be an arbitrary length and arrays don't support that.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Jim Nasby (#4)

Re: Further issues with jsonb semantics, documentation

On 06/04/2015 11:33 AM, Jim Nasby wrote:

On 6/4/15 8:43 AM, Andrew Dunstan wrote:

You are conflating two different things here, quite pointlessly. The RH
operand of ?| is not a path, whereas the RH operand of this - variant
is. The fact that they are both text arrays doesn't mean that they
should mean the same thing. And this is really the whole problem with
the rest of your analysis.

Has the idea of a specific json_path datatype been discussed? I feel
it would add a lot of clarity to the operators. It would also make it
easy to have an array of paths, something that's difficult to do today
because a path can be an arbitrary length and arrays don't support that.

I actually thought of doing something like that earlier today, although
I was thinking of making it an array under the hood - I'm not sure how
much call there is for an array of paths. We could probably finesse
that. I agree that there is some sense in having such a type, especially
if we later want to implement json(b)_patch, see
<http://tools.ietf.org/html/rfc6902>. And if we do we should call the
type json_pointer to be consistent with
<http://tools.ietf.org/html/rfc6901>.

However, this is certainly not 9.5 material.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Andrew Dunstan (#3)

Re: Further issues with jsonb semantics, documentation

On Thu, Jun 4, 2015 at 6:43 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I've noticed some more issues with the jsonb documentation, and the
new jsonb stuff generally. I didn't set out to give Andrew feedback on
the semantics weeks after feature freeze, but unfortunately this feels
like another discussion that we need to have now rather than later.

Yes, I wish you had raised these issues months ago when this was published.
That's the way the process is supposed to work.

I also wish that I managed to do that. As you know, I was working
overtime to get UPSERT into 9.5 during that period. Finding time to
review things is always difficult, and I which I could do more.

"operator jsonb - integer"
===================

I think it was a bad idea to allow array-style removal of object
key/value pairs. ISTM that it implies a level of stability in the
ordering that doesn't make sense. Besides, is it really all that
useful?

But I agree that it's not a great contribution to science, especially since
the index will be applied to the list of elements in the somewhat
counter-intuitive storage order we use, and we could just raise an error if
we try to apply integer delete to an object instead of an array.

Cool. Do you want to write a patch, or should I?

Also, what about negative array subscripting (making the 9.4-era
"operator jsonb -> integer" operator support that for consistency with
the new "operator jsonb - integer" operator)? Should I write the
patch? Will you commit it if I do?

"operator jsonb - text[]" (and *nested* deletion more generally)
===============================================

Summary: I think that this operator has many problems, and should be
scraped (although only as an operator). IMV nested deletion should
only be handled by functions, and the way that nested deletion works
in general should be slightly adjusted.

The new "operator jsonb - text[]" operator is confusingly inconsistent
with:

A) "operator jsonb text"

What exactly is this? I have no idea what you're talking about.

It's a typo -- I meant "operator jsonb - text". The fact that
"operator jsonb - text" and "operator jsonb - text[]" diverge in the
way they do seems confusing.

The fact that hstore uses it that way doesn't really concern me. Since
hstore isn't nested it doesn't make a whole lot of sense for it to mean
anything else there.

It seems pretty obvious to me that it makes just as much sense as in
hstore. In hstore, you might want to delete multiple key/value pairs
at once, for exactly the same reason as you might want to with jsonb.
Certainly, you'll also want to support nested deletion with jsonb, but
that's beside the point.

But json(b) is nested, and jsonb - path seems quite a
reasonable treatment, something you're much more likely to want to do than
removeing top level elements in bulk.

Probably true. I think that this interface for nested deletion is
complicated enough and inconsistent enough that I'd rather not have an
operator at all, just a function (so somewhat like jsonb_set() --
jsonb_delete()). That is my main point on "operator jsonb - text[]";
I think the interface is complicated and inconsistent with everything
else for no good reason.

Regarding nested deletion behavior more generally, consider this
example of how this can work out badly:

postgres=# select jsonb_delete(jsonb_set('["a"]', '{5}', '"b"'), '{5}') ;
jsonb_delete
--------------
["a", "b"]
(1 row)

Here, we're adding and then deleting an array element at offset 5 (the
string "b"). But the element is never deleted by the outer
jsonb_delete(), because we can't rely on the element actually being
stored at offset 5. Seems a bit fragile.

The behaviour of jsonb_set is pretty explicitly documented. If we wanted to
do something else then we'd have to disable the special meaning given to
negative indices, but that would mean in turn we wouldn't be able to prepend
to an array.

jsonb_delete() should certainly be able to traverse objects, but it's
much less clear that it should be able to *traverse* arrays (affecting
arrays is a different story, though). That's why I proposed not
supporting traversing arrays with it or with jsonb_set(). This would
also removes the questionable second "shadow type system" within the
text[] rhs operand too, which seems like a good thing.

I think that traversing arrays in nested documents is a rare
requirement, because the ordering within arrays is unstable. If you
already know the ordinal number of the thing you want to nuke, then
you probably have already locked the row, and you might as well
manipulate the JSON using Javascript or Python at that stage.

Making jsonb_delete() buy into the "operator jsonb ? text" idea of a
key (a thing that it must delete) would also allow jsonb_delete() to
reliably delete particular strings in arrays, which actually does make
a lot of sense (think of arrays of "tags"). But FWIW it's the
inconsistency that bothers me most.

More importantly, consider the inconsistency with "operator jsonb
text" ("point A" above):

postgres=# select '["a"]'::jsonb ?| '{a}'::text[]; -- historic/9.4
behavior
?column?
----------
t
(1 row)

postgres=# select '["a"]'::jsonb - '{a}'::text[]; -- new to 9.5
operator, does not delete!
?column?
----------
["a"]
(1 row)

You are conflating two different things here, quite pointlessly. The RH
operand of ?| is not a path, whereas the RH operand of this - variant is.
The fact that they are both text arrays doesn't mean that they should mean
the same thing. And this is really the whole problem with the rest of your
analysis.

I'm not conflating anything -- obviously I understand that "operator
jsonb - text[]" is not supposed to be an "any" variant of "operator
jsonb - text". My point is that a reasonable person could infer that
it *is* an "any" variant of "operator jsonb - text", based on hstore's
behavior for its two deletion operators, and based on the other jsonb
operators, and based on the fact that it will often behave that way
anyway.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Alvaro Herrera

alvherre@2ndquadrant.com

over 10 years ago

In reply to: Peter Geoghegan (#6)

Re: Further issues with jsonb semantics, documentation

I'm just skimming here, but if a jsonb_path type is being proposed,
perhaps it would be better not to have operators that take text or
text[] as second argument. We can provide that functionality with just
functions. For example, it will be confusing to have

jsonb 'some json value' - '{foo,bar}'

operate too differently from

jsonb 'some json value' - json_path '{foo,bar}'

And it will be a nasty regression to have 9.5 allow
jsonb 'some json value' - '{foo,bar}'
and then have 9.6 error out with "ambiguous operator" when the json_path
thing is added.

--
ï¿½lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Alvaro Herrera (#7)

Re: Further issues with jsonb semantics, documentation

On Thu, Jun 4, 2015 at 12:16 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I'm just skimming here, but if a jsonb_path type is being proposed,
perhaps it would be better not to have operators that take text or
text[] as second argument. We can provide that functionality with just
functions. For example, it will be confusing to have

jsonb 'some json value' - '{foo,bar}'

operate too differently from

jsonb 'some json value' - json_path '{foo,bar}'

And it will be a nasty regression to have 9.5 allow
jsonb 'some json value' - '{foo,bar}'
and then have 9.6 error out with "ambiguous operator" when the json_path
thing is added.

Fair point, but FWIW I don't think it'll end up being a new type like
json_path -- it'll just be jsonb, as with containment. I can see there
being an operator that performs deletion in a very similar way to how
"operator jsonb @> jsonb" performs containment (recall that jsonb
containment is a very JSON-ish flavor of containment).

I would like these new-to-9.5 deletion operators to work at the top
level only, like "operator jsonb ? text" and "operator jsonb ?| text",
sharing their idea of a key, __including that string array elements
are keys__. We haven't got a containment-style nested delete operator
for 9.5, but we can hope for it in the future. In the meantime, you
get much of the benefit of that with jsonb_delete(), which *can*
support nested deletion. It does so by buying into the "operator jsonb
? text" idea of a key (including that string array elements are keys),
although with a twist: the paths text[] right operand operates at
multiple nesting levels (not supporting traversing arrays, as Andrew
implemented it, but OTOH adding support for deleting String array
elements based on the string alone, useful for "tag" arrays).

If in 9.6 we have something like an "operator jsonb @- jsonb" operator
for containment style deletion, and a 9.5 era "operator jsonb - text"
and operator jsonb - text[]" pair of operators for existence style
deletion (matching "operator jsonb ? text", operating only on the top
level), that will be pretty good. The fact that jsonb_delete() will
have somewhat bridged the gap nesting-deletion-wise for 9.5 (without
being usable through an operator) won't really matter then. I want to
keep the "twist" I described out of any jsonb operators that are
shipped, and only use it within functions.
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Peter Geoghegan (#8)

Re: Further issues with jsonb semantics, documentation

On Thu, Jun 4, 2015 at 1:02 PM, Peter Geoghegan <pg@heroku.com> wrote:

I would like these new-to-9.5 deletion operators to work at the top
level only, like "operator jsonb ? text" and "operator jsonb ?| text",
sharing their idea of a key, __including that string array elements
are keys__. We haven't got a containment-style nested delete operator
for 9.5, but we can hope for it in the future. In the meantime, you
get much of the benefit of that with jsonb_delete(), which *can*
support nested deletion. It does so by buying into the "operator jsonb
? text" idea of a key (including that string array elements are keys),
although with a twist: the paths text[] right operand operates at
multiple nesting levels (not supporting traversing arrays, as Andrew
implemented it, but OTOH adding support for deleting String array
elements based on the string alone, useful for "tag" arrays).

If in 9.6 we have something like an "operator jsonb @- jsonb" operator
for containment style deletion, and a 9.5 era "operator jsonb - text"
and operator jsonb - text[]" pair of operators for existence style
deletion (matching "operator jsonb ? text", operating only on the top
level), that will be pretty good. The fact that jsonb_delete() will
have somewhat bridged the gap nesting-deletion-wise for 9.5 (without
being usable through an operator) won't really matter then. I want to
keep the "twist" I described out of any jsonb operators that are
shipped, and only use it within functions.

To be clear: these two paragraphs are a proposal about how I'd like to
change things for 9.5 to make the jsonb operators more consistent than
the way things are in the master branch, while still offering nested
deletion through a function.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

David E. Wheeler

david@justatheory.com

over 10 years ago

In reply to: Alvaro Herrera (#7)

1 attachment(s)

Re: Further issues with jsonb semantics, documentation

On Jun 4, 2015, at 12:16 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

I'm just skimming here, but if a jsonb_path type is being proposed,

Is this not the purpose of JSQuery?

https://code.google.com/p/gwtquery/wiki/JsQuery

David

#11

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: David E. Wheeler (#10)

Re: Further issues with jsonb semantics, documentation

On 06/04/2015 04:13 PM, David E. Wheeler wrote:

On Jun 4, 2015, at 12:16 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

I'm just skimming here, but if a jsonb_path type is being proposed,

Is this not the purpose of JSQuery?

https://code.google.com/p/gwtquery/wiki/JsQuery

No, it doesn't seem to have anything at all to do with it. What I
suggested would be an implementation of json_pointer - see
<http://tools.ietf.org/html/rfc6901>

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Peter Geoghegan (#6)

Re: Further issues with jsonb semantics, documentation

On 06/04/2015 03:10 PM, Peter Geoghegan wrote:

On Thu, Jun 4, 2015 at 6:43 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I've noticed some more issues with the jsonb documentation, and the
new jsonb stuff generally. I didn't set out to give Andrew feedback on
the semantics weeks after feature freeze, but unfortunately this feels
like another discussion that we need to have now rather than later.

Yes, I wish you had raised these issues months ago when this was published.
That's the way the process is supposed to work.

I also wish that I managed to do that. As you know, I was working
overtime to get UPSERT into 9.5 during that period. Finding time to
review things is always difficult, and I which I could do more.

That's happened to me in the past. My view has generally been that in
that case I have missed my chance, and I need to live with what others
have done. That seems to me preferable to tearing up any pretense we
might have to be following a defined development process.

I should point out that I have already gone out of my way to accommodate
concerns you expressed extremely late about this set of features, and I
have lately indicated another area where we can adjust it to meet your
objections. Re-litigating this wholesale seems quite a different kettle
of fish, however.

Just in case it's not clear: I am not at all happy.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Peter Geoghegan (#6)

Re: Further issues with jsonb semantics, documentation

On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:

jsonb_delete() should certainly be able to traverse objects, but it's
much less clear that it should be able to *traverse* arrays (affecting
arrays is a different story, though). That's why I proposed not
supporting traversing arrays with it or with jsonb_set(). This would
also removes the questionable second "shadow type system" within the
text[] rhs operand too, which seems like a good thing.

Here is a further example of why I find this new "shadow type" system
for rhs text[] operands to be pretty questionable:

postgres=# select jsonb_set('[1, 2, 3, 4, 5,6,7,8,9,10,11,12]',
'{"5e10"}'::text[], '"Input unsanitized"') ;
jsonb_set
-----------------------------------------------------------
[1, 2, 3, 4, 5, "Input unsanitized", 7, 8, 9, 10, 11, 12]
(1 row)

BTW, there is a bug here -- strtol() needs additional defenses [1]https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer -- Peter Geoghegan
(before casting to int):

postgres=# select jsonb_set('[1, 2, 3, 4,
5,6,7,8,9,10,11,12,13,14,15,16,17,18]',
'{"9223372036854775806"}'::text[], '"Input unsanitized"', false) ;
jsonb_set
----------------------------------------------------------------------------------
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, "Input
unsanitized", 18]
(1 row)

[1]: https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer -- Peter Geoghegan
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Andrew Dunstan (#12)

Re: Further issues with jsonb semantics, documentation

On Thu, Jun 4, 2015 at 5:31 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

Just in case it's not clear: I am not at all happy.

I've offered to help you with several of the issue I raised; I had
intended to offer more help.

The issues I raise seem pretty substantive to me. I'm trying to make
sure that we don't end up with something bad that we need to live with
indefinitely. I have offered you something not far off an "everybody
wins" proposal (i.e. no real loss of functionality), and that was my
first proposal.

I don't know what more I could do for you. I *am* trying to help.
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Alvaro Herrera (#7)

Re: Further issues with jsonb semantics, documentation

On 06/04/2015 03:16 PM, Alvaro Herrera wrote:

I'm just skimming here, but if a jsonb_path type is being proposed,
perhaps it would be better not to have operators that take text or
text[] as second argument. We can provide that functionality with just
functions. For example, it will be confusing to have

jsonb 'some json value' - '{foo,bar}'

operate too differently from

jsonb 'some json value' - json_path '{foo,bar}'

And it will be a nasty regression to have 9.5 allow
jsonb 'some json value' - '{foo,bar}'
and then have 9.6 error out with "ambiguous operator" when the json_path
thing is added.

The boat has sailed on this. We have had the #> and #>> operators since
9.3, i.e. even before we got the operators that Peter wants us to adopt
the usage from, and their right hand operands are text arrays with the
same path semantics.

'some jsonb value' - '{foo,bar}' is already ambiguous - the RH operand
could be a single text datum or a text array.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Peter Geoghegan (#6)

Re: Further issues with jsonb semantics, documentation

On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:

But I agree that it's not a great contribution to science, especially since
the index will be applied to the list of elements in the somewhat
counter-intuitive storage order we use, and we could just raise an error if
we try to apply integer delete to an object instead of an array.

Cool. Do you want to write a patch, or should I?

Also, what about negative array subscripting (making the 9.4-era
"operator jsonb -> integer" operator support that for consistency with
the new "operator jsonb - integer" operator)? Should I write the
patch? Will you commit it if I do?

Please let me know if you want me to write these two patches.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Peter Geoghegan (#16)

Re: Further issues with jsonb semantics, documentation

On 06/05/2015 01:39 PM, Peter Geoghegan wrote:

On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:

But I agree that it's not a great contribution to science, especially since
the index will be applied to the list of elements in the somewhat
counter-intuitive storage order we use, and we could just raise an error if
we try to apply integer delete to an object instead of an array.

Cool. Do you want to write a patch, or should I?

Also, what about negative array subscripting (making the 9.4-era
"operator jsonb -> integer" operator support that for consistency with
the new "operator jsonb - integer" operator)? Should I write the
patch? Will you commit it if I do?

Please let me know if you want me to write these two patches.

Send the first one, I'm still thinking about the second one.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

Alvaro Herrera

alvherre@2ndquadrant.com

over 10 years ago

In reply to: Andrew Dunstan (#15)

Re: Further issues with jsonb semantics, documentation

Andrew Dunstan wrote:

On 06/04/2015 03:16 PM, Alvaro Herrera wrote:

I'm just skimming here, but if a jsonb_path type is being proposed,
perhaps it would be better not to have operators that take text or
text[] as second argument. We can provide that functionality with just
functions. For example, it will be confusing to have

jsonb 'some json value' - '{foo,bar}'

operate too differently from

jsonb 'some json value' - json_path '{foo,bar}'

And it will be a nasty regression to have 9.5 allow
jsonb 'some json value' - '{foo,bar}'
and then have 9.6 error out with "ambiguous operator" when the json_path
thing is added.

The boat has sailed on this. We have had the #> and #>> operators since 9.3,
i.e. even before we got the operators that Peter wants us to adopt the usage
from, and their right hand operands are text arrays with the same path
semantics.

Well, some boats sailed, but maybe those were different boats. I don't
think we should shut discussion off only because we made some choice or
other in the past. Since we haven't released yet, we can base decisions
on what's the most useful API for users, rather on what got committed in
the initial patch.

'some jsonb value' - '{foo,bar}' is already ambiguous - the RH operand
could be a single text datum or a text array.

Hmm, but that's not in 9.4, so we can still tweak it if necessary.

Consider this jsonb datum. Nobody in their right mind would have a key
that looks like a path, I hear you say; yet I'm sure this is going to
happen.

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}, "{c,a}": "uh"}' ;
jsonb
------------------------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}, "{c,a}": "uh"}
(1 fila)

This seems pretty surprising to me:

-- here, the -(jsonb,text) operator is silently chosen, even though the
-- right operand looks like an array. And we do the wrong thing.
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}';
?column?
---------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)

-- here, the -(jsonb,text[]) operator is chosen
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - _text '{c,a}';
?column?
-------------------------------
{"a": "1", "b": "2", "c": {}}
(1 fila)

But this seems worse to me, because we silently do nothing:

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}';
?column?
---------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)

I think the first operator can be qualified as dangerous. If you delete
that one, then it's fine because you can't do that query anymore because
of the conflict with -(jsonb, int).

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}';
ERROR: operator is not unique: jsonb - unknown
Lï¿½NEA 1: ...elect jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}'...
^
SUGERENCIA: Could not choose a best candidate operator. You might need to add explicit type casts.

--
ï¿½lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

Petr Jelinek

petr.jelinek@2ndquadrant.com

over 10 years ago

In reply to: Alvaro Herrera (#18)

Re: Further issues with jsonb semantics, documentation

On Fri, Jun 5, 2015 at 8:32 , Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

Andrew Dunstan wrote:

'some jsonb value' - '{foo,bar}' is already ambiguous - the RH
operand
could be a single text datum or a text array.

Hmm, but that's not in 9.4, so we can still tweak it if necessary.

Consider this jsonb datum. Nobody in their right mind would have a
key
that looks like a path, I hear you say; yet I'm sure this is going to
happen.

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}, "{c,a}":
"uh"}' ;
jsonb
------------------------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}, "{c,a}": "uh"}
(1 fila)

This seems pretty surprising to me:

-- here, the -(jsonb,text) operator is silently chosen, even though
the
-- right operand looks like an array. And we do the wrong thing.
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
'{c,a}';
?column?
---------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)

-- here, the -(jsonb,text[]) operator is chosen
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
_text '{c,a}';
?column?
-------------------------------
{"a": "1", "b": "2", "c": {}}
(1 fila)

But this seems worse to me, because we silently do nothing:

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
'{c,a}';
?column?
---------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)

I think the first operator can be qualified as dangerous. If you
delete
that one, then it's fine because you can't do that query anymore
because
of the conflict with -(jsonb, int).

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
'{c,a}';
ERROR: operator is not unique: jsonb - unknown
LÍNEA 1: ...elect jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
'{c,a}'...
^
SUGERENCIA: Could not choose a best candidate operator. You might
need to add explicit type casts.

That's a good point, and it won't get any better if/when we add the
json point support in 9.6 since the syntax would be something like
select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '/c/a'; and we
will again silently do nothing. That's going to cause bugs in
applications using this.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

Jim Nasby

Jim.Nasby@BlueTreble.com

over 10 years ago

In reply to: Petr Jelinek (#19)

Re: Further issues with jsonb semantics, documentation

On 6/5/15 2:08 PM, Petr Jelinek wrote:

That's a good point, and it won't get any better if/when we add the json
point support in 9.6 since the syntax would be something like select
jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '/c/a'; and we will again
silently do nothing. That's going to cause bugs in applications using this.

Yeah, this is a miniature version of the pain I've felt with variant:
trying to get sane casting for a data type that encompasses other types
in current Postgres is essentially impossible. Your only option is to
put implicit or assignment casts in and cross your fingers, or to do
only explicit casts and force the user to cast everything (which is a
PITA). Even a json_pointer type may not help this much unless we have
some way to reliable transform an unknown into a json_pointer.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

Alvaro Herrera

alvherre@2ndquadrant.com

over 10 years ago

In reply to: Jim Nasby (#20)

Re: Further issues with jsonb semantics, documentation

Jim Nasby wrote:

On 6/5/15 2:08 PM, Petr Jelinek wrote:

That's a good point, and it won't get any better if/when we add the json
point support in 9.6 since the syntax would be something like select
jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '/c/a'; and we will again
silently do nothing. That's going to cause bugs in applications using this.

Yeah, this is a miniature version of the pain I've felt with variant: trying
to get sane casting for a data type that encompasses other types in current
Postgres is essentially impossible.

I'm not sure this is the same problem. But anyway I think requiring
explicit casts in this stuff is a good thing -- relying on implicit
cast to text, when most useful behavior uses other types, seems bad.

--
ï¿½lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Alvaro Herrera (#18)

Re: Further issues with jsonb semantics, documentation

On 06/05/2015 02:32 PM, Alvaro Herrera wrote:

'some jsonb value' - '{foo,bar}' is already ambiguous - the RH operand
could be a single text datum or a text array.

Hmm, but that's not in 9.4, so we can still tweak it if necessary.

Consider this jsonb datum. Nobody in their right mind would have a key
that looks like a path, I hear you say; yet I'm sure this is going to
happen.

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}, "{c,a}": "uh"}' ;
jsonb
------------------------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}, "{c,a}": "uh"}
(1 fila)

This seems pretty surprising to me:

-- here, the -(jsonb,text) operator is silently chosen, even though the
-- right operand looks like an array. And we do the wrong thing.
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}';
?column?
---------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)

-- here, the -(jsonb,text[]) operator is chosen
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - _text '{c,a}';
?column?
-------------------------------
{"a": "1", "b": "2", "c": {}}
(1 fila)

But this seems worse to me, because we silently do nothing:

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}';
?column?
---------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)

I think the first operator can be qualified as dangerous. If you delete
that one, then it's fine because you can't do that query anymore because
of the conflict with -(jsonb, int).

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}';
ERROR: operator is not unique: jsonb - unknown
LÍNEA 1: ...elect jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}'...
^
SUGERENCIA: Could not choose a best candidate operator. You might need to add explicit type casts.

Yeah, Good point. Actually, if my memory serves me correctly (always a
dubious bet), the avoidance of that kind of ambiguity is why we
introduced the #> and #>> operators in the first place, after going
round and round for a while on what the API would look like. I should
have remembered that when this came around. Mea culpa.

So probably the least invasive change would be to rename the text[]
variant operator to something like "#-" and rename the corresponding
function to jsonb_delete_path.

We could also decide not to keep an operator at all, on the ground that
we think we'll implement a type that encapsulates json pointer in 9.6,
and just keep the renamed function.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23

Tom Lane

tgl@sss.pgh.pa.us

over 10 years ago

In reply to: Andrew Dunstan (#22)

Re: Further issues with jsonb semantics, documentation

Andrew Dunstan <andrew@dunslane.net> writes:

Yeah, Good point. Actually, if my memory serves me correctly (always a
dubious bet), the avoidance of that kind of ambiguity is why we
introduced the #> and #>> operators in the first place, after going
round and round for a while on what the API would look like. I should
have remembered that when this came around. Mea culpa.

So probably the least invasive change would be to rename the text[]
variant operator to something like "#-" and rename the corresponding
function to jsonb_delete_path.

Not sure that's a great choice of operator name; consider for example
select 4#-1;
It's not immediately obvious whether the "-" is meant as a separate
unary minus. There are heuristics in the lexer that try to deal with
cases like this, but it doesn't seem like a good plan to double down
on such heuristics always doing the right thing.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#24

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Andrew Dunstan (#17)

1 attachment(s)

Re: Further issues with jsonb semantics, documentation

On Fri, Jun 5, 2015 at 10:51 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Also, what about negative array subscripting (making the 9.4-era
"operator jsonb -> integer" operator support that for consistency with
the new "operator jsonb - integer" operator)? Should I write the
patch? Will you commit it if I do?

Send the first one, I'm still thinking about the second one.

The first patch is attached.

Regardless of anything else, I see no reason to delay applying my
documentation patch for "operator jsonb - text" [1]/messages/by-id/CAM3SWZQFSWMi2aVi-Lun_JBYh-RfHQ3-0fm8TXpW8OLc+v8ZnQ@mail.gmail.com -- Peter Geoghegan.

Thanks

[1]: /messages/by-id/CAM3SWZQFSWMi2aVi-Lun_JBYh-RfHQ3-0fm8TXpW8OLc+v8ZnQ@mail.gmail.com -- Peter Geoghegan
--
Peter Geoghegan

Attachments:

0001-Desupport-jsonb-subscript-deletion-on-objects.patchtext/x-patch; charset=US-ASCII; name=0001-Desupport-jsonb-subscript-deletion-on-objects.patchDownload

From 6513017eabbd4bdd4980056ed73ca8e3fbe58d1b Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <peter.geoghegan86@gmail.com>
Date: Fri, 5 Jun 2015 13:55:48 -0700
Subject: [PATCH] Desupport jsonb subscript deletion on objects

Supporting deletion of JSON pairs within jsonb objects using an
array-style integer subscript allowed for surprising outcomes.  This was
mostly due to the implementation-defined ordering of pairs within
objects for jsonb.

It also seems desirable to make jsonb integer subscript deletion
consistent with the 9.4 era general purpose integer subscripting
operator for jsonb (although that operator returns NULL when an object
is encountered, while we prefer to throw an error).
---
 doc/src/sgml/func.sgml                |  5 ++--
 src/backend/utils/adt/jsonfuncs.c     |  5 ++++
 src/test/regress/expected/jsonb.out   | 56 ++---------------------------------
 src/test/regress/expected/jsonb_1.out | 56 ++---------------------------------
 src/test/regress/sql/jsonb.sql        | 11 +------
 5 files changed, 13 insertions(+), 120 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c6e3540..be0c3d6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10309,8 +10309,9 @@ table2-mapping
        <row>
         <entry><literal>-</literal></entry>
         <entry><type>integer</type></entry>
-        <entry>Delete the field or element with specified index (Negative
-        integers count from the end)</entry>
+        <entry>Delete the array element with specified index (Negative
+        integers count from the end).  Throws an error if top level
+        container is not an array.</entry>
         <entry><literal>'["a", "b"]'::jsonb - 1 </literal></entry>
        </row>
        <row>
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index f87ba77..c14d3f7 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -3400,6 +3400,11 @@ jsonb_delete_idx(PG_FUNCTION_ARGS)
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("cannot delete from scalar")));
 
+	if (JB_ROOT_IS_OBJECT(in))
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot delete from object using integer subscript")));
+
 	if (JB_ROOT_COUNT(in) == 0)
 		PG_RETURN_JSONB(in);
 
diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out
index 412bf97..e6654d4 100644
--- a/src/test/regress/expected/jsonb.out
+++ b/src/test/regress/expected/jsonb.out
@@ -3031,54 +3031,6 @@ select '["a","b","c"]'::jsonb - -4;
  ["a", "b", "c"]
 (1 row)
 
-select '{"a":1, "b":2, "c":3}'::jsonb - 3;
-         ?column?         
---------------------------
- {"a": 1, "b": 2, "c": 3}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - 2;
-     ?column?     
-------------------
- {"a": 1, "b": 2}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - 1;
-     ?column?     
-------------------
- {"a": 1, "c": 3}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - 0;
-     ?column?     
-------------------
- {"b": 2, "c": 3}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - -1;
-     ?column?     
-------------------
- {"a": 1, "b": 2}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - -2;
-     ?column?     
-------------------
- {"a": 1, "c": 3}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - -3;
-     ?column?     
-------------------
- {"b": 2, "c": 3}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - -4;
-         ?column?         
---------------------------
- {"a": 1, "b": 2, "c": 3}
-(1 row)
-
 select jsonb_set('{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb, '{n}', '[1,2,3]');
                                 jsonb_set                                 
 --------------------------------------------------------------------------
@@ -3192,12 +3144,8 @@ select '[]'::jsonb - 'a';
 
 select '"a"'::jsonb - 1; -- error
 ERROR:  cannot delete from scalar
-select '{}'::jsonb -  1 ;
- ?column? 
-----------
- {}
-(1 row)
-
+select '{}'::jsonb -  1; -- error
+ERROR:  cannot delete from object using integer subscript
 select '[]'::jsonb - 1;
  ?column? 
 ----------
diff --git a/src/test/regress/expected/jsonb_1.out b/src/test/regress/expected/jsonb_1.out
index 4ead74b..0a1ec93 100644
--- a/src/test/regress/expected/jsonb_1.out
+++ b/src/test/regress/expected/jsonb_1.out
@@ -3031,54 +3031,6 @@ select '["a","b","c"]'::jsonb - -4;
  ["a", "b", "c"]
 (1 row)
 
-select '{"a":1, "b":2, "c":3}'::jsonb - 3;
-         ?column?         
---------------------------
- {"a": 1, "b": 2, "c": 3}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - 2;
-     ?column?     
-------------------
- {"a": 1, "b": 2}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - 1;
-     ?column?     
-------------------
- {"a": 1, "c": 3}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - 0;
-     ?column?     
-------------------
- {"b": 2, "c": 3}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - -1;
-     ?column?     
-------------------
- {"a": 1, "b": 2}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - -2;
-     ?column?     
-------------------
- {"a": 1, "c": 3}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - -3;
-     ?column?     
-------------------
- {"b": 2, "c": 3}
-(1 row)
-
-select '{"a":1, "b":2, "c":3}'::jsonb - -4;
-         ?column?         
---------------------------
- {"a": 1, "b": 2, "c": 3}
-(1 row)
-
 select jsonb_set('{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb, '{n}', '[1,2,3]');
                                 jsonb_set                                 
 --------------------------------------------------------------------------
@@ -3192,12 +3144,8 @@ select '[]'::jsonb - 'a';
 
 select '"a"'::jsonb - 1; -- error
 ERROR:  cannot delete from scalar
-select '{}'::jsonb -  1 ;
- ?column? 
-----------
- {}
-(1 row)
-
+select '{}'::jsonb -  1; -- error
+ERROR:  cannot delete from object using integer subscript
 select '[]'::jsonb - 1;
  ?column? 
 ----------
diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql
index 2abec22..29c82a2 100644
--- a/src/test/regress/sql/jsonb.sql
+++ b/src/test/regress/sql/jsonb.sql
@@ -738,15 +738,6 @@ select '["a","b","c"]'::jsonb - -2;
 select '["a","b","c"]'::jsonb - -3;
 select '["a","b","c"]'::jsonb - -4;
 
-select '{"a":1, "b":2, "c":3}'::jsonb - 3;
-select '{"a":1, "b":2, "c":3}'::jsonb - 2;
-select '{"a":1, "b":2, "c":3}'::jsonb - 1;
-select '{"a":1, "b":2, "c":3}'::jsonb - 0;
-select '{"a":1, "b":2, "c":3}'::jsonb - -1;
-select '{"a":1, "b":2, "c":3}'::jsonb - -2;
-select '{"a":1, "b":2, "c":3}'::jsonb - -3;
-select '{"a":1, "b":2, "c":3}'::jsonb - -4;
-
 select jsonb_set('{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb, '{n}', '[1,2,3]');
 select jsonb_set('{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb, '{b,-1}', '[1,2,3]');
 select jsonb_set('{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb, '{d,1,0}', '[1,2,3]');
@@ -775,7 +766,7 @@ select '"a"'::jsonb - 'a'; -- error
 select '{}'::jsonb - 'a';
 select '[]'::jsonb - 'a';
 select '"a"'::jsonb - 1; -- error
-select '{}'::jsonb -  1 ;
+select '{}'::jsonb -  1; -- error
 select '[]'::jsonb - 1;
 select '"a"'::jsonb - '{a}'::text[]; -- error
 select '{}'::jsonb - '{a}'::text[];
-- 
1.9.1

#25

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Andrew Dunstan (#22)

Re: Further issues with jsonb semantics, documentation

On Fri, Jun 5, 2015 at 1:05 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

So probably the least invasive change would be to rename the text[] variant
operator to something like "#-" and rename the corresponding function to
jsonb_delete_path.

We could also decide not to keep an operator at all, on the ground that we
think we'll implement a type that encapsulates json pointer in 9.6, and just
keep the renamed function.

Obviously I prefer the latter option, but the former is still an
improvement. To repeat myself, ambiguities around operators are not
the only problem: It seems no good to me that there is no way to
accomplish an equivalent outcome to that shown below with the
similarly-spelled operator you talk about (that is, the operator
currently spelled "operator jsonb - text[]"):

postgres=# select '["a", "c", "a"]'::jsonb - 'a';
?column?
----------
["c"]
(1 row)

With the operator currently spelled "operator jsonb - text[]", at the
very least you have to do this instead:

postgres=# select '["a", "c", "a"]'::jsonb - '{0}'::text[] - '{1}'::text[];
?column?
----------
["c"]
(1 row)

If nothing else, these operators are too dissimilar for overloading to
be helpful.
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#26

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Tom Lane (#23)

Re: Further issues with jsonb semantics, documentation

On 06/05/2015 04:48 PM, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Yeah, Good point. Actually, if my memory serves me correctly (always a
dubious bet), the avoidance of that kind of ambiguity is why we
introduced the #> and #>> operators in the first place, after going
round and round for a while on what the API would look like. I should
have remembered that when this came around. Mea culpa.
So probably the least invasive change would be to rename the text[]
variant operator to something like "#-" and rename the corresponding
function to jsonb_delete_path.

Not sure that's a great choice of operator name; consider for example
select 4#-1;
It's not immediately obvious whether the "-" is meant as a separate
unary minus. There are heuristics in the lexer that try to deal with
cases like this, but it doesn't seem like a good plan to double down
on such heuristics always doing the right thing.

Perhaps we should deprectae operator names ending in "-"?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Andrew Dunstan (#17)

Re: Further issues with jsonb semantics, documentation

On 06/05/2015 01:51 PM, Andrew Dunstan wrote:

On 06/05/2015 01:39 PM, Peter Geoghegan wrote:

On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:

But I agree that it's not a great contribution to science,
especially since
the index will be applied to the list of elements in the somewhat
counter-intuitive storage order we use, and we could just raise an
error if
we try to apply integer delete to an object instead of an array.

Cool. Do you want to write a patch, or should I?

Also, what about negative array subscripting (making the 9.4-era
"operator jsonb -> integer" operator support that for consistency with
the new "operator jsonb - integer" operator)? Should I write the
patch? Will you commit it if I do?

Please let me know if you want me to write these two patches.

Send the first one, I'm still thinking about the second one.

Sorry for the delay on this. I've been mostly off the grid, having an
all too rare visit from Tom "Mr Enum" Dunstan, and I misunderstood what
you were suggesting,

Please submit a patch to adjust the treatment of negative integers in
the old functions to be consistent with their treatment in the new
functions. i.e. in the range [-n,-1] they should refer to the
corresponding element counting from the right.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Andrew Dunstan (#27)

Re: Further issues with jsonb semantics, documentation

On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Sorry for the delay on this. I've been mostly off the grid, having an all
too rare visit from Tom "Mr Enum" Dunstan, and I misunderstood what you were
suggesting,

Thank you for working with me to address this. I've been busy with
other things the past few days too.

Please submit a patch to adjust the treatment of negative integers in the
old functions to be consistent with their treatment in the new functions.
i.e. in the range [-n,-1] they should refer to the corresponding element
counting from the right.

I've almost finished that patch. I'm currently blocked on deciding
what to do about the old path-orientated operators (#> and #>> for
json and jsonb types). It's rather painful to support pushing down
negative subscripting there -- maybe we should just not do so for
those variants, especially given that they're already notationally
inconsistent with the other operators that I'll be updating. What do
you think?

Maybe I'll come up with a simpler way of making that work by taking a
fresh look at it, but haven't done that yet.

My current, draft approach to making subscripting work with the json
variants (not the jsonb variants) is to use a second get_worker() call
in the event of a negative subscript, while making the first such call
(the existing get_worker() call) establish the number of top-level
array elements. That isn't beautiful, and involves some amount of
redundant work, but it's probably better than messing with
get_worker() in a more invasive way. Besides, this second get_worker()
call only needs to happen in the event of a negative subscript, and
I'm only really updating this (that is, updating routines like
json_array_element()) to preserve consistency with jsonb. What do you
think of that idea?

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#29

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Peter Geoghegan (#28)

Re: Further issues with jsonb semantics, documentation

On 06/10/2015 04:02 PM, Peter Geoghegan wrote:

On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Sorry for the delay on this. I've been mostly off the grid, having an all
too rare visit from Tom "Mr Enum" Dunstan, and I misunderstood what you were
suggesting,

Thank you for working with me to address this. I've been busy with
other things the past few days too.

Please submit a patch to adjust the treatment of negative integers in the
old functions to be consistent with their treatment in the new functions.
i.e. in the range [-n,-1] they should refer to the corresponding element
counting from the right.

I've almost finished that patch. I'm currently blocked on deciding
what to do about the old path-orientated operators (#> and #>> for
json and jsonb types). It's rather painful to support pushing down
negative subscripting there -- maybe we should just not do so for
those variants, especially given that they're already notationally
inconsistent with the other operators that I'll be updating. What do
you think?

Maybe I'll come up with a simpler way of making that work by taking a
fresh look at it, but haven't done that yet.

My current, draft approach to making subscripting work with the json
variants (not the jsonb variants) is to use a second get_worker() call
in the event of a negative subscript, while making the first such call
(the existing get_worker() call) establish the number of top-level
array elements. That isn't beautiful, and involves some amount of
redundant work, but it's probably better than messing with
get_worker() in a more invasive way. Besides, this second get_worker()
call only needs to happen in the event of a negative subscript, and
I'm only really updating this (that is, updating routines like
json_array_element()) to preserve consistency with jsonb. What do you
think of that idea?

Just took a quick look. My impression is that the jsonb case should be
fairly easy. If the index is negative, add JB_ROOT_COUNT(container) to
it and use that as the argument to getIthJsonbValueFromContainer().

I agree that the json case looks a bit nasty. Maybe a better approach
would be to provide a function that, given a JsonLexContext, returns the
number of array elements of the current array. In get_array_start we
could call that if the relevant path element is negative and adjust it
accordingly.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Andrew Dunstan (#29)

1 attachment(s)

Re: Further issues with jsonb semantics, documentation

On 06/12/2015 12:29 PM, I wrote:

I agree that the json case looks a bit nasty. Maybe a better approach
would be to provide a function that, given a JsonLexContext, returns
the number of array elements of the current array. In get_array_start
we could call that if the relevant path element is negative and adjust
it accordingly.

Here's some code for the count piece of that.

cheers

andrew

Attachments:

json_count_array.patchtext/x-diff; name=json_count_array.patchDownload

diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 26d3843..a11b965 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -341,6 +341,53 @@ pg_parse_json(JsonLexContext *lex, JsonSemAction *sem)
 }
 
 /*
+ * json_count_array
+ *
+ * designed to be called from the array_start routine
+ *
+ * gives the number of elements in the array at whatever nesting level it is at.
+ *
+ */
+int
+json_count_array_elements(JsonLexContext *lex)
+{
+	JsonLexContext copylex;
+	int count = 0;
+
+	if (lex_peek(lex) != JSON_TOKEN_ARRAY_START)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), /* XX FIX */
+				 errmsg("json_count_array must be called at the start of an array")));
+
+	/*
+	 * It's safe to do this with a shallow copy because the lexical routines
+	 * don't scribble on the input. They do scribble on the other pointers etc,
+	 * so doing this with a copy makes that safe.
+	 */
+	memcpy(&copylex, lex, sizeof(JsonLexContext));
+	copylex.strval = NULL; /* not interested in values here */
+
+	copylex.lex_level++;
+
+	lex_expect(JSON_PARSE_ARRAY_START, &copylex, JSON_TOKEN_ARRAY_START);
+	if (lex_peek(&copylex) != JSON_TOKEN_ARRAY_END)
+	{
+		count++;
+		parse_array_element(&copylex, &nullSemAction);
+
+		while (lex_accept(&copylex, JSON_TOKEN_COMMA, NULL))
+		{
+			count++;
+			parse_array_element(&copylex, &nullSemAction);
+		}
+	}
+
+	lex_expect(JSON_PARSE_ARRAY_NEXT, &copylex, JSON_TOKEN_ARRAY_END);
+
+	return count;
+}
+
+/*
  *	Recursive Descent parse routines. There is one for each structural
  *	element in a json document:
  *	  - scalar (string, number, true, false, null)
diff --git a/src/include/utils/jsonapi.h b/src/include/utils/jsonapi.h
index 296d20a..1abb8b9 100644
--- a/src/include/utils/jsonapi.h
+++ b/src/include/utils/jsonapi.h
@@ -104,6 +104,13 @@ typedef struct JsonSemAction
 extern void pg_parse_json(JsonLexContext *lex, JsonSemAction *sem);
 
 /*
+ * json_count_array does a fast secondary parse to determine the number
+ * of elements in the current array. It must be called from an array_start
+ * action.
+ */
+extern int json_count_array_elements(JsonLexContext *lex);
+
+/*
  * constructors for JsonLexContext, with or without strval element.
  * If supplied, the strval element will contain a de-escaped version of
  * the lexeme. However, doing this imposes a performance penalty, so

#31

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Andrew Dunstan (#30)

1 attachment(s)

Re: Further issues with jsonb semantics, documentation

On Fri, Jun 12, 2015 at 12:26 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

Here's some code for the count piece of that.

Thanks. I'll look into integrating this with what I have.

BTW, on reflection I'm not so sure about my decision to not touch the
logic within jsonb_delete_idx() (commit b81c7b409). I probably should
have changed it in line with the attached patch as part of that
commit. What do you think?
--
Peter Geoghegan

Attachments:

0001-Remove-dead-code-for-jsonb-subscript-deletion.patchtext/x-patch; charset=US-ASCII; name=0001-Remove-dead-code-for-jsonb-subscript-deletion.patchDownload

From 8232f360a0696eb9279c29dfa7464edde726c5ae Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <peter.geoghegan86@gmail.com>
Date: Fri, 5 Jun 2015 13:55:48 -0700
Subject: [PATCH 1/2] Remove dead code for jsonb subscript deletion

---
 src/backend/utils/adt/jsonfuncs.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index c14d3f7..3fb8327 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -3411,10 +3411,8 @@ jsonb_delete_idx(PG_FUNCTION_ARGS)
 	it = JsonbIteratorInit(&in->root);
 
 	r = JsonbIteratorNext(&it, &v, false);
-	if (r == WJB_BEGIN_ARRAY)
-		n = v.val.array.nElems;
-	else
-		n = v.val.object.nPairs;
+	Assert (r == WJB_BEGIN_ARRAY);
+	n = v.val.array.nElems;
 
 	if (idx < 0)
 	{
@@ -3431,14 +3429,10 @@ jsonb_delete_idx(PG_FUNCTION_ARGS)
 
 	while ((r = JsonbIteratorNext(&it, &v, true)) != 0)
 	{
-		if (r == WJB_ELEM || r == WJB_KEY)
+		if (r == WJB_ELEM)
 		{
 			if (i++ == idx)
-			{
-				if (r == WJB_KEY)
-					JsonbIteratorNext(&it, &v, true);	/* skip value */
 				continue;
-			}
 		}
 
 		res = pushJsonbValue(&state, r, r < WJB_BEGIN_ARRAY ? &v : NULL);
-- 
1.9.1

#32

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Peter Geoghegan (#13)

1 attachment(s)

Re: Further issues with jsonb semantics, documentation

On Thu, Jun 4, 2015 at 5:43 PM, Peter Geoghegan <pg@heroku.com> wrote:

BTW, there is a bug here -- strtol() needs additional defenses [1]
(before casting to int):

postgres=# select jsonb_set('[1, 2, 3, 4,
5,6,7,8,9,10,11,12,13,14,15,16,17,18]',
'{"9223372036854775806"}'::text[], '"Input unsanitized"', false) ;
jsonb_set
----------------------------------------------------------------------------------
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, "Input
unsanitized", 18]
(1 row)

[1] https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer

I attach a fix for this bug. The commit message explains everything.

--
Peter Geoghegan

Attachments:

0001-Fix-path-infrastructure-bug-affecting-jsonb_set.patchtext/x-patch; charset=US-ASCII; name=0001-Fix-path-infrastructure-bug-affecting-jsonb_set.patchDownload

From 2f2042d93d00f85e52612bd7d7499c3238579d4d Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <peter.geoghegan86@gmail.com>
Date: Fri, 12 Jun 2015 14:55:32 -0700
Subject: [PATCH 1/2] Fix "path" infrastructure bug affecting jsonb_set()

jsonb_set() and other clients of the setPathArray() utility function
could get spurious results when an array integer subscript is provided
that is not within the range of int.

To fix, ensure that the value returned by strtol() within setPathArray()
is within the range of int;  when it isn't, assume an invalid input in
line with existing, similar cases.  The path-orientated operators that
appeared in PostgreSQL 9.3 and 9.4 do not call setPathArray(), and
already independently take this precaution, so no change there.
---
 src/backend/utils/adt/jsonfuncs.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index c14d3f7..13d5b7a 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -3814,11 +3814,14 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls,
 	if (level < path_len && !path_nulls[level])
 	{
 		char	   *c = VARDATA_ANY(path_elems[level]);
+		long		lindex;
 
 		errno = 0;
-		idx = (int) strtol(c, &badp, 10);
-		if (errno != 0 || badp == c)
+		lindex = strtol(c, &badp, 10);
+		if (errno != 0 || badp == c || lindex > INT_MAX || lindex < INT_MIN)
 			idx = nelems;
+		else
+			idx = lindex;
 	}
 	else
 		idx = nelems;
-- 
1.9.1

#33

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Peter Geoghegan (#32)

Re: Further issues with jsonb semantics, documentation

On 06/12/2015 06:16 PM, Peter Geoghegan wrote:

On Thu, Jun 4, 2015 at 5:43 PM, Peter Geoghegan <pg@heroku.com> wrote:

BTW, there is a bug here -- strtol() needs additional defenses [1]
(before casting to int):

postgres=# select jsonb_set('[1, 2, 3, 4,
5,6,7,8,9,10,11,12,13,14,15,16,17,18]',
'{"9223372036854775806"}'::text[], '"Input unsanitized"', false) ;
jsonb_set
----------------------------------------------------------------------------------
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, "Input
unsanitized", 18]
(1 row)

[1] https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer

I attach a fix for this bug. The commit message explains everything.

OK, pushed, although you'd have to be trying really hard to break this.
Still, it's reasonable to defend against.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#34

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Andrew Dunstan (#33)

Re: Further issues with jsonb semantics, documentation

On Fri, Jun 12, 2015 at 4:31 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

OK, pushed, although you'd have to be trying really hard to break this.
Still, it's reasonable to defend against.

I was trying really hard. :-)

Thanks
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#35

Jim Nasby

Jim.Nasby@BlueTreble.com

over 10 years ago

In reply to: Alvaro Herrera (#21)

Re: Further issues with jsonb semantics, documentation

On 6/5/15 3:51 PM, Alvaro Herrera wrote:

Jim Nasby wrote:

On 6/5/15 2:08 PM, Petr Jelinek wrote:

That's a good point, and it won't get any better if/when we add the json
point support in 9.6 since the syntax would be something like select
jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '/c/a'; and we will again
silently do nothing. That's going to cause bugs in applications using this.

Yeah, this is a miniature version of the pain I've felt with variant: trying
to get sane casting for a data type that encompasses other types in current
Postgres is essentially impossible.

I'm not sure this is the same problem. But anyway I think requiring
explicit casts in this stuff is a good thing -- relying on implicit
cast to text, when most useful behavior uses other types, seems bad.

I'm not sure I'm following, at least for jsonb. If there's only jsonb -
json_pointer operator, why shouldn't we be able to resolve it? I suspect
part of the answer to that problem is that we need to make the
resolution of unknown smarter, or perhaps somehow configurable.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#36

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Andrew Dunstan (#27)

2 attachment(s)

Re: Further issues with jsonb semantics, documentation

On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Please submit a patch to adjust the treatment of negative integers in the
old functions to be consistent with their treatment in the new functions.
i.e. in the range [-n,-1] they should refer to the corresponding element
counting from the right.

This patch is attached, along with a separate patch which adds a
release note compatibility item.

See commit message for full details.

Thanks
--
Peter Geoghegan

Attachments:

0002-Release-note-compatibility-item.patchtext/x-patch; charset=US-ASCII; name=0002-Release-note-compatibility-item.patchDownload

From 03372a3611be4a8acbf79024aa3626f5ee4ac369 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <peter.geoghegan86@gmail.com>
Date: Mon, 22 Jun 2015 16:00:54 -0700
Subject: [PATCH 2/2] Release note compatibility item

Note that json and jsonb extraction operators no longer consider a
negative subscript to be invalid.
---
 doc/src/sgml/release-9.5.sgml | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
index 4e2ea45..cbe20ca 100644
--- a/doc/src/sgml/release-9.5.sgml
+++ b/doc/src/sgml/release-9.5.sgml
@@ -125,6 +125,17 @@
      </para>
     </listitem>
 
+    <listitem>
+     <para>
+      Allow <type>json</> and <type>jsonb</> extraction operators to
+      accept negative subscripts, which count from the end of JSON
+      arrays.  Historically, these operators yielded <literal>NULL</>
+      in the event of a negative subscript, because negative
+      subscripts were considered invalid.  (Peter Geoghegan, Andrew
+      Dunstan)
+     </para>
+    </listitem>
+
    </itemizedlist>
 
   </sect2>
-- 
1.9.1

0001-Support-JSON-negative-array-subscripts-everywhere.patchtext/x-patch; charset=US-ASCII; name=0001-Support-JSON-negative-array-subscripts-everywhere.patchDownload

From f65b2af71b103b2b5802ac20771e59a285bb4a7f Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <peter.geoghegan86@gmail.com>
Date: Sat, 6 Jun 2015 20:07:46 -0700
Subject: [PATCH 1/2] Support JSON negative array subscripts everywhere

Previously, there was an inconsistency across json/jsonb operators that
operate on datums containing JSON arrays -- only some operators
supported negative array count-from-the-end subscripting.  Specifically,
only a new-to-9.5 jsonb deletion operator had support (the new "jsonb -
integer" operator).  This inconsistency seemed likely to be
counter-intuitive to users.  To fix, allow all places where the user can
supply an integer subscript to accept a negative subscript value,
including path-orientated operators and functions, as well as other
extraction operators.  This will need to be called out as an
incompatibility in the 9.5 release notes, since it's possible that users
are relying on certain established extraction operators changed here
yielding NULL in the event of a negative subscript.

For the json type, this requires adding a way of cheaply getting the
total JSON array element count ahead of time when parsing arrays with a
negative subscript involved, necessitating an ad-hoc lex and parse.
This is followed by a "conversion" from a negative subscript to its
equivalent positive-wise value using the count.  From there on, it's as
if a positive-wise value was originally provided.

Note that there is still a minor inconsistency here across jsonb
deletion operators.  Unlike the aforementioned new "-" deletion operator
that accepts an integer on its right hand side, the new "#-" path
orientated deletion variant does not throw an error when it appears like
an array subscript (input that could be recognized by as an integer
literal) is being used on an object, which is wrong-headed.  The reason
for not being stricter is that it could be the case that an object pair
happens to have a key value that looks like an integer; in general,
these two possibilities are impossible to differentiate with rhs path
text[] argument elements.  However, we still don't allow the "#-"
path-orientated deletion operator to perform array-style subscripting.
Rather, we just return the original left operand value in the event of a
negative subscript (which seems analogous to how the established
"jsonb/json #> text[]" path-orientated operator may yield NULL in the
event of an invalid subscript).

In passing, make SetArrayPath() stricter about not accepting cases where
there is trailing non-numeric garbage bytes rather than a clean NUL
byte.  This means, for example, that strings like "10e10" are now not
accepted as an array subscript of 10 by some new-to-9.5 path-orientated
jsonb operators (e.g. the new #- operator).  Finally, remove dead code
for jsonb subscript deletion; arguably, this should have been done in
commit b81c7b409.
---
 doc/src/sgml/func.sgml                |  16 ++++--
 src/backend/utils/adt/json.c          |  39 +++++++++++++
 src/backend/utils/adt/jsonfuncs.c     | 100 +++++++++++++++++++++++++---------
 src/include/utils/jsonapi.h           |   7 +++
 src/test/regress/expected/json.out    |  14 +++++
 src/test/regress/expected/json_1.out  |  14 +++++
 src/test/regress/expected/jsonb.out   |  30 ++++++++++
 src/test/regress/expected/jsonb_1.out |  30 ++++++++++
 src/test/regress/sql/json.sql         |   5 ++
 src/test/regress/sql/jsonb.sql        |   5 ++
 10 files changed, 231 insertions(+), 29 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 650051b..aa3e989 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10177,7 +10177,8 @@ table2-mapping
        <row>
         <entry><literal>-&gt;</literal></entry>
         <entry><type>int</type></entry>
-        <entry>Get JSON array element (indexed from zero)</entry>
+        <entry>Get JSON array element (indexed from zero, negative
+        integers count from the end)</entry>
         <entry><literal>'[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json-&gt;2</literal></entry>
         <entry><literal>{"c":"baz"}</literal></entry>
        </row>
@@ -10230,7 +10231,10 @@ table2-mapping
     returning <type>text</>, which coerce the value to text.
     The field/element/path extraction operators return NULL, rather than
     failing, if the JSON input does not have the right structure to match
-    the request; for example if no such element exists.
+    the request; for example if no such element exists.  The
+    field/element/path extraction operators that accept integer JSON
+    array subscripts all support negative subscripting from the end of
+    arrays.
    </para>
   </note>
   <para>
@@ -10318,7 +10322,8 @@ table2-mapping
        <row>
         <entry><literal>#-</literal></entry>
         <entry><type>text[]</type></entry>
-        <entry>Delete the field or element with specified path</entry>
+        <entry>Delete the field or element with specified path (for
+        JSON arrays, negative integers count from the end)</entry>
         <entry><literal>'["a", {"b":1}]'::jsonb #- '{1,b}'</literal></entry>
        </row>
       </tbody>
@@ -10852,6 +10857,9 @@ table2-mapping
          <replaceable>create_missing</replaceable> is true ( default is
          <literal>true</>) and the item
          designated by <replaceable>path</replaceable> does not exist.
+         As with the path orientated operators, negative integers that
+         appear in <replaceable>path</replaceable> count from the end
+         of JSON arrays.
        </entry>
        <entry><para><literal>jsonb_set('[{"f1":1,"f2":null},2,null,3]', '{0,f1}','[2,3,4]', false)</literal>
          </para><para><literal>jsonb_set('[{"f1":1,"f2":null},2]', '{0,f3}','[2,3,4]')</literal>
@@ -10866,7 +10874,7 @@ table2-mapping
        <entry><para><type>text</type></para></entry>
        <entry>
          Returns <replaceable>from_json</replaceable>
-         as indented json text.
+         as indented JSON text.
        </entry>
        <entry><literal>jsonb_pretty('[{"f1":1,"f2":null},2,null,3]')</literal></entry>
        <entry>
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 26d3843..8d04347 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -341,6 +341,45 @@ pg_parse_json(JsonLexContext *lex, JsonSemAction *sem)
 }
 
 /*
+ * json_count_array_elements
+ *
+ * Returns number of array elements in lex context at start of array token
+ * until end of array token at same nesting level.
+ *
+ * Designed to be called from array_start routines.
+ */
+int
+json_count_array_elements(JsonLexContext *lex)
+{
+	JsonLexContext	copylex;
+	int				count;
+
+	/*
+	 * It's safe to do this with a shallow copy because the lexical routines
+	 * don't scribble on the input. They do scribble on the other pointers etc,
+	 * so doing this with a copy makes that safe.
+	 */
+	memcpy(&copylex, lex, sizeof(JsonLexContext));
+	copylex.strval = NULL; /* not interested in values here */
+	copylex.lex_level++;
+
+	count = 0;
+	lex_expect(JSON_PARSE_ARRAY_START, &copylex, JSON_TOKEN_ARRAY_START);
+	if (lex_peek(&copylex) != JSON_TOKEN_ARRAY_END)
+	{
+		do
+		{
+			count++;
+			parse_array_element(&copylex, &nullSemAction);
+		}
+		while (lex_accept(&copylex, JSON_TOKEN_COMMA, NULL));
+	}
+	lex_expect(JSON_PARSE_ARRAY_NEXT, &copylex, JSON_TOKEN_ARRAY_END);
+
+	return count;
+}
+
+/*
  *	Recursive Descent parse routines. There is one for each structural
  *	element in a json document:
  *	  - scalar (string, number, true, false, null)
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index 13d5b7a..424280b 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -597,6 +597,17 @@ jsonb_array_element(PG_FUNCTION_ARGS)
 	if (!JB_ROOT_IS_ARRAY(jb))
 		PG_RETURN_NULL();
 
+	/* Handle negative subscript */
+	if (element < 0)
+	{
+		uint32	nelements = JB_ROOT_COUNT(jb);
+
+		if (-element > nelements)
+			PG_RETURN_NULL();
+		else
+			element += nelements;
+	}
+
 	v = getIthJsonbValueFromContainer(&jb->root, element);
 	if (v != NULL)
 		PG_RETURN_JSONB(JsonbValueToJsonb(v));
@@ -629,6 +640,17 @@ jsonb_array_element_text(PG_FUNCTION_ARGS)
 	if (!JB_ROOT_IS_ARRAY(jb))
 		PG_RETURN_NULL();
 
+	/* Handle negative subscript */
+	if (element < 0)
+	{
+		uint32	nelements = JB_ROOT_COUNT(jb);
+
+		if (-element > nelements)
+			PG_RETURN_NULL();
+		else
+			element += nelements;
+	}
+
 	v = getIthJsonbValueFromContainer(&jb->root, element);
 	if (v != NULL)
 	{
@@ -719,7 +741,7 @@ get_path_all(FunctionCallInfo fcinfo, bool as_text)
 		/*
 		 * we have no idea at this stage what structure the document is so
 		 * just convert anything in the path that we can to an integer and set
-		 * all the other integers to -1 which will never match.
+		 * all the other integers to INT_MIN which will never match.
 		 */
 		if (*tpath[i] != '\0')
 		{
@@ -728,13 +750,13 @@ get_path_all(FunctionCallInfo fcinfo, bool as_text)
 
 			errno = 0;
 			ind = strtol(tpath[i], &endptr, 10);
-			if (*endptr == '\0' && errno == 0 && ind <= INT_MAX && ind >= 0)
+			if (*endptr == '\0' && errno == 0 && ind <= INT_MAX && ind >= INT_MIN)
 				ipath[i] = (int) ind;
 			else
-				ipath[i] = -1;
+				ipath[i] = INT_MIN;
 		}
 		else
-			ipath[i] = -1;
+			ipath[i] = INT_MIN;
 	}
 
 	result = get_worker(json, tpath, ipath, npath, as_text);
@@ -752,14 +774,15 @@ get_path_all(FunctionCallInfo fcinfo, bool as_text)
  *
  * json: JSON object (in text form)
  * tpath[]: field name(s) to extract
- * ipath[]: array index(es) (zero-based) to extract
+ * ipath[]: array index(es) (zero-based) to extract, accepts negatives
  * npath: length of tpath[] and/or ipath[]
  * normalize_results: true to de-escape string and null scalars
  *
  * tpath can be NULL, or any one tpath[] entry can be NULL, if an object
  * field is not to be matched at that nesting level.  Similarly, ipath can
- * be NULL, or any one ipath[] entry can be -1, if an array element is not
- * to be matched at that nesting level.
+ * be NULL, or any one ipath[] entry can be INT_MIN if an array element is
+ * not to be matched at that nesting level (a json datum should never be
+ * large enough to have -INT_MIN elements due to MaxAllocSize restriction).
  */
 static text *
 get_worker(text *json,
@@ -964,6 +987,17 @@ get_array_start(void *state)
 		 */
 		_state->result_start = _state->lex->token_start;
 	}
+
+	/* INT_MIN value is reserved to represent invalid subscript */
+	if (_state->path_indexes[lex_level] < 0 &&
+		_state->path_indexes[lex_level] != INT_MIN)
+	{
+		/* Negative subscript -- convert to positive-wise subscript */
+		int		nelements = json_count_array_elements(_state->lex);
+
+		if (-_state->path_indexes[lex_level] <= nelements)
+			_state->path_indexes[lex_level] += nelements;
+	}
 }
 
 static void
@@ -1209,9 +1243,30 @@ get_jsonb_path_all(FunctionCallInfo fcinfo, bool as_text)
 			errno = 0;
 			lindex = strtol(indextext, &endptr, 10);
 			if (endptr == indextext || *endptr != '\0' || errno != 0 ||
-				lindex > INT_MAX || lindex < 0)
+				lindex > INT_MAX || lindex < INT_MIN)
 				PG_RETURN_NULL();
-			index = (uint32) lindex;
+
+			if (lindex >= 0)
+			{
+				index = (uint32) lindex;
+			}
+			else
+			{
+				/* Handle negative subscript */
+				uint32		nelements;
+
+				/* Container must be array, but make sure */
+				if ((container->header & JB_FARRAY) == 0)
+					elog(ERROR, "not a jsonb array");
+
+				nelements = container->header & JB_CMASK;
+
+				if (-lindex > nelements)
+					PG_RETURN_NULL();
+				else
+					index = nelements + lindex;
+			}
+
 			jbvp = getIthJsonbValueFromContainer(container, index);
 		}
 		else
@@ -3411,10 +3466,8 @@ jsonb_delete_idx(PG_FUNCTION_ARGS)
 	it = JsonbIteratorInit(&in->root);
 
 	r = JsonbIteratorNext(&it, &v, false);
-	if (r == WJB_BEGIN_ARRAY)
-		n = v.val.array.nElems;
-	else
-		n = v.val.object.nPairs;
+	Assert (r == WJB_BEGIN_ARRAY);
+	n = v.val.array.nElems;
 
 	if (idx < 0)
 	{
@@ -3431,14 +3484,10 @@ jsonb_delete_idx(PG_FUNCTION_ARGS)
 
 	while ((r = JsonbIteratorNext(&it, &v, true)) != 0)
 	{
-		if (r == WJB_ELEM || r == WJB_KEY)
+		if (r == WJB_ELEM)
 		{
 			if (i++ == idx)
-			{
-				if (r == WJB_KEY)
-					JsonbIteratorNext(&it, &v, true);	/* skip value */
 				continue;
-			}
 		}
 
 		res = pushJsonbValue(&state, r, r < WJB_BEGIN_ARRAY ? &v : NULL);
@@ -3657,7 +3706,7 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2,
  * If newval is null, the element is to be removed.
  *
  * If create is true, we create the new value if the key or array index
- * does not exist. All path elemnts before the last must already exist
+ * does not exist. All path elements before the last must already exist
  * whether or not create is true, or nothing is done.
  */
 static JsonbValue *
@@ -3818,7 +3867,8 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls,
 
 		errno = 0;
 		lindex = strtol(c, &badp, 10);
-		if (errno != 0 || badp == c || lindex > INT_MAX || lindex < INT_MIN)
+		if (errno != 0 || badp == c || *badp != '\0' || lindex > INT_MAX ||
+			lindex < INT_MIN)
 			idx = nelems;
 		else
 			idx = lindex;
@@ -3829,7 +3879,7 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls,
 	if (idx < 0)
 	{
 		if (-idx > nelems)
-			idx = -1;
+			idx = INT_MIN;
 		else
 			idx = nelems + idx;
 	}
@@ -3838,12 +3888,12 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls,
 		idx = nelems;
 
 	/*
-	 * if we're creating, and idx == -1, we prepend the new value to the array
-	 * also if the array is empty - in which case we don't really care what
-	 * the idx value is
+	 * if we're creating, and idx == INT_MIN, we prepend the new value to the
+	 * array also if the array is empty - in which case we don't really care
+	 * what the idx value is
 	 */
 
-	if ((idx == -1 || nelems == 0) && create && (level == path_len - 1))
+	if ((idx == INT_MIN || nelems == 0) && create && (level == path_len - 1))
 	{
 		Assert(newval != NULL);
 		addJsonbToParseState(st, newval);
diff --git a/src/include/utils/jsonapi.h b/src/include/utils/jsonapi.h
index 296d20a..55cfb79 100644
--- a/src/include/utils/jsonapi.h
+++ b/src/include/utils/jsonapi.h
@@ -104,6 +104,13 @@ typedef struct JsonSemAction
 extern void pg_parse_json(JsonLexContext *lex, JsonSemAction *sem);
 
 /*
+ * json_count_array_elements performs a fast secondary parse to determine the
+ * number of elements in passed array lex context. It should be called from an
+ * array_start action.
+ */
+extern int json_count_array_elements(JsonLexContext *lex);
+
+/*
  * constructors for JsonLexContext, with or without strval element.
  * If supplied, the strval element will contain a de-escaped version of
  * the lexeme. However, doing this imposes a performance penalty, so
diff --git a/src/test/regress/expected/json.out b/src/test/regress/expected/json.out
index 3942c3b..43ca67d 100644
--- a/src/test/regress/expected/json.out
+++ b/src/test/regress/expected/json.out
@@ -569,6 +569,14 @@ WHERE json_type = 'array';
  "two"
 (1 row)
 
+SELECT test_json -> -1
+FROM test_json
+WHERE json_type = 'array';
+ ?column? 
+----------
+ {"f1":9}
+(1 row)
+
 SELECT test_json -> 2
 FROM test_json
 WHERE json_type = 'object';
@@ -698,6 +706,12 @@ select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> 1;
  
 (1 row)
 
+select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> -1;
+ ?column? 
+----------
+ 
+(1 row)
+
 select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> 'z';
  ?column? 
 ----------
diff --git a/src/test/regress/expected/json_1.out b/src/test/regress/expected/json_1.out
index 38f1526..155f414 100644
--- a/src/test/regress/expected/json_1.out
+++ b/src/test/regress/expected/json_1.out
@@ -569,6 +569,14 @@ WHERE json_type = 'array';
  "two"
 (1 row)
 
+SELECT test_json -> -1
+FROM test_json
+WHERE json_type = 'array';
+ ?column? 
+----------
+ {"f1":9}
+(1 row)
+
 SELECT test_json -> 2
 FROM test_json
 WHERE json_type = 'object';
@@ -698,6 +706,12 @@ select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> 1;
  
 (1 row)
 
+select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> -1;
+ ?column? 
+----------
+ 
+(1 row)
+
 select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> 'z';
  ?column? 
 ----------
diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out
index 1715202..c94225d 100644
--- a/src/test/regress/expected/jsonb.out
+++ b/src/test/regress/expected/jsonb.out
@@ -2590,6 +2590,18 @@ SELECT '["a","b","c",[1,2],null]'::jsonb -> 5;
 SELECT '["a","b","c",[1,2],null]'::jsonb -> -1;
  ?column? 
 ----------
+ null
+(1 row)
+
+SELECT '["a","b","c",[1,2],null]'::jsonb -> -5;
+ ?column? 
+----------
+ "a"
+(1 row)
+
+SELECT '["a","b","c",[1,2],null]'::jsonb -> -6;
+ ?column? 
+----------
  
 (1 row)
 
@@ -2639,6 +2651,18 @@ SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,3}';
 SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-1}';
  ?column? 
 ----------
+ 3
+(1 row)
+
+SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-3}';
+ ?column? 
+----------
+ 1
+(1 row)
+
+SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-4}';
+ ?column? 
+----------
  
 (1 row)
 
@@ -3121,6 +3145,12 @@ select '{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb #- '{
  {"a": 1, "b": [1], "c": {"1": 2}, "d": {"1": [2, 3]}, "n": null}
 (1 row)
 
+select '{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb #- '{b,-1e}'; -- invalid array subscript
+                              ?column?                               
+---------------------------------------------------------------------
+ {"a": 1, "b": [1, 2], "c": {"1": 2}, "d": {"1": [2, 3]}, "n": null}
+(1 row)
+
 select '{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb #- '{d,1,0}';
                              ?column?                             
 ------------------------------------------------------------------
diff --git a/src/test/regress/expected/jsonb_1.out b/src/test/regress/expected/jsonb_1.out
index 864d85c..ee6aef7 100644
--- a/src/test/regress/expected/jsonb_1.out
+++ b/src/test/regress/expected/jsonb_1.out
@@ -2590,6 +2590,18 @@ SELECT '["a","b","c",[1,2],null]'::jsonb -> 5;
 SELECT '["a","b","c",[1,2],null]'::jsonb -> -1;
  ?column? 
 ----------
+ null
+(1 row)
+
+SELECT '["a","b","c",[1,2],null]'::jsonb -> -5;
+ ?column? 
+----------
+ "a"
+(1 row)
+
+SELECT '["a","b","c",[1,2],null]'::jsonb -> -6;
+ ?column? 
+----------
  
 (1 row)
 
@@ -2639,6 +2651,18 @@ SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,3}';
 SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-1}';
  ?column? 
 ----------
+ 3
+(1 row)
+
+SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-3}';
+ ?column? 
+----------
+ 1
+(1 row)
+
+SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-4}';
+ ?column? 
+----------
  
 (1 row)
 
@@ -3121,6 +3145,12 @@ select '{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb #- '{
  {"a": 1, "b": [1], "c": {"1": 2}, "d": {"1": [2, 3]}, "n": null}
 (1 row)
 
+select '{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb #- '{b,-1e}'; -- invalid array subscript
+                              ?column?                               
+---------------------------------------------------------------------
+ {"a": 1, "b": [1, 2], "c": {"1": 2}, "d": {"1": [2, 3]}, "n": null}
+(1 row)
+
 select '{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb #- '{d,1,0}';
                              ?column?                             
 ------------------------------------------------------------------
diff --git a/src/test/regress/sql/json.sql b/src/test/regress/sql/json.sql
index 53832a0..8c3b73f 100644
--- a/src/test/regress/sql/json.sql
+++ b/src/test/regress/sql/json.sql
@@ -184,6 +184,10 @@ SELECT test_json -> 2
 FROM test_json
 WHERE json_type = 'array';
 
+SELECT test_json -> -1
+FROM test_json
+WHERE json_type = 'array';
+
 SELECT test_json -> 2
 FROM test_json
 WHERE json_type = 'object';
@@ -241,6 +245,7 @@ where json_type = 'array';
 select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> null::text;
 select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> null::int;
 select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> 1;
+select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> -1;
 select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> 'z';
 select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> '';
 select '[{"b": "c"}, {"b": "cc"}]'::json -> 1;
diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql
index a25a19d..f3a1282 100644
--- a/src/test/regress/sql/jsonb.sql
+++ b/src/test/regress/sql/jsonb.sql
@@ -642,6 +642,8 @@ SELECT '["a","b","c",[1,2],null]'::jsonb -> 3 -> 1;
 SELECT '["a","b","c",[1,2],null]'::jsonb -> 4;
 SELECT '["a","b","c",[1,2],null]'::jsonb -> 5;
 SELECT '["a","b","c",[1,2],null]'::jsonb -> -1;
+SELECT '["a","b","c",[1,2],null]'::jsonb -> -5;
+SELECT '["a","b","c",[1,2],null]'::jsonb -> -6;
 
 --nested path extraction
 SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{0}';
@@ -652,6 +654,8 @@ SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,1}';
 SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,2}';
 SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,3}';
 SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-1}';
+SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-3}';
+SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-4}';
 
 SELECT '[0,1,2,[3,4],{"5":"five"}]'::jsonb #> '{0}';
 SELECT '[0,1,2,[3,4],{"5":"five"}]'::jsonb #> '{3}';
@@ -757,6 +761,7 @@ select jsonb_delete_path('{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,
 
 select '{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb #- '{n}';
 select '{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb #- '{b,-1}';
+select '{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb #- '{b,-1e}'; -- invalid array subscript
 select '{"n":null, "a":1, "b":[1,2], "c":{"1":2}, "d":{"1":[2,3]}}'::jsonb #- '{d,1,0}';
 
 
-- 
1.9.1

#37

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Peter Geoghegan (#36)

Re: Further issues with jsonb semantics, documentation

On Mon, Jun 22, 2015 at 6:19 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Please submit a patch to adjust the treatment of negative integers in the
old functions to be consistent with their treatment in the new functions.
i.e. in the range [-n,-1] they should refer to the corresponding element
counting from the right.

This patch is attached, along with a separate patch which adds a
release note compatibility item.

Where are we on this? This is currently a 9.5 release blocker.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#38

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Peter Geoghegan (#37)

Re: Further issues with jsonb semantics, documentation

On 07/09/2015 04:10 AM, Peter Geoghegan wrote:

On Mon, Jun 22, 2015 at 6:19 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Please submit a patch to adjust the treatment of negative integers in the
old functions to be consistent with their treatment in the new functions.
i.e. in the range [-n,-1] they should refer to the corresponding element
counting from the right.

This patch is attached, along with a separate patch which adds a
release note compatibility item.

Where are we on this? This is currently a 9.5 release blocker.

I am on vacation right now, but I might have some time tomorrow to deal
with it. If not, it will be Sunday or Monday when I get to it.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#39

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Andrew Dunstan (#38)

Re: Further issues with jsonb semantics, documentation

On Thu, Jul 9, 2015 at 7:49 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I am on vacation right now, but I might have some time tomorrow to deal with
it. If not, it will be Sunday or Monday when I get to it.

Okay. Thanks.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#40

Robert Haas

robertmhaas@gmail.com

over 10 years ago

In reply to: Andrew Dunstan (#38)

Re: Further issues with jsonb semantics, documentation

On Thu, Jul 9, 2015 at 10:49 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Where are we on this? This is currently a 9.5 release blocker.

I am on vacation right now, but I might have some time tomorrow to deal with
it. If not, it will be Sunday or Monday when I get to it.

Is this still pending?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#41

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Robert Haas (#40)

Re: Further issues with jsonb semantics, documentation

On Fri, Jul 17, 2015 at 11:37 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I am on vacation right now, but I might have some time tomorrow to deal with
it. If not, it will be Sunday or Monday when I get to it.

Is this still pending?

Yes.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#42

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Robert Haas (#40)

Re: Further issues with jsonb semantics, documentation

On 07/17/2015 02:37 PM, Robert Haas wrote:

On Thu, Jul 9, 2015 at 10:49 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Where are we on this? This is currently a 9.5 release blocker.

I am on vacation right now, but I might have some time tomorrow to deal with
it. If not, it will be Sunday or Monday when I get to it.

Is this still pending?

Yes, been tied up a bit unexpectedly this week, am just getting down to
it right now.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#43

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Andrew Dunstan (#42)

Re: Further issues with jsonb semantics, documentation

On 07/17/2015 02:49 PM, Andrew Dunstan wrote:

On 07/17/2015 02:37 PM, Robert Haas wrote:

On Thu, Jul 9, 2015 at 10:49 AM, Andrew Dunstan <andrew@dunslane.net>
wrote:

Where are we on this? This is currently a 9.5 release blocker.

I am on vacation right now, but I might have some time tomorrow to
deal with
it. If not, it will be Sunday or Monday when I get to it.

Is this still pending?

Yes, been tied up a bit unexpectedly this week, am just getting down
to it right now.

OK, I have committed this and updated the open issues list on the wiki.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#44

Peter Geoghegan

pg@heroku.com

over 10 years ago

In reply to: Andrew Dunstan (#43)

Re: Further issues with jsonb semantics, documentation

On Fri, Jul 17, 2015 at 6:20 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

OK, I have committed this and updated the open issues list on the wiki.

Thanks, Andrew.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers